In July of 2017 I updated the Stata Tutorial for version 15, and it
seemed a good time to convert it to a Stata Markdown script to be
processed with markstat. In later years I updated it for versions
16, 17 and then 18. A big advantage of using markstat is that it was
very easy to update the tutorial, and I could generate a PDF file via
LaTeX from the same script used for HTML.
Another development is that as of September 2022 all the Stata and R code from my website is available on GitHub, starting with the tutorial. The links below will take you to the source code and supporting files on GitHub, and the published HTML and PDF versions on my website.
If you are interested in reproducing the output, the following notes may be of interest.
If you are familiar with GitHub you can just clone the repository.
Alternatively, the following Stata commands will download all the files
needed to reproduce the PDF:
local repo https://raw.githubusercontent.com/grodri/websrc/main/stata
foreach file in tutorial.stmd tutorial.bib icon18.png stata18.png ///
    docs18.png _gpnupt.ado tweaks.tex {
    copy `repo'/`file' .
}
The main files are the markstat script and bibliography. The introduction
uses three images, and the programming section an egen extension. To match
exactly the style in the published PDF you also need tweaks.tex, as discussed
below.
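As a quick sanity check after downloading, a loop along the following lines would confirm that each file arrived. This is only a sketch: it assumes the same file list as the download loop above and that the files were copied to the current working directory.

```stata
* Sketch: verify that every file needed for the PDF is in the working directory
foreach file in tutorial.stmd tutorial.bib icon18.png stata18.png ///
    docs18.png _gpnupt.ado tweaks.tex {
    confirm file "`file'"   // stops with an error if the file is missing
}
```

If the loop finishes silently, all seven files are in place.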
In the source script I used the simple "one tab or four spaces" rule
to indent code that should be run through Stata. To list code that is
not to be run through Stata, for example to explain the syntax of a
while loop, I used code fences as follows:
```
while condition {
    ... do something ...
}
```
The code is rendered in HTML as a preformatted block, and in LaTeX as a
verbatim environment.
You will also note that I coded graphs using a caption-less figure, as in
![](scatter.png){.img-responsive .center-block}
The website uses the Bootstrap framework, and the two classes,
img-responsive and center-block, ensure that the figure is centered
and displays well on devices of varying sizes. One exception is an image
used to highlight version 18, where I used an img tag so it appears
only in the HTML version. Another is the screen capture of the Stata
interface, which I coded so it would appear in natural size in HTML and
using the full page width in LaTeX, by coding
<img src="stata18.png" class="img-responsive center-block"/>
\includegraphics[width=\linewidth]{stata18.png}
This takes advantage of the fact that Pandoc will pass along HTML and LaTeX code to the appropriate target format and ignore it otherwise.
I also collected all the bibliographic references in a BibTeX file, and
cited them all using the nocite convention. The YAML block listed
further below references the bibliography file and has a literal
"nocite" field.
To publish the HTML to my website I split it into five files, one per section.
I used caption-less figures because they appear nicely centered in the
HTML output, but unfortunately LaTeX will add a figure number to the
otherwise empty caption. This is easily avoided, however, using the LaTeX
command \usepackage[labelformat=empty]{caption}, which loads the
caption package with an option to suppress labels. This is the only
required tweak, and is easily added as part of the YAML block, but I
decided to add a few more and collect them in a file called
tweaks.tex. The YAML block used then reads
---
title: Stata Tutorial
author: Germán Rodríguez
date: June 2023
geometry: margin=1.25in
fontsize: 11pt
header-includes:
  - \input{tweaks.tex}
bibliography: tutorial.bib
nocite: |
  @*
---
If you list the tweaks.tex file you will see that it uses
- the caption package to suppress figure numbers, as noted above,
- the titling package to modify the rendering of the title and author. I
  wanted to use a large bold sans serif font and include a subtitle
  and the URL of the tutorial. I also wanted to add my affiliation to
  the author field.
- the sectsty package to modify the fonts used in all the section
  titles (including subsections, subsubsections, paragraphs and
  subparagraphs), using bold sans serif fonts of appropriate sizes.
- the fancyvrb package to modify verbatim blocks so they match
  exactly the stlog environments used for Stata output.
These are just aesthetic changes that do not affect the content of the
tutorial, but allow you to reproduce exactly the published file by
simply typing markstat using stataTutorial, pdf bib.
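If markstat is not yet set up on your machine, the steps would look roughly like the sketch below. It assumes that Pandoc and a LaTeX distribution are already installed. The two paths are placeholders for a Windows machine, not actual locations, and should be replaced with wherever the executables live on your system. The last line uses the script name from the download loop, tutorial.stmd; adjust it if your copy is named differently.

```stata
* Sketch: one-time setup, then generate the PDF
ssc install markstat            // the markstat command itself
ssc install whereis             // helper that records where external programs live

* Placeholder paths (assumptions): point these at your own installations
whereis pandoc   "c:/path/to/pandoc.exe"
whereis pdflatex "c:/path/to/pdflatex.exe"

* Run the script; pdf requests PDF output via LaTeX, bib resolves the citations
markstat using tutorial, pdf bib
```

The setup lines only need to be run once per machine; after that, the markstat line alone regenerates the document.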
Something else you may toy with when generating a PDF document is page
breaks. Having looked at the document, however, I decided that most of
it was all right. I just added a pagebreak to avoid a table being split
across pages, and tweaked the size of a couple of figures for a better fit.
Of course your page breaks may differ depending on fonts and other settings.
Note. The Stata Tutorial was first published in 2006 and targeted version 9, which makes the current version the 10th edition.