{smcl}
{* *! ver. 2.54 6May2023 by Mead Over, Center for Global Development: Section on programming UNDER CONSTRUCTION}{...}
{viewerjumpto "Introductions" "mead_favorites##introductions"}{...}
{viewerjumpto "Replicability: Documenting and logging your results" "mead_favorites##logfiles"}{...}
{* {viewerjumpto "Enhanced -collapse-" "mead_favorites##retainlbl"}}{...}
{viewerjumpto "Transformations, Functions and Expressions" "mead_favorites##expressions"}{...}
{viewerjumpto "Comparing data sets and variables" "mead_favorites##compare"}{...}
{viewerjumpto "Matrix utilities" "mead_favorites##matrices"}{...}
{viewerjumpto "Table making utilities" "mead_favorites##tables"}{...}
{viewerjumpto "Graphing Hints" "mead_favorites##graphs"}{...}
{viewerjumpto "Accessing CGD's Stata resources" "mead_favorites##cgdsite"}{...}
{viewerjumpto "Survival Analysis" "mead_favorites##survival"}{...}
{viewerjumpto "Power & sample size" "mead_favorites##sample"}{...}
{viewerjumpto "Predicted values from a non-linear regression" "mead_favorites##nonlinpredict"}{...}
{viewerjumpto "Categorical variables and labels" "mead_favorites##categorical"}{...}
{viewerjumpto "Categorical dependent variables" "mead_favorites##catdep"}{...}
{viewerjumpto "Workflow organization" "mead_favorites##workflow"}{...}
{viewerjumpto "Spatial analysis" "mead_favorites##spatial"}{...}
{viewerjumpto "Mediation analysis" "mead_favorites##mediation"}{...}
{viewerjumpto "Decomposition (e.g. Oaxaca)" "mead_favorites##decomposition"}{...}
{viewerjumpto "Plausible exogeneity" "mead_favorites##plausexog"}{...}
{viewerjumpto "Data Access Utilities" "mead_favorites##dataaccess"}{...}
{viewerjumpto "Programming an ADO file in Stata" "mead_favorites##makeado"}{...}
{viewerjumpto "User sites" "mead_favorites##usersites"}{...}
{viewerjumpto "Author of this help file" "mead_favorites##authors"}{...}
{hline}
help for {hi:Mead's Favorites}{right:{hi:Version 2.54 6May2023}}
{hline}
{marker introductions}{...}
{title:Introductions to Stata}
{p 4 4 2}
The best way to learn Stata is to take Stata's own web-based course
{browse "http://www.stata.com/netcourse/intro-nc101/":NC101}.
Stata maintains a compendium of other resources on its web site
{browse "http://www.stata.com/links/resources-for-learning-stata/":here}.
{p 4 4 2}
For beginners, some of the best web resources are the following tutorials:
{p 8 4 2}
From Princeton {browse "http://data.princeton.edu/stata/":here}.{p_end}
{p 8 4 2}
From the London School of Economics
{browse "https://www.lse.ac.uk/Methodology/Software-tutorials/Stata-tutorials":here}.{p_end}
{p 8 4 2}
From UCLA
{browse "https://stats.oarc.ucla.edu/stata/modules/":here}.{p_end}
{p 8 4 2}
From Wolfgang Ludwig-Mayerhofer
{browse "https://wlm.userweb.mwn.de/Stata/":here}.{p_end}
{p 8 4 2}
From Boston College
{browse "http://fmwww.bc.edu/GStat/docs/StataIntro.pdf":"Introduction to Stata"}{p_end}
{p 8 4 2}
From the University of Pennsylvania
{browse "https://guides.library.upenn.edu/stat_packages/stata":here}.{p_end}
{p 8 4 2}
From Stanford
{browse "https://stataproject.blogspot.com/":here}.{p_end}
{p 4 4 2}
Once you have started writing and debugging DO files, you might find the utility
{help do2screen} useful.
{p 4 4 2}
Chris Baum of Boston College has published a manual
on Stata programming available from Amazon or from Stata's online bookstore
{browse "https://www.stata.com/bookstore/introduction-stata-programming/":here}.
In addition to his introductory
{browse "http://fmwww.bc.edu/GStat/docs/StataIntro.pdf":157-slide set},
he has also posted
more advanced pedagogical PPTs on Stata programming listed below:
{p 8 4 2}
{browse "http://fmwww.bc.edu/GStat/docs/StataProg.pdf": Introduction to Programming in Stata}{p_end}
{p 8 4 2}
{browse "http://fmwww.bc.edu/GStat/docs/StataMLNL.pdf": Maximum Likelihood Estimation and Nonlinear Least Squares}{p_end}
{p 8 4 2}
{browse "http://fmwww.bc.edu/GStat/docs/StataMata.pdf": Stata's Matrix Programming Language, Mata}{p_end}
{p 8 4 2}
{browse "http://fmwww.bc.edu/GStat/docs/StataSimul.pdf": Monte Carlo Simulation in Stata}{p_end}
{p 8 4 2}
{browse "http://fmwww.bc.edu/EC-C/S2013/823/EC823.S2013.nn05.slides.pdf": Dynamic Panel Data Estimators in Stata}{p_end}
{p 8 4 2}
{browse "http://www.ncer.edu.au/events/documents/QUT15S1.slides.pdf": ADO-file programming in Stata (2015)}{p_end}
{marker logfiles}{...}
{title:Replicability: Documenting and logging your results}
{pstd}
One of the strengths of the Stata programming package is "replicability",
also sometimes known as "reproducibility". An essential component of
a strategy to assure reproducibility is the log file.
I have observed that many programmers write beautifully clear DO files
to document the process they used to arrive at research results,
but omit to make or publish the log file that would record how
Stata processed that DO file at the time it was executed.
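{pstd}
As a minimal sketch of this practice, the commands below wrap an analysis in a plain-text log.
The file names {cmd:myanalysis.log} and {cmd:myanalysis.do} are hypothetical placeholders, so substitute your own.
{p 8 12 2}{cmd:. log using myanalysis.log, replace text}{p_end}
{p 8 12 2}{cmd:. do myanalysis.do}{p_end}
{p 8 12 2}{cmd:. log close}{p_end}
{pstd}
Distributing the resulting log alongside the DO file lets readers see what Stata actually produced when the file was run.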
{pstd}
Perhaps users would be more likely to make and distribute log files if log files
were easier to read. The utility {help lineno2} adds line numbers
to a Stata {help log} file, whether it is in {help smcl} or plain-text ({cmd:.log}) format.
A {cmd:PDF} option is a shortcut to a PDF version of the SMCL file with line numbers.
With line numbers added, researchers on a Zoom call together can more easily
find the code or results to discuss.
{pstd}
For a more general discussion of the importance of reproducibility, see this Stata blog by
{browse "https://blog.stata.com/2020/06/04/revealed-preference-stata-for-reproducible-research/":Enrique Pinzon}.
{pstd}
More recently the Harvard Data Science Review has published an open article by Lars Vilhuber on
"Reproducibility and Replicability in Economics" available
{browse "https://hdsr.mitpress.mit.edu/pub/fgpmpj1l/release/3":here}
and in PowerPoint
{browse "https://www.youtube.com/watch?v=1hRbvKPeem4":here}.
{pstd}
Stata gives an overview of its new and old capabilities for assuring reproducibility and automated reporting:
{browse "https://www.stata.com/features/overview/truly-reproducible-reporting/":here}.
{pstd}
Many user-contributed Stata programs are designed to help users beautify Stata output and results.
A useful tool for adding HTML or LaTeX markup to a log file is {help log2markup},
by Niels Henrik Bruun. Also see the links in this help file under {help mead_favorites##tables:Table Making Utilities}.
{marker repshow}{...}
{p 4 4 2}
If you are using the {help replace} command to change selected values of a variable,
perhaps correcting errors in the data or specifying a specific value for a specific observation,
the danger exists that you will later forget exactly what the value used to be
and how you have changed it. The command {help repshow} is an enhancement of the {help replace} command
which will document in your results window, and thus in your log file, the value of
observations before and after they are replaced.
{marker genl}{...}
{p 4 4 2}
If you are in too much of a rush to label variables you are creating, the
ancient user-written command {help genl} will do it for you, thus helping
you document your logic. Or try the more recent and versatile {help labgen2},
which is part of the download package {search labutil2}. Fernando Rios-Avila
includes his own version of the same program as {search fgen} in his 2021
utility {search f_able}.
{* {marker retainlbl}}{...}
{* {title:Enhancements to Stata's {help collapse} command}} {...}
{* } {...}
{* {p 4 4 2}} {...}
{* Stata's {help collapse} command, replaces the variable labels with new labels } {...}
{* it constructs itself to denote the summary statistic in the -collapse-.} {...}
{* If you would prefer the variables retain their pre-collapse labels,} {...}
{* you can save and restore the labels using {help retainlbl}.} {...}
{* The program also works with {help reshape}.} {...}
{marker expressions}{...}
{title:Transformations, functions & expressions}
{p 4 4 2}
Suppose you would like to explore various transformations of variables
{cmd: x, y} and {cmd: z} and the relationships among these transformed
variables. The obvious approach is to use Stata's {help generate}
command to generate the various transformations and then explore the
transformed variables using various graphics or descriptive statistic
commands. The result is a dataset littered with all of these transformed
variables. An alternative is to use the clever program(s) written by
Jeroen Weesie entitled {help expr} and {help exprcmd}.
{p 4 4 2}
Beginning in Stata 11, Stata's {help fvlist:factor variables} provide another alternative
which works for powers of a variable and for interaction variables. Thus
instead of generating {cmd:x^2}, {cmd:x^3} or {cmd:x*y}, one can instead specify them as
{cmd:c.x#c.x}, {cmd:c.x#c.x#c.x} or as {cmd:c.x#c.y}. By using two #'s instead of one like this:
{cmd:c.x##c.y}, one can now specify the individual x and y variables as well as the product
between them.
{p 4 4 2}
See {help fvvarlist} for documentation and examples. For programming purposes,
the command {help fvrevar} converts a factor variable expression into a
variable list that can be used by commands that do not otherwise support {help fvlist:factor variables} or
time-series-operated variables. This
{browse "https://www.stata.com/support/faqs/programming/factor-variable-support/index.html":FAQ}
from Stata Corp. shows how to use
factor variable syntax in your own programs.
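{p 4 4 2}
A minimal sketch of {help fvrevar}, assuming the {cmd:auto} dataset that ships with Stata.
The variables are chosen only for illustration:
{p 8 12 2}{cmd:. sysuse auto, clear}{p_end}
{p 8 12 2}{cmd:. fvrevar c.weight##c.weight i.foreign}{p_end}
{p 8 12 2}{cmd:. display "`r(varlist)'"}{p_end}
{p 4 4 2}
The list stored in {cmd:r(varlist)} can then be passed to a command that does not accept factor-variable notation.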
{p 4 4 2}
An important advantage of using Stata's {help fvlist:factor variables} to represent non-linear functions
of variables on the right-hand-side of estimation commands is that Stata's {help margins}
is able to compute the proper impacts or derivatives of a right-hand-side variable
that appears in more than one factor-variable term. So to regress {it:y}
on a quadratic function of {it:x} and then estimate its derivative at its mean value,
one need only type:
{p 8 12 2}{cmd:. regress y c.x##c.x}{p_end}
{p 8 12 2}{cmd:. margins, dydx(x)}{p_end}
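{p 4 4 2}
The same pattern can be tried on real data. The following is a sketch only, assuming the {cmd:auto} dataset
that ships with Stata. The three values of {cmd:weight} in the {opt at()} option are arbitrary choices for illustration:
{p 8 12 2}{cmd:. sysuse auto, clear}{p_end}
{p 8 12 2}{cmd:. regress price c.weight##c.weight}{p_end}
{p 8 12 2}{cmd:. margins, dydx(weight)}{p_end}
{p 8 12 2}{cmd:. margins, dydx(weight) at(weight=(2000 3000 4000))}{p_end}
{p 4 4 2}
The second {cmd:margins} call evaluates the derivative at each of the three weights, which shows how the slope changes along the fitted quadratic.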
{p 4 4 2}
However {help fvlist:factor variables} are unable to represent transformations
other than simple polynomials. To compute {help margins} of variables transformed
in ways which cannot be represented by factor variables, see Fernando Rios-Avila's
{search f_able}, which is documented in his 2021 SJ article,
{browse "https://www.stata-journal.com/article.html?article=st0628":"Estimation of marginal effects for models with alternative variable transformations"}.
His {search f_able} program makes clever use of Stata's {help _ms_dydx_parse}
command to augment the estimated coefficient and variance-covariance matrices
returned by the estimation command so that they can be used by {help margins}.
{p 4 4 2}
Stata's built in functions are described in {help functions}. But the menu of
Stata's built-in transformations is substantially expanded by a large set
of {help egen:"egen" commands}. The user community has added to Stata's built-in
egen commands. Perhaps the two largest packages of user-written {cmd:egen}
commands are {help egenodd} and {help egenmore}, by Nick Cox and others and updated in 2019.
Since few {help egen:"egen" commands} have a {opt replace} option,
the user-written {help ereplace} may come in handy.
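{p 4 4 2}
To illustrate what an {cmd:egen} function adds, here is a minimal sketch using the {cmd:auto} dataset that ships with Stata.
The new variable names {cmd:mean_price} and {cmd:price_dev} are hypothetical:
{p 8 12 2}{cmd:. sysuse auto, clear}{p_end}
{p 8 12 2}{cmd:. bysort foreign: egen mean_price = mean(price)}{p_end}
{p 8 12 2}{cmd:. generate price_dev = price - mean_price}{p_end}
{p 4 4 2}
Each observation receives the mean price of its own group, so the deviation can be computed without collapsing the data.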
{p 4 4 2}
Those working with DHS data (from the Demographic and Health Survey) will appreciate
Ian Timaeus' utility,
{stata `"view net describe convertCMC, from("https://raw.githubusercontent.com/bugbunny/convertCMC/master")"':convertCMC},
which converts so-called CMC (Century Month Calendar) dates to Stata encoded dates.
An {help egen:"egen" command} to compute poverty measures is
{stata `"view net describe egen_inequal, from("http://fmwww.bc.edu/RePEc/bocode/e")"':egen_inequal}.
Accelerated versions of some {cmd:egen} commands are available in the packages
{help ftools} and {help gtools}.
See Kit Baum's PPT on
{browse "http://www.ncer.edu.au/events/documents/QUT15S1.slides.pdf":ADO-file programming in Stata}
for an introduction into writing your own {cmd:egen} utility.
{p 4 4 2}
While Stata's estimation commands are brilliant at allowing the user to
specify weights, Stata's approach to the task of constructing a weighted
average is to use the command {help ameans}, which approaches the task
as an estimation problem, designed to produce a scalar result.
Alternatively, for simple weighted means, one can use {help collapse},
but that approach has the disadvantage of replacing the data in memory with the collapsed data.
To construct a variable defined as the weighted or unweighted arithmetic,
geometric or harmonic mean by a grouping variable, it is often more convenient
to use the new {bf:egen} command, {help _gwmean:wmean}, by Gueorgui I. Kolev.
See the Statalist discussions
{browse "https://www.statalist.org/forums/forum/general-stata-discussion/general/1596529-arithmetic-geometric-and-harmonic-totals-and-arithmetic-geometric-and-harmonic-weighted-means#post1596529":here}
and
{browse "https://www.statalist.org/forums/forum/general-stata-discussion/general/1599985-the-weights-in-summarize-behave-not-as-advertised-in-the-manual-and-can-somebody-explain-frequency-and-analytic-weights-in-this-context#post1599985":here}.
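{p 4 4 2}
When {help collapse} is used for a simple weighted mean, the data in memory can be protected with {help preserve} and {help restore}.
A sketch, assuming the {cmd:auto} dataset and using {cmd:weight} as an analytic weight purely for illustration:
{p 8 12 2}{cmd:. sysuse auto, clear}{p_end}
{p 8 12 2}{cmd:. preserve}{p_end}
{p 8 12 2}{cmd:. collapse (mean) price [aweight=weight], by(foreign)}{p_end}
{p 8 12 2}{cmd:. list}{p_end}
{p 8 12 2}{cmd:. restore}{p_end}
{p 4 4 2}
After {cmd:restore} the original observations are back in memory, but the weighted means exist only in the listing, not as a variable.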
{title:Missing data}
{p 4 4 2}
See the October, 2008 FAQ from UCLA Academic Technology Services entitled
{browse "https://stats.oarc.ucla.edu/stata/faq/how-can-i-see-the-number-of-missing-values-and-patterns-of-missing-values-in-my-data-file/":"How can I see the number of missing values & patterns in my data file?"}.
Also see Cox's {search missings} and {search missingplot}.
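{p 4 4 2}
Official Stata also offers {help misstable} for a first look at missing values. A minimal sketch, assuming the {cmd:auto} dataset that ships with Stata:
{p 8 12 2}{cmd:. sysuse auto, clear}{p_end}
{p 8 12 2}{cmd:. misstable summarize}{p_end}
{p 8 12 2}{cmd:. misstable patterns}{p_end}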
{p 4 4 2}
For panel data, the following command may be useful for finding "holes"
in the data and filling them with imputed values: {help findholes}.
{p 4 4 2}
The command {help regmsng} allows the user to set a tolerance, {opt nmiss(#)},
for missing values for the right-hand-side variables in an ordinary linear regression.
For example, with {opt nmiss(2)}, variables with 1 or 2 missing values are retained,
while those with 3 or more missing values are dropped.
The missing values of retained variables are imputed with the user's choice of
several imputation techniques. The standard errors are not adjusted for the
imputation.
{p 4 4 2}
More rigorous approaches to the imputation of missing values are available in
{help hotdeck}, {help ice} and {help mim}. This 2007 FAQ on imputing values
in a panel data set provides an introduction to {help mim}:
{browse "http://www.ats.ucla.edu/stat/stata/faq/mi_longitudinal.htm"}.
{p 4 4 2}
Beginning with Stata 11, Stata provides a suite of programs dedicated to multiple
imputation of missing values. See {help mi}.
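{p 4 4 2}
A minimal sketch of the {help mi} workflow, assuming the {cmd:auto} dataset, in which {cmd:rep78} has missing values.
The imputation model, the number of imputations and the seed are arbitrary choices for illustration, not recommendations:
{p 8 12 2}{cmd:. sysuse auto, clear}{p_end}
{p 8 12 2}{cmd:. mi set mlong}{p_end}
{p 8 12 2}{cmd:. mi register imputed rep78}{p_end}
{p 8 12 2}{cmd:. mi impute regress rep78 price weight, add(5) rseed(12345)}{p_end}
{p 8 12 2}{cmd:. mi estimate: regress mpg rep78 weight}{p_end}
{p 4 4 2}
Unlike single imputation, {cmd:mi estimate} combines the results across the imputed datasets so that the standard errors reflect the imputation.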
{marker compare}
{title:Comparing data files and variables}
{p 4 4 2}
Like Stata's {help compare}, {help compare2} reports the differences and similarities
between two variables with different names located in the same dataset.
However, unlike compare, {help compare2} also returns stored results in Stata's {help return} space
for subsequent use by the programmer. With the added {opt reldif} option,
{help compare2} presents the summary statistics of the relative difference
between the two variables as computed by the Stata function {help reldif}.
See help for {help reldif}. For ease of use, {help compare2} has a companion dialog.
{p 4 4 2}
Stata's {help cf} command is a powerful tool for comparing
variables in a "master" data set in memory to
identically named variables in a saved data set on disk.
But {help cf} fails when the two data sets have
different numbers of observations
or when the only difference between two data sets is the way they are sorted.
{help cf2} is a wrapper for Stata's {help cf} which first sorts
the two datasets and then compares identically named variables
on only those observations that match according to the sorting variables.
For ease of use, {help cf2} has a companion dialog.
{p 4 4 2}
The commands {help cf} and {help cf2} report mismatches between variables
with the same name located in different data sets.
When the variables being compared are numerical and are both in the current data set,
{help compare} or {help compare2} provides a more complete analysis of differences.
To obtain the more detailed comparison in the style of {help compare}
for variables with the same names located in different datasets,
try the program {help compuse} which also pops up a graph of one version
of each variable against the version in the other dataset.
{p 4 4 2}
Both {help cf2} and {help compuse} have the required option {opt sortvars(varlist)}
to specify the variable or variables which
uniquely sort the two compared datasets.
Stata refers to such a set of variables as an "ID" variable,
while others refer to them as "key" variables.
Prior to executing either {help cf2} or {help compuse}, the user should confirm
that proposed sort variables do indeed uniquely identify the observations in both data
sets. Stata's {help isid} serves this purpose for both the master and, with the
{opt using} option, for the -using- dataset. Also see {help dta_equal}
and the community contributed commands {help assertky} and {help findunique}.
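{p 4 4 2}
The official commands in this workflow can be sketched as follows, assuming the {cmd:auto} dataset.
The file name {cmd:auto_copy} is a hypothetical placeholder, and the change to {cmd:price} is made only to create a mismatch:
{p 8 12 2}{cmd:. sysuse auto, clear}{p_end}
{p 8 12 2}{cmd:. isid make}{p_end}
{p 8 12 2}{cmd:. save auto_copy, replace}{p_end}
{p 8 12 2}{cmd:. replace price = price + 1 in 1}{p_end}
{p 8 12 2}{cmd:. cf _all using auto_copy, verbose}{p_end}
{p 4 4 2}
Here {cmd:isid} confirms that {cmd:make} uniquely identifies the observations, and {cmd:cf} reports the mismatch in {cmd:price}.
Because {cmd:cf} signals a mismatch with a nonzero return code, prefix it with {cmd:capture noisily} in a DO file that should continue running.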
{marker matrices}
{title:Matrix utilities}
{p 4 4 2}
Two kinds of matrices are available in Stata. From time immemorial, Stata has had its own
matrices and {help matrix:matrix utilities}. Since version 9 of Stata, the program
has also included a high-powered, super fast, compiled, matrix-based programming language called
Mata.
{p 4 4 2}
While a Stata matrix can be read by a Mata program and vice-versa, the two types of matrices
are quite distinct. For example, the elements of a Stata matrix can only be numeric,
while a Mata matrix can consist either entirely of numeric or entirely of string elements.
A second distinction is that the rows and columns of a Stata matrix have
alphanumeric (i.e. string) names and one can reference an element of a Stata matrix
by those row and column names. On the other hand, the rows and columns of a Mata matrix
only have numeric indices, not alphanumeric names and therefore must be referenced
by those numeric row and column indices.
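{p 4 4 2}
The difference in referencing can be sketched as follows. The matrix name {cmd:A} and the row and column names are hypothetical:
{p 8 12 2}{cmd:. matrix A = (1, 2 \ 3, 4)}{p_end}
{p 8 12 2}{cmd:. matrix rownames A = r1 r2}{p_end}
{p 8 12 2}{cmd:. matrix colnames A = c1 c2}{p_end}
{p 8 12 2}{cmd:. display el(A, rownumb(A, "r2"), colnumb(A, "c1"))}{p_end}
{p 8 12 2}{cmd:. mata: B = st_matrix("A")}{p_end}
{p 8 12 2}{cmd:. mata: B[2, 1]}{p_end}
{p 4 4 2}
The {cmd:display} line looks up the element by its row and column names, while the Mata copy {cmd:B} carries no names and
the same element must be addressed by its numeric indices.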
{p 4 4 2}
For Stata matrices, members of the Stata community have contributed programs that
complement the official Stata matrix utilities. A few of these collections are:
Jeroen Weesie's {search dm49:matfunc and varfunc} collections,
Nick Cox's collections {search dm69:from STB-50} and {search dm79:from STB-56},
Paul Millar's {help matsort:matsort} program,
Niels Bruun's {help matrixtools} and Ben Jann's {help moremata}.
{p 4 4 2}
Of these packages, {help matrixtools:matrixtools} works with Stata matrices
using a Mata library, {stata which lmatrixtools.mata:lmatrixtools}, while
{help moremata} is a broad collection of useful utilities for Mata matrices.
{p 4 4 2}
In Stata it is sometimes useful to use a matrix as a lookup table.
See Cox's
{browse "https://journals.sagepub.com/doi/pdf/10.1177/1536867X1201200413": Matrices as look-up tables}
for a general discussion and tips.
To assist with extracting an item or multiple items from either a Stata or a Mata matrix,
try my matrix lookup programs {help mlu:mlu} and {help mluwild:mluwild}.
{marker tables}
{title:Table making utilities}
{p 4 4 2}
Newly arrived is Stata 17's impressive set of table-making capabilities described
{browse "https://www.stata.com/new-in-stata/tables/":here}.
For those without access to Stata 17, the existing range of free community-contributed
programs described below provides wide functionality at zero cost.
{p 4 4 2} Also consider Stata's {help tabdisp} and {help table} commands. The former
is for making simple tables of individual data values, where no computations are required.
{help tabdisp} has the power to format information into "supercolumns" composed of
individual columns or "superrows" of individual rows. {help table} is a newer command which
inherits the "superrow" and "supercolumn" capability, but also performs computations. Scroll
down to the bottom of {help table}'s help file for a link to a video showing how it works.
{p 4 4 2}
Most table-making commands in Stata display results for categorical variables,
naming the rows and columns with Stata's "value labels". Utilities for
documenting and extracting these category labels are discussed
{help mead_favorites##categorical:below}.
{p 4 4 2}
Stata's {help tabstat} command is useful and flexible. Its two most glaring deficiencies
are its unattractive column labels (which consist only of the variable name or statistic name)
and the fact that its {cmd: format} option can produce column headings that are not
aligned with the columns of statistics. My hack of Stata's {help tabstat}, {cmd: tabstat2},
addresses these two problems. First it produces more attractive output when the {cmd: format}
option is selected by aligning the columns and the variable names. And {help tabstat2} also
has two additional options, {cmd: noheader} and {cmd: describe} which optionally suppress the standard
heading and add a listing of the variable names keyed to their variable labels. Sorry, no help file yet.
{p 4 4 2}
Ian Watson has contributed a powerful general purpose table-making command called
{help tabout}, which can be installed from SSC using the command:
{cmd:ssc install tabout, replace}. As of October 2022, a newer version
is available in beta from his web site
{browse "http://www.stata.com/netcourse/intro-nc101/":here}.
{p 4 4 2}
{ul:To create tables of estimation results}, the basic procedure has been to
collect the {help estimates} using the {help estimates store} command
and then display them in a table using {help estimates table}.
In version 17, Stata has introduced the new command {help etable},
which makes Stata's {help estimates table} command essentially obsolete.
It seems to me that {help etable} also renders obsolete all
the community contributed programs in this class, including those I discuss
in the remaining paragraphs in this section.
Stata's Jeff Pitblado provided an easy-to-follow introduction to {help etable} in a
{browse "https://www.stata.com/meeting/northern-european21/slides/Northern_Europe21_Pitblado.pdf":PowerPoint}
delivered at the mid-2021 Stata Users Group Meeting in Europe.
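{p 4 4 2}
The two approaches can be compared with a minimal sketch, assuming the {cmd:auto} dataset.
The stored names {cmd:m1} and {cmd:m2} and the display format are arbitrary, and the last line requires Stata 17 or later:
{p 8 12 2}{cmd:. sysuse auto, clear}{p_end}
{p 8 12 2}{cmd:. regress price weight}{p_end}
{p 8 12 2}{cmd:. estimates store m1}{p_end}
{p 8 12 2}{cmd:. regress price weight mpg}{p_end}
{p 8 12 2}{cmd:. estimates store m2}{p_end}
{p 8 12 2}{cmd:. estimates table m1 m2, b(%9.2f) se stats(N r2)}{p_end}
{p 8 12 2}{cmd:. etable, estimates(m1 m2)}{p_end}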
{p 4 4 2}
For years multiple community members have created programs that provide much more user-control
of the structure, content and formatting of tables of estimation results
than were available in {help estimates table}.
One of the first such commands is {help outreg} by Luke Gallup
{browse "https://www.stata.com/products/stb/journals/stb46.pdf":STB 46, Nov. 1998, pp. 28}.
In 2012, Gallup completely rewrote the program in Mata, now available
{stata `"view net describe sg97_5,from("http://www.stata-journal.com/software/sj12-4")"':here}.
Using its companion utility program {help frmttable},
{help outreg} produces tables for either MS Word or TeX, but not in Excel format.
For that purpose, available user-written commands include
{help xml_tab} by Misha Lokshin, of the World Bank,
{help outreg2} by Roy Wada and {help esttab} by Ben Jann,
who gives examples on the Excel page of his
{browse "http://repec.sowi.unibe.ch/stata/estout/esttab.html#h-10":web site}.
{p 4 4 2}
By invoking {help esttab}'s {opt using myfilename.rtf} option, the user can
export a table to MS Word. The MS Word page of Jann's extensive companion
{browse "http://repec.sowi.unibe.ch/stata/estout/esttab.html#h-11":web site}
suggests how the user can further modify {help esttab}'s RTF output with
{browse "https://metacpan.org/dist/RTF-Writer/view/lib/RTF/Cookbook.pod":RTF formatting commands}.
Note that, although both the {help esttab} and {help estout} programs have {opt using filename} options,
only {help esttab} will produce a properly formatted {cmd:RTF}, {cmd:CSV} or {cmd:TeX} file.
To customize the font or page orientation of the {cmd:RTF} file produced by {help esttab},
Ben Jann recommends (personal communication)
using the {help estout##substitute:substitute()} option
to add appropriate commands to the RTF code at the start of the document.
Since {help esttab} starts its {cmd:RTF} documents with the code string {cmd:"\deflang1033\plain\fs24"},
this is a good place to add additional formatting instructions.
An example of a substitution that accommodates a wide table in landscape orientation on US Letter paper (11 by 8.5 inches, or 15840 by 12240 twips)
is to use {help estout}'s option, {opt model:width(#)}, where {it:#} might be 8 or even smaller, together with
the option {opt sub:stitute()} specified as follows:
{p 4 4 2}
{cmd:. esttab using text.rtf, substitute("\deflang1033\plain\fs24" "\landscape\paperw15840\paperh12240\deflang1033\plain\fs24")}
{p 4 4 2}
To print onto A4 paper, the European standard, the {opt sub:stitute()} option would instead be:
{p 4 4 2}
{cmd:. esttab using text.rtf, substitute("\deflang1033\plain\fs24" "\landscape\paperw16834\paperh11909\deflang1033\plain\fs24")}
{p 4 4 2}
To also reduce the font sizes used in the {cmd:RTF} document, change
the font size twice, once for the table contents and once for the table notes.
For the content, change the font size from 12-point to 10-point by substituting "fs20" for "fs24".
Then change the font size in the notes by substituting "fs16" for "fs20" as follows:
{p 4 4 2}
{cmd:. esttab using text.rtf, substitute("\deflang1033\plain\fs24" "\landscape\paperw15840\paperh12240\deflang1033\plain\fs20" "\pard\ql\fs20" "\pard\ql\fs16" )}
{p 4 4 2}
To also change the margins, add to the landscape specification as follows:
{p 4 4 2}
{cmd:. esttab using text.rtf, substitute("\deflang1033\plain\fs24" "\landscape\paperw15840\paperh12240\margl1440\margr1440\margt1800\margb1800\deflang1033\plain\fs20" )}
{p 4 4 2}
While properly formatted tables of coefficients are the primary means of communicating
estimated regression coefficients, plotting those coefficients and their confidence
intervals helps your audience quickly grasp your results. Ben Jann's command {help coefplot}
is an extraordinarily useful program to plot coefficients and their confidence intervals.
This 2017 PPT by Dawn Koffman at Princeton's Office of Population Research gives a superb
{browse "https://opr.princeton.edu/workshops/Downloads/2017May_StataVisualizingRegressionModelsCoefplotKoffman.pdf":introduction to using -coefplot-}.
Another way to access and manipulate regression coefficients is available in Roger Newson's {help parmest}
program, described {help mead_favorites##nonlinpredict:below}.
{p 4 4 2}
A standard component of many journal articles is a table of descriptive statistics.
Such a table can usefully include not only the descriptive statistics on each independent variable,
but also the bivariate correlation coefficient between each independent variable and the dependent
variable. The program {help bivariate} makes a table of correlation coefficients and variance
inflation factors. Adding the {cmd: tabstat} option to {help bivariate} directs it to
also produce a table of means and standard deviations produced by Stata's {help tabstat}
command on the observations that would be included in the multiple regression.
{p 4 4 2}
For an interesting approach to visualizing a table with a Stata graph, see the discussion of
{help mead_favorites##tabplot:tabplot below}.
{p 4 4 2}
Version 15 of Stata introduced official utilities to export directly to MS Word or PDF.
See Stata's documentation on {help putdocx} and {help putpdf}. For MS Excel, see {help putexcel}.
Other recently contributed, user-written table-making commands include {search sumtable,net:sumtable},
{stata ssc describe table1:table1}, {stata ssc describe partchart:partchart}.
A recent ambitious entry in this category is {stata ssc describe asdoc:asdoc} by Attaullah Shah.
At this writing (4Jul2021), {help asdoc} competes neck-and-neck with Ben Jann's
{help estout} and Roy Wada's {help outreg2}. To check on this competition,
type {stata ssc whatshot, n(15):ssc whatshot, n(15)} at Stata's command prompt.
While Jann's programs export in the non-proprietary RTF and CSV formats,
which are compatible respectively with MS Word and MS Excel, Shah's {help asdoc}
exports directly to MS Word's .doc format, but does not export easily to Excel.
Shah has now released a commercial aftermarket version of his program called {cmd:asdocx}
which is sold on a dedicated
{browse "https://fintechprofessor.com/asdocx/":web site}
and has several additional features,
including export to .xlsx files.
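{p 4 4 2}
As a minimal sketch of the official {help putdocx} route mentioned above, assuming Stata 15 or later and the {cmd:auto} dataset.
The table name {cmd:results} and the file name {cmd:myresults.docx} are hypothetical:
{p 8 12 2}{cmd:. sysuse auto, clear}{p_end}
{p 8 12 2}{cmd:. regress price weight mpg}{p_end}
{p 8 12 2}{cmd:. putdocx begin}{p_end}
{p 8 12 2}{cmd:. putdocx paragraph}{p_end}
{p 8 12 2}{cmd:. putdocx text ("Regression of price on weight and mpg")}{p_end}
{p 8 12 2}{cmd:. putdocx table results = etable}{p_end}
{p 8 12 2}{cmd:. putdocx save myresults.docx, replace}{p_end}
{p 4 4 2}
Here {cmd:etable} is a {cmd:putdocx table} keyword that exports the coefficient table of the most recent estimation command.
It is distinct from the Stata 17 command {help etable}.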
{marker graphs}{...}
{title:Graphing hints and favorites}
{p 4 4 2}
The overall look of a Stata graph is controlled by the "scheme" the user employs.
See {help schemes} for an introduction and {help scheme_option} for the syntax
to alter the scheme of any {help graph} command. From inside your Stata,
type {stata graph query, schemes} to see a list of the schemes installed on your
instance of Stata.
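{p 4 4 2}
A scheme can be applied to a single graph or set as the default. A minimal sketch, assuming the {cmd:auto} dataset
and two schemes that ship with Stata:
{p 8 12 2}{cmd:. sysuse auto, clear}{p_end}
{p 8 12 2}{cmd:. scatter price weight, scheme(s1mono)}{p_end}
{p 8 12 2}{cmd:. set scheme s2color}{p_end}
{p 4 4 2}
The {opt scheme()} option affects only that graph, while {cmd:set scheme} changes the default for subsequent graphs.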
{p 4 4 2}
The prolific Stata program contributor, Ben Jann, has written a powerful
command to facilitate the user's customization of their own personal
(or institutional) Stata graphics style. This program, called {help grstyle}
is documented in two Stata Journal articles
{browse "https://www.stata-journal.com/article.html?article=gr0073":here} (gated until 2021) and
{browse "https://www.stata-journal.com/article.html?article=gr0073_1":here} (gated until 2022) and
explained in a Powerpoint presentation posted
{browse "https://www.stata.com/meeting/switzerland18/slides/switzerland18_Jann.pdf":here}
(ungated). He has also published companion programs {help palettes},
{help colorpalette}, {help symbolpalette} and {help linepalette}.
His Stata Journal article entitled
{browse "https://www.stata-journal.com/article.html?article=gr0075":Color palettes for Stata graphics}
(gated until 2022) is the best available guide to user accessible portfolios of palettes.
{p 4 4 2}
In his article, "Color palettes for Stata graphics", Jann reviews many different schemes.
The scheme {stata search lean:lean} from the Stata Journal conserves on unnecessary "ink".
Two user-written branded color schemes with interesting color palettes are the
{stata search scheme-mrc:MRC scheme} and the {stata search scheme-tfl:TFL scheme}.
Color schemes popular at CGD include Stata's built-in scheme,
{help scheme_s1rcolor:s1rcolor}, which uses a black background and
{stata search scheme-s2clr_on_white:s2clr_on_white}, produced here at CGD.
As its name implies, the {stata search scheme-s2clr_on_white:s2clr_on_white} scheme
replaces the light-blue backgrounds with white backgrounds. Install
{stata search scheme-s2clr_on_white:s2clr_on_white} from CGD's Stata repository
{stata "view net describe scheme_s2clr_on_white, from(http://digital.cgdev.org/doc/stata/MO/Misc)":here}.
New abstemious schemes to consider are: {stata search scheme-plotplain:here}.
{p 4 4 2}
Stata's {help graph} command is powerful but a user often struggles to create
a graph with the desired appearance and labeling. It's often helpful to construct
an initial version of the desired graph interactively using the dropdown dialogue
box under the graphs menu. When a dialogue box is "submit"ted, the code to generate the
same result is typed into Stata's results window. By creating a {help do} file,
and saving this code into that file, the user has the basics of the desired graph.
Then by elaborating on the saved code, saving and executing the DO file, and then
altering it again, the user can achieve virtually any desired graph.
{p 4 4 2}
The box-and-whisker graph constructed by {help graph_box} is a powerful descriptive tool.
However, Stata does not make it easy to learn the numeric values of the "adjacent values"
at which the command constructs the "whiskers". The command {help sumadj} is the answer
and also offers a log scale option.
{p 4 4 2}
If the range of a variable you are graphing is in the millions or billions,
the graph will be ugly unless you rescale the variable. Of course you can -gen- a new
variable for your graph. An alternative is to use the {help rescale} command,
which creates temporary versions of your variables which have been rescaled
by a power of ten. Instead of rescaling the variables, one can simply rescale
the variable labels. Utilities to facilitate the construction of "nice" axis labels
include {help nicelabels} and {help mylabel}.
Nick Cox's {help mylabels} can add the prefix "$" or the suffix "%" to the axis labels.
To label the axis of a log-scaled variable consider the program {help niceloglabels}.
{p 4 4 2}
Jeroen Weesie's utility {search graphf} assembles the predicted score from
the variables in a regression which all vary together, such as x, x^2 and x^3
or the various variables of which a spline is composed. My version {help graphf2}
additionally returns the functions used to construct the score so that the user
can construct it him/herself. Stata's official {help margins} command and the accompanying
{help marginsplot} command subsume some of this functionality for non-linear functions
expressed in Stata's {help factor_variable}s syntax.
{p 4 4 2}
There are two possible approaches to constructing a graph with multiple panels.
The most efficient way is to use a single {help graph} command, adding the {help by option}.
But not all graphs will accept a {help by option}. And sometimes, even for commands
that have a by option, a multiple-step alternative is easier:
execute and save a separate graph command for each panel and then assemble them into a
single multi-panel image using the command {help graph combine}.
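{p 4 4 2}
As a minimal sketch of the two approaches, using the auto data shipped with Stata
(the graph names {cmd:g_dom} and {cmd:g_for} are arbitrary choices made for this illustration),
the first {cmd:scatter} command below builds both panels in one step with {cmd:by()},
while the last three commands build the panels separately and then assemble them:
{p 6 8 2}
{cmd:sysuse auto, clear}
{p 6 8 2}
{cmd:scatter mpg weight, by(foreign)}
{p 6 8 2}
{cmd:scatter mpg weight if foreign==0, title("Domestic") name(g_dom, replace)}
{p 6 8 2}
{cmd:scatter mpg weight if foreign==1, title("Foreign") name(g_for, replace)}
{p 6 8 2}
{cmd:graph combine g_dom g_for, ycommon xcommon}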
{p 4 4 2}
A multi-panel graph is more legible if all panels share the same legend.
When the multiple panels are constructed with a single {help graph} command and
the {help by option}, controlling the placement and contents of the legend is tricky,
requiring that some aspects of the legend be specified {cmd:inside} the {help by option}
and other aspects {cmd:outside} the {help by option}. See the help file discussion
{help legend_options##_use_of_legends_with_by:here}.
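{p 4 4 2}
For example, in the following sketch (one plausible arrangement, not the only one) the contents
of the legend are set in the usual {cmd:legend()} option, while its placement is set in the
{cmd:legend()} suboption inside {cmd:by()}. The legend text is invented for the illustration:
{p 6 8 2}
{cmd:sysuse auto, clear}
{p 6 8 2}
{cmd:twoway (scatter mpg weight) (lfit mpg weight), by(foreign, legend(position(6))) legend(order(1 "Observed" 2 "Linear fit") rows(1))}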
{p 4 4 2}
When a multi-panel graph is constructed by using {help graph combine} to assemble
several saved graphs into a single image, a useful utility for assuring there is only
a single legend is the utility {help grc1leg} written by Vince Wiggins of Stata Corp,
which can be obtained
{stata "view net describe grc1leg, from(http://www.stata.com/users/vwiggins)":here}.
But also see my own improved version, {help grc1leg2}.
{p 4 4 2}
To graph an arbitrary function in Stata, without necessarily having data, try the following command:
{p 4 4 2}
{stata twoway (function y=log(x)*sin(x)) (function z=x*cos(x)) }
{p 4 4 2}
See {help twoway_function} or page 49ff of
{browse "http://fmwww.bc.edu/GStat/docs/StataIntro.pdf":Chris Baum's 2011 PowerPoint}.
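{p 4 4 2}
By default {cmd:twoway function} evaluates the function over the range 0 to 1.
The following sketch sets the range explicitly and labels the two curves
(the functions, the range and the legend text are choices made for this illustration):
{p 6 8 2}
{cmd:twoway (function y=ln(x), range(1 10)) (function y=sqrt(x), range(1 10)), legend(order(1 "ln(x)" 2 "sqrt(x)"))}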
{p 4 4 2}
Isoclines and level sets of various sorts, such as isoquants, production possibility frontiers,
isocost curves or temperature or altitude contours, can be plotted
in recent versions of Stata using {help twoway contour}.
A contour mapping program that works in version 10.2 of Stata is
{help plotmatrix} written by Adrian Mander. Get his most recent version from
{net "describe plotmatrix, from(http://fmwww.bc.edu/repec/bocode/p)":plotmatrix}.
Sergiy Radyakin of the World Bank has also written a contour plotting program called
{cmd:matrixplot} which is described in this Powerpoint presentation
"{browse "http://www.stata.com/meeting/dcconf09/dc09_radyakin.pdf":{it:Implementing custom graphics in Stata}}"
and demonstrated in the figures and videos available
in a zip file: {browse "http://www.stata.com/meeting/dcconf09/dc09_radyakin.zip":dc09_radyakin.zip}.
Unfortunately Sergiy never distributed the command itself.
A different approach is available with my {help isoplot}.
{p 4 4 2}
A nice utility for constructing histograms with an overlaid normal distribution is {search historaj, all:historaj}.
My utility
{stata "view net describe hist_overlay, from(http://digital.cgdev.org/doc/stata/MO/Misc)":hist_overlay}
overlays one histogram with a differently shaded second histogram so that both
histograms can be distinguished. (Overlaying more than two histograms will rarely be useful.)
{p 4 4 2}
To display all, or a subset of, the graphs in memory or in the current folder, try {help displaygph}.
{p 4 4 2}
Stata's example of how to use their {help graph combine} command is a
scatter plot enhanced by the addition of marginal distributions to the x- and
y-axes. It's gorgeous to look at but tedious to program. My utility
{help superscatter} automates this process and provides additional options
and enhancements. Find it
{stata "view net describe superscatter, from(http://digital.cgdev.org/doc/stata/MO/Misc)":here}.
{p 4 4 2}
Suppose you need to make a graph that compares two rankings of the same set of
objects. Nick Cox's hints in his 2009 SJ article
{browse "https://www.stata-journal.com/sjpdf.html?articlenum=gr0041":"Paired, parallel, or profile plots for changes, correlations, and other comparisons"}
are just what you need.
My program
{stata "view net describe rankplot, from(http://digital.cgdev.org/doc/stata/MO/Misc)":rankplot}
offers an enhanced implementation of one of his algorithms.
{p 4 4 2}
The visualization known as a "chord plot" seems to be gaining popularity.
Prefiguring this interest was a
{browse "https://www.statalist.org/forums/forum/general-stata-discussion/general/5236-polar-plots":discussion}
on Statalist back in 2014. In the last 2014 comment on that thread,
Joe Canner promised a command called
{stata "view net describe polar, from(http://fmwww.bc.edu/RePEc/bocode/p)":polar}
which is now available at SSC. Soon thereafter a discussion on
{browse "https://stats.stackexchange.com/questions/163295/linear-equivalent-to-chord-diagram":Stack Exchange}
suggested that the chord diagram is a member of a
class of non-linear flow diagrams called "Kriskograms"
and asked whether there are linear equivalents to chord diagrams.
On the latter question, a user cited the 2009 SJ article by Nick Cox
{browse "https://www.stata-journal.com/sjpdf.html?articlenum=gr0041":{it}op. cit.{sf}}.
(See also {help rankplot}.)
{browse "https://asjadnaqvi.medium.com/":Asjad Naqvi} on the platform
{browse "https://medium.com/":Medium}
presents a series of Stata tutorials. I particularly like his
{browse "https://medium.com/the-stata-guide/stata-graphs-polar-radial-plots-c19e705b56aa":tutorial}
on using Stata's graph engine with polar coordinates.
{p 4 4 2}
Stata has facilitated adding special characters to graph text such as titles,
marker symbols, etc. For example, the "degree sign" can be inserted in a graph legend
as "45{°ree} line" and appears like this "45{c 176} line" and the symbol
denoting "function of" can be inserted like this "{&function}{c -(}it:(x){c )-}"
and appears like this "{c 131}{it:(x)}". Many symbols are available including
all the small and capital Greek letters and letters from non-Roman alphabets.
Nick Cox's tip from 2004 is still useful and can be found
{browse "https://www.stata-journal.com/sjpdf.html?articlenum=dm0006":here} (ungated).
More recently, Stata has added a help file explicitly for this purpose. See
{help graph_text:help graph_text}. For background, see {help smcl##ascii:help smcl}.
{p 4 4 2}
{marker tabplot}
When working with two categorical variables, a scatter plot does not accurately
convey their joint distribution. {help scatter}'s option {cmd:jitter} can help.
A more useful solution is Nick Cox's program
{help tabplot} which is available from the Stata journal repository
{search tabplot:here}
and from SSC {stata ssc describe tabplot:here}.
Also see his 2004 article on graphing categorical data
{browse "https://www.stata-journal.com/article.html?article=gr0004":"Speaking Stata: Graphing categorical and compositional data"}.
{p 4 4 2}
The "box-and-whisker" plot is a useful non-parametric graph of a distribution.
See {help graph box} for details and for the definitions of the key elements
of the graph including the "adjacent values". Unfortunately, Stata does not
include a command to compute the numerical values of the adjacent values.
I have written the command {help sumadj} to add those computations to the
usual output of Stata's {help summarize:summarize, detail} command. With the
{cmd:graph} option, the command will also draw the box-and-whisker plot
and label the adjacent values. To depict a highly skewed distribution,
such as those typical of expenditure data,
add the option {cmd:ylog} which rescales the
y-axis to a log scale, an option that is not available in the current version of
Stata's {help graph box} command.
(A more general approach to depicting non-symmetric distributions is embodied
in Ben Jann's command {stata ssc describe robbox:robbox} (for "robust box").)
Other utilities for calculating "adjacent values" include Nick Cox's
{help adjacent} and the sub-commands {cmd:adjl(varname)} and {cmd:adju(varname)}
in the package of additional {help egen} commands {help egenmore}.
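{p 4 4 2}
If none of these utilities is installed, the adjacent values can also be computed by hand
from the results returned by {cmd:summarize, detail}. The sketch below assumes the usual
definition (the most extreme observations lying within 1.5 interquartile ranges of the quartiles)
and assumes that the quartiles reported by {cmd:summarize} match those used by {cmd:graph box}.
The local macro names {cmd:upper} and {cmd:lower} are arbitrary:
{p 6 8 2}
{cmd:sysuse auto, clear}
{p 6 8 2}
{cmd:quietly summarize mpg, detail}
{p 6 8 2}
{cmd:local upper = r(p75) + 1.5*(r(p75) - r(p25))}
{p 6 8 2}
{cmd:local lower = r(p25) - 1.5*(r(p75) - r(p25))}
{p 6 8 2}
{cmd:quietly summarize mpg if mpg >= `lower' & mpg <= `upper'}
{p 6 8 2}
{cmd:display "lower adjacent value = " r(min) ", upper adjacent value = " r(max)}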
{p 4 4 2}
Stata graphs can be animated! See Chuck Huber's 2014 blog
{browse "https://blog.stata.com/2014/03/24/how-to-create-animated-graphics-using-stata/":"How to create animated graphics using Stata"}
and Di Liu's blog
{browse "https://blog.stata.com/2018/03/06/how-to-create-animated-graphics-to-illustrate-spatial-spillover-effects/":"How to create animated graphics to illustrate spatial spillover effects"}.
Laura Whiting's 2019 blog explains how to create a GIF file from Stata graphs:
{browse "https://www.techtips.surveydesign.com.au/post/visualisation-creating-a-moving-stata-graphic":"Visualisation - Creating a Moving Stata Graphic"}.
{p 4 4 2}
Many publishers insist on reconstructing an author's graphs from the original data. They ask the author to provide the data.
For simple graphs this is not a problem. But some Stata graphics commands such as {help twoway qfit} create their own "data".
In this and other situations it is useful to be able to export the data that resides in a {help gph files:Stata gph file} to a spreadsheet program.
Two programs that perform some of the work towards this end are my own {help gph2xl} and Ben Daniels' {cmd:gph2csv}.
{cmd:gph2csv} can be installed from GitHub by clicking
{stata "github search gph2csv":here}.{p_end}
{marker survival}{...}
{title:Survival analysis}
{p 4 4 2}
See Stephen Jenkins's course material at:
{browse "https://www.iser.essex.ac.uk/resources/survival-analysis-with-stata"}
{p 4 4 2}
In addition to the extensive set of training materials listed when you {stata search survival},
see Coviello et al's 2015 SJ article,
{browse "https://journals.sagepub.com/doi/pdf/10.1177/1536867X1501500111":"Estimating net survival using a life-table approach"},
which documents the command {stata ssc describe stnet:stnet}.
The authors have updated their command several times, most recently in 2020.
Also check out Hills et al's 2014 article
{browse "https://www.stata-journal.com/article.html?article=st0330":"strel2: A command for estimating excess hazard and relative survival in large population-based studies"}.
Their command,
{stata `"view net describe st0330, from("http://www.stata-journal.com/software/sj14-1")"':strel2},
can be downloaded from Stata Journal's software site,
but does not seem to have been updated since 2014.
{marker cgdsite}
{title:Installing CGD programs in Stata}
{p 4 4 2}
Since May, 2009, it is possible for anyone with a copy of Stata
anywhere in the world to install some CGD programs directly
from CGD's server. Click {net "from http://digital.cgdev.org/doc/stata":here}.
{p 4 4 2}
Or in the command window of Stata, type:
{p 6 4 2}
{cmd:view net from "http://digital.cgdev.org/doc/stata"}
{p 4 4 2}
or in the viewer just type:
{p 6 4 2}
{cmd: net from "http://digital.cgdev.org/doc/stata"}
{p 4 4 2}
and follow directions. The advantage of installing utilities using Stata's {help net} approach
is that you can automatically look for updates to all your installed utilities
by just typing {help ado update}. (In Stata 15 or earlier, {help adoupdate}.)
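{p 4 4 2}
For example, in Stata 16 or later the first command below reports which installed
community-contributed packages have updates available, and the second installs those updates:
{p 6 8 2}
{cmd:ado update}
{p 6 8 2}
{cmd:ado update, update}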
{p 4 4 2}
Programs or packages on CGD's external site can be found by typing:
{p 6 4 2}
{cmd: findit {it:program_name}}
{p 4 4 2}
or
{p 6 4 2}
{cmd: search {it:program_name}}
{p 4 4 2}
Stata help on maintaining this system can be found at: {help usersite}
{title:Directory tools by Ulrich Kohler and Nick Winter}
{p 4 4 2}
Dirtools is a package of directory navigation tools for use inside Stata.
To install, type -ssc install dirtools-.
Once installed, type {help dirtools} for an index to the different programs.
Some people may prefer to navigate using Nick Winter's package -fastcd-,
which can be installed by typing -ssc install fastcd-.
The package -dirtools- is intended to be compatible and to build on the package
-fastcd-, but in my experience the two packages are not compatible
in more recent versions of Stata. Choose one or the other.
{marker sample}{...}
{title:Sample size & power calculation for simple and complex sample designs}
{p 4 4 2}
Beginning with version 13, Stata has an entire manual
called {mansection PSS intro:PSS}, which is devoted to computing
statistical {help power} and designing samples.
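{p 4 4 2}
As a small illustration (the effect size, standard deviation and power are invented for the example),
the following command computes the sample size needed to detect a difference of half a standard
deviation between two group means with 80 percent power at the default 5 percent significance level:
{p 6 8 2}
{cmd:power twomeans 0 0.5, sd(1) power(0.8)}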
{p 4 4 2}
In addition, several users have written sample design and power calculation programs
for more complex situations. For example, {help nstage} is a user-written suite
of programs for multi-arm, multi-stage sample designs. The suite was originally written in 2009;
Patrick Royston and coauthors have since posted an updated version, which you can
download
{net "describe nstage, from(http://www.homepages.ucl.ac.uk/~ucakjpr/stata)":here}.
{p 4 4 2}
A new user-written menu-driven package for cluster designs is {help clustersampsi},
which was updated in 2014 on the Stata Journal's site. Install the updated version
{net "describe st0286_1, from(http://www.stata-journal.com/software/sj14-3)":here}.
{p 4 4 2}
Stata has also made it easy for users to write their own special purpose power commands
as is documented in the {mansection PSS intro:PSS} manual and in this
help file: {help power_userwritten}.
{p 4 4 2}
The old command {help powerreg} and the 2011 command {help sampsi_reg} extend Stata's {help sampsi} analysis to
regression models. The 2014 commands by Aberson go further; see
{browse "http://www.stata-journal.com/article.html?article=st0342": Aberson (2014)}, while those by
{browse "http://www.stata-journal.com/article.html?article=st0329": Batistatou, Roberts and Roberts (2014)}
handle cluster designs.
{marker nonlinpredict}
{title:Computing and plotting predicted values from a nonlinear regression}
{p 4 4 2}
Stata 11's new {help fvlist:factor variables} are described under the heading "Transformations & Expressions" above.
Using the factor variable notation allows one to perform a regression on a cubic polynomial as follows:
{p 6 4 2}
{cmd:sysuse auto, clear}
{p 6 4 2}
{cmd:gen kprice = price/1000}
{p 6 4 2}
{cmd:lab var kprice "Price of car in thousands of dollars"}
{p 6 8 2}
{cmd:regress mpg c.kprice##c.kprice##c.kprice i.foreign i.rep78 turn trunk headroom}
{p 4 4 2}
where {cmd:mpg} is the car's mileage in miles per gallon and {cmd:kprice} is the selling price in thousands of dollars.
{p 4 4 2}
Now suppose we want to compute and graph the fitted value of {cmd:mpg} as a function of {cmd:kprice},
holding constant the other right-hand-side variables at their means (or at some other user-selected values).
Knowing that price varies in the data from 3 to 15 thousand dollars,
we can use Stata 11's {help margins} command to compute the values of the predicted value
at each integer value of kprice from 3 to 15.
{p 6 8 2}
{cmd:margins, at(kprice=(3/15)) vsquish post}
{p 4 4 2}
Then following the advice of Jeff Pitblado of StataCorp (see {browse "http://www.stata.com/news/statanews.25.3.pdf"} ),
we can use Roger Newson's {help parmest} command to generate a new data set
of the predicted values.
{p 6 8 2}
{cmd:parmest, norestore evec(at) rename(ev_1 kprice)}
{p 4 4 2}
(N.B.: By adding the {cmd: evec(at) rename(ev_1 kprice)} options, one can
avoid re-generating the kprice variable.) Now we can graph the
predicted values against -kprice- like this:
{p 6 8 2}
{cmd:twoway rarea max95 min95 kprice, pstyle(ci) || line estimate kprice}
{p 4 4 2}
While the {help parmest} command remains useful as a general tool to extract coefficient estimates,
standard errors and confidence intervals, Ben Jann's program {help coefplot},
discussed in greater detail {help mead_favorites##tables:above}, is more powerful if one's objective is to
plot the coefficients themselves.
{p 4 4 2}
Patrick Royston has since contributed to the Stata Journal
{browse "https://journals.sagepub.com/doi/pdf/10.1177/1536867X1301300305":issue 13:3}
an elegant command which automates the above process and
has the advantage of leaving the data intact. His command
{search marginscontplot2,net:marginscontplot2},
which is the 2018 update of his original command
{cmd:marginscontplot},
can be used to produce the same graph
as the above code using {cmd:parmest}. The complete sequence of commands,
starting from the original data, is:
{p 6 8 2}
{cmd:sysuse auto, clear}
{p 6 8 2}
{cmd:gen kprice = price/1000}
{p 6 8 2}
{cmd:lab var kprice "Price of car in thousands of dollars"}
{p 6 8 2}
{cmd:regress mpg c.kprice##c.kprice##c.kprice i.foreign i.rep78 turn trunk headroom}
{p 6 8 2}
{cmd:marginscontplot2 kprice, ci}
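{p 4 4 2}
A third route uses only official commands (Stata 12 or later). Rather than posting the
{help margins} results and extracting them with {cmd:parmest}, one can pass them directly
to {help marginsplot}. The sketch below assumes the cubic regression above has just been estimated.
The options {cmd:recast(line)} and {cmd:recastci(rarea)} are one choice for drawing the fitted
curve with a shaded confidence band:
{p 6 8 2}
{cmd:margins, at(kprice=(3/15)) vsquish}
{p 6 8 2}
{cmd:marginsplot, recast(line) recastci(rarea)}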
{p 4 4 2}
Additional documentation on the use of {help marginscontplot}
and {help marginscontplot2}, especially in combination with {help mkspline},
is available from Richard Williams (2021)
{browse "https://www3.nd.edu/~rwilliam/stats3/Margins03.pdf":here}.
{p 4 4 2}
A different and sometimes more useful approach to estimating the predicted values from estimation results
with non-linear terms in the right-hand-side variables is presented in
Rios-Avila's {help f_able} command which is discussed in greater detail
above under the section of this help file on
{help mead_favorites##expressions:Transformations, functions & expressions}.
{marker categorical}{...}
{title:Working with categorical variables & value labels}
{p 4 4 2}
Some variables are measured continuously, such as age, GDP or temperature. Continuous
variables are natural to Stata. Stata calls all such variables {help datatypes:"numeric"}.
Stata's basic statistical commands like {help summarize} or {help regress}
work naturally with numeric variables.
{p 4 4 2}
Other variables are categorical, such as sex or country. Categorical variables
are most naturally represented by text such as "Male"/"Female" or
"Denmark"/"Chile"/"Botswana"/"Vietnam"/... .
For non-numeric variables, Stata has a type of variable called a
"string variable". If you attempt to apply a numerical statistical operation
such as {help summarize} or {help regress} to a string variable, Stata will
complain that you have "no observations".
{p 4 4 2}
While Stata can do many things with string variables, sometimes it needs a numerical
variable in order to proceed. And sometimes naturally categorical variables like "country"
or "religion" are stored as integers in a survey data repository. In these cases,
it is essential to know how to convert a string variable to a numerical code
and then to attach a "value label" to each numerical code so generated.
Furthermore, when importing data from another statistical package,
it's often necessary or desirable to rationalize the value labels so that
Stata can best use and display them.
{p 4 4 2}
Stata has a set of utilities to address these issues. See for example, Stata's
help for {help encode}, {help decode}, {help labelbook},
{help _labels2names}, {help _strip_labels} and {help _restore_labels}.
To manipulate value labels in more sophisticated ways,
see the user-written packages of utilities {help labutil},
{help labutil2}, {help labeldup} and {help labelrename}.
New in SJ 19(4) is Dan Klein's {cmd: elabel} package.
The latest version is available on the SSC site
{net "describe elabel, from(http://fmwww.bc.edu/RePEc/bocode/e)":here}.
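{p 4 4 2}
A minimal round trip between a string variable and a labeled numeric variable, using the auto data
(the new variable names {cmd:origin} and {cmd:origin_code} are invented for this sketch):
{p 6 8 2}
{cmd:sysuse auto, clear}
{p 6 8 2}
{cmd:decode foreign, generate(origin)}
{p 6 8 2}
{cmd:encode origin, generate(origin_code)}
{p 6 8 2}
{cmd:label list origin_code}
{p 6 8 2}
{cmd:tabulate origin_code, nolabel}
{p 4 4 2}
Note that {cmd:encode} numbers the categories 1, 2, ... in alphabetical order, so {cmd:origin_code}
takes the values 1 and 2 even though the original variable {cmd:foreign} is coded 0 and 1.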
{p 4 4 2}
When the data contain two variables that identify the same units,
one of which is numeric and the other string,
the command {help labmask} is particularly useful. As an example, consider the
{browse "https://www.iso.org/iso-3166-country-codes.html":International Standard for country codes},
which provides a three-digit numeric code for each country.
From this source, one can create two Stata variables,
one of which is the "short country name" and the other a three-digit ISO-certified numeric code.
Suppose you name the Stata variables {it:cname} and {it:ccode}.
When Stata insists on a numeric code for the country,
one could use Stata's {help encode} command to create a numeric version of
the string variable, {it:cname}. But the numeric variable so created would be different
from the official three-digit ISO code. (For example, {help encode} would assign
Afghanistan the numeric code "1", while the official ISO code for
Afghanistan is actually "4".)
{help labmask} is a simple solution to this problem.
To prevent confusion from having
two numeric codes for the same country, one would instead use the command:
{p 8 4 2}
{bf:labmask {it:ccode}, values({it:cname}) lblname({it:cname})}
{p 4 4 2}
which creates a set of value labels called {it:cname} and attaches them as value labels
to the numeric variable {it:ccode}. (Hat tip to Julian Duggan.)
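{p 4 4 2}
A self-contained illustration with three countries and their ISO numeric codes follows.
This sketch assumes that {cmd:labmask}, which is part of Nick Cox's {cmd:labutil} package on SSC,
is already installed:
{p 6 8 2}
{cmd:clear}
{p 6 8 2}
{cmd:input int ccode str10 cname}
{p 6 8 2}
{cmd:72 "Botswana"}
{p 6 8 2}
{cmd:152 "Chile"}
{p 6 8 2}
{cmd:208 "Denmark"}
{p 6 8 2}
{cmd:end}
{p 6 8 2}
{cmd:labmask ccode, values(cname) lblname(cname)}
{p 6 8 2}
{cmd:tabulate ccode}
{p 4 4 2}
After {cmd:labmask}, the table displays the country names while {it:ccode} retains its ISO values,
as {cmd:tabulate ccode, nolabel} will confirm.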
{p 4 4 2}
Sometimes one needs to map ranges of a numeric variable into values of a categorical variable.
A shortcut for this procedure is to use Stata's {help egen}
with the sub-command {cmd:cut()}. UCLA's Statistical Consulting service
has a helpful tutorial on using {cmd:egen {it:newvar} = cut(...)}
{browse "https://stats.idre.ucla.edu/stata/faq/how-can-i-recode-continuous-variables-into-groups/":here}.
If you are using an old version of Stata into which Stata has not yet
added the {cmd:cut()} sub-command, you can install the add-on
{stata `"view net describe dm66_1, from("http://www.stata.com/stb/stb50")"':here}.
{it}Caveat emptor{sf} on such an old community contributed add-on.
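{p 4 4 2}
As a brief sketch, the following groups car prices from the auto data into three ranges.
The cutpoints and the variable name {cmd:pricecat} are invented for the illustration;
observations falling outside the range spanned by {cmd:at()} would be set to missing:
{p 6 8 2}
{cmd:sysuse auto, clear}
{p 6 8 2}
{cmd:egen pricecat = cut(price), at(0 5000 10000 20000) icodes label}
{p 6 8 2}
{cmd:tabulate pricecat}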
{p 4 4 2}
Sometimes one's data on multiple variables arrives in a long string.
To parse such a string, extracting usable individual string or numerical values,
Stata's {help string functions} are key. Several of the {help egen} functions
discussed {help mead_favorites##expressions:above} manipulate strings.
Also see the {help moss} command for finding multiple occurrences of substrings
and the FAQ on how to parse regular expressions
{browse "https://www.stata.com/support/faqs/data-management/regular-expressions/":here}.
For an example of this issue and proposals for its resolution, see
{browse "https://www.statalist.org/forums/forum/general-stata-discussion/general/1603351-calculation-for-string-variables":Statalist here}.{p_end}
{marker catdep}{...}
{title:Interpreting models with limited or categorical dependent variables}
{p 4 4 2}
The statistics text by Long and Freese (2014) presents a range of useful techniques for estimating and,
especially, interpreting the results of regression models for which the dependent
variable is either a dummy (i.e. binary) variable or an ordered or unordered
categorical (i.e. polytomous) variable. The Stata Journal article by Jann & Long (2010)
shows how the user can display {help spost13} results using Ben Jann's {help estout}
package.
{p 4 5 10}
References:
{p 4 5 10}
{browse "https://www.stata.com/bookstore/regression-models-categorical-dependent-variables/":J. Scott Long & Jeremy Freese. 2014. Regression Models for Categorical Dependent Variables Using Stata, 3rd Edition. College Station, TX: Stata Press}
{p 4 5 10}
{browse "http://www.stata-journal.com/article.html?article=st0183":B. Jann & J.S. Long. 2010. Tabulating SPost results using estout and esttab, Stata Journal, Volume 10 Number 1: pp. 46-60.}
{p 4 4 2}
As of the summer of 2017, the updated {cmd:spost13}
package can be installed
{stata "view net describe spost13_ado, from(http://www.indiana.edu/~jslsoc/stata)":here.}
{marker workflow}{...}
{title:Workflow organization}
{p 4 4 2}
Long's other very useful Stata book explains how to program and how to organize the "workflow" for a Stata project.
See {browse "http://www.stata.com/bookstore/workflow-data-analysis-stata/": Workflow book}, which is recommended
by Michael Clemens.
{title:Stochastic Frontier and Data envelopment analysis}
{p 4 4 2}
Stata has two Stata-supported SFA commands, {help frontier} and {help xtfrontier}.
In addition there is a new user-written command with extended capabilities called {help sfcross}.
For stochastic frontier models with endogenous right-hand-side variables, see {help sfkk}.
And the {help tenonradial} programs from Stata Journal Volume 16, Number 3 will
estimate non-parametric stochastic frontier models. There are two new user-written
routines for DEA. They can be found here: {help dea} or here: {help orderalpha}.
{title:Simulation tools}
{p 4 4 2}
Bill Gould presented a technique for constructing a multivariate normal dataset from scratch in order to match
a desired correlation matrix. His FAQ on this is here: {browse "http://www.stata.com/support/faqs/stat/mvnorm.html"}.
{p 4 4 2}
However, his approach is no longer necessary thanks to the new Stata command: {help drawnorm}. Do not confuse this
command with {help corr2data} which is not usually suitable for use in simulation studies.
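{p 4 4 2}
A minimal sketch of {cmd:drawnorm} follows. The seed, the sample size, the variable names and
the target correlation of 0.5 are all arbitrary choices made for this illustration:
{p 6 8 2}
{cmd:clear}
{p 6 8 2}
{cmd:set seed 12345}
{p 6 8 2}
{cmd:matrix C = (1, 0.5 \ 0.5, 1)}
{p 6 8 2}
{cmd:drawnorm x y, n(1000) corr(C)}
{p 6 8 2}
{cmd:correlate x y}
{p 4 4 2}
The sample correlation will be close to, but not exactly equal to, 0.5 because the data are a
random draw from the specified distribution. {cmd:corr2data}, by contrast, constructs data whose
sample correlations match the target exactly, which is why it is rarely appropriate for simulation.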
{marker spatial}{...}
{title:Spatial data analysis in Stata}
{p 4 4 2}
{hi:Descriptive analysis and cartography.}
Beginning with version 15 of Stata, spatial analysis is now a formal, supported
part of Stata. If you don't have Stata 15, you can preview Stata 15's spatial
capabilities {browse "https://www.stata.com/new-in-stata/spatial-autoregressive-models/":here}.
From inside Stata 15, see {help spatial} for a quick introduction and overview
with links to the new-in-version-15 Stata SP manual.
{p 4 4 2}
Pisati's {browse "http://www.stata.com/meeting/italy12/abstracts/materials/it12_pisati.pdf":2012 presentation}
gives a good introduction to using {help shp2dta}, which is now superseded
by Stata's own command {help spshape2dta}.
{p 4 4 2}
Some useful independent web sites are those by {browse "http://huebler.blogspot.ca/2012/08/stata-maps.html":Friedrich Huebler}
and an anonymous {browse "http://statadaily.ikonomiya.com/2011/03/20/fun-with-maps-in-stata/":Filipino woman}
who gives the code for producing a map of the Philippines.
{p 4 4 2}
To install an older suite of geospatial mapping and regression tools for Stata,
including those mentioned above, click on "install spatial" on this
{net "from http://digital.cgdev.org/doc/stata/MO/Misc":menu}.
{p 4 4 2}
{hi:Inferential analysis and geospatial econometrics.}
Geospatial statistical analysis requires a weight matrix, which in general
is of dimension N x N, where N is the number of individuals or spatially distinct
objects in the analysis. Stata's ability to construct, manipulate, store
and manage large weight matrices took a giant leap forward with the
publication in 2013 of {help spmat}, which has now been incorporated in
official Stata as {help spmatrix}.
{marker difndif}
{title:Causal Treatment effects}
{p 4 4 2}
Stata 16 has an entire {help teffects_intro:manual} of commands for the
estimation of the treatment effects of an experimental or quasi-experimental intervention.
Useful complementary programs contributed by the community include
various programs for automating difference-in-differences (or "dif-in-dif")
estimates of treatment effects. These include
Juan Villa's {net "describe diff, from(http://fmwww.bc.edu/RePEc/bocode/d)":diff},
Mora and Reggio's {net "describe didq, from(http://fmwww.bc.edu/RePEc/bocode/d)":didq},
de Chaisemartin et al.'s {net "describe fuzzydid, from(http://fmwww.bc.edu/RePEc/bocode/f)":fuzzydid},
Cerulli and Ventura's {net "describe tvdiff, from(http://fmwww.bc.edu/RePEc/bocode/t)":tvdiff},
Fernando Rios-Avila et al.'s {net "describe drdiff, from(http://fmwww.bc.edu/RePEc/bocode/d)":drdiff} and
{net "describe csdiff, from(http://fmwww.bc.edu/RePEc/bocode/c)":csdiff},
and {search difference-in-difference:others}. A related contribution by Niels Bruun which also visualizes a treatment effect is his
{net "describe emc, from(http://fmwww.bc.edu/RePEc/bocode/e)":emc} program
described in his SJ 19(3) article.
{p 4 4 2}
The World Bank's "DIME Project" maintains a set of Stata tools for impact evaluation
on GitHub which can be described and installed
{net "describe ietoolkit, from(http://fmwww.bc.edu/RePEc/bocode/i)":here}.
For an introduction to the DIME approach to causal analysis and a link to its GitHub repository go
{browse "https://dimewiki.worldbank.org/Stata_Coding_Practices#ietoolkit":here}.
For more links to discussions of impact evaluation methods, see the landing page for
{browse "https://blogs.worldbank.org/impactevaluations":DIME blogs}.
{p 4 4 2}
Methods and Stata programs for using synthetic controls and regression discontinuity are presented at the
2021 NBER Methods lecture by Alberto Abadie, Matias Cattaneo and Rocio Titiunik.
The lecture is posted
{browse "https://www.nber.org/lecture/summer-institute-2021-methods-lectures-causal-inference-using-synthetic-controls-and-regression":here}.
The updated version of the package -rdrobust- is available
{net "describe rdrobust, from(http://fmwww.bc.edu/RePEc/bocode/r)":here}.
The package was first distributed with the 2017 Stata Journal article:
{browse "https://www.stata-journal.com/article.html?article=st0366_1":here}.
To see a list of all the programs by Matias Cattaneo, Rocio Titiunik et al.
that are related to methods for causal inference, search with the following Stata command:{p_end}
{p 8 12 2}
{stata search Cattaneo, a}{p_end}
{p 4 4 2}
or the command:{p_end}
{p 8 12 2}
{stata search Titiunik, a}.{p_end}
{p 4 4 2}
Daniel Pailanir has ported to Stata/Mata from R the methods for synthetic Difference-in-Differences
introduced by Arkhangelsky et al. (2021). While his program is available on SSC
{stata `"view net describe sdid, from(`"http://fmwww.bc.edu/repec/bocode/s"')"':here},
he advises that the latest version be downloaded from his
{browse "https://github.com/Daniel-Pailanir/sdid":GitHub repository} with the
{net "describe github, from(https://haghish.github.io/github/)":github}
command like this:{p_end}
{p 8 12 2}
{stata github search sdid}{p_end}
{p 4 4 2}
Instead of using the Stata {help github} command, you can use Stata's
{help net} command to access and then install {help sdid} directly from GitHub like this:{p_end}
{p 8 12 2}
{stata `"view net from `"https://raw.github.com/Daniel-Pailanir/sdid/master"'"':net from `"https://raw.github.com/Daniel-Pailanir/sdid/master"'}{p_end}
{p 4 4 2}
Wrapper programs that facilitate the synthetic control approach to impact evaluation include:{p_end}
{p 8 12 2}
{stata `"net from `"https://raw.github.com/bquistorff/synth_runner/master"'"':synth_runner} from GitHub{p_end}
{p 8 12 2}
{stata `"view net describe st0619,from("http://www.stata-journal.com/software/sj20-4")"':npsynth} same version v16 as on SSC, but with more ancillary files.{p_end}
{p 4 4 2}
Citations on synthetic control include:{p_end}
{phang}
Abadie, A., Diamond, A., and J. Hainmueller. 2015.
{browse "https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1950298":Comparative Politics and the Synthetic Control Method},
American Journal of Political Science, Vol. 59, No. 2, pp. 495-510.{p_end}
{phang}
Abadie, A., Diamond, A., and J. Hainmueller. 2010.
{browse "https://web.stanford.edu/~jhain/Paper/JASA2010.pdf":Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of California's Tobacco Control Program}.
{it: Journal of the American Statistical Association} 105(490): 493-505.{p_end}
{phang}
Abadie, A. and Gardeazabal, J. 2003.
{browse "https://www.nber.org/papers/w8478":Economic Costs of Conflict:A Case Study of the Basque Country}.
American Economic Review 93(1): 113-132.{p_end}
{phang}
Arkhangelsky, Dmitry, Susan Athey, David A. Hirshberg, Guido W. Imbens, and Stefan Wager.
{browse "https://www.aeaweb.org/articles?id=10.1257/aer.20190159":Synthetic Difference in Differences},
American Economic Review, December 2021.{p_end}
{phang}
Galiani, S., and B. Quistorff. 2017.
{browse "https://www.stata-journal.com/article.html?article=st0500":The synth_runner package: Utilities to automate synthetic control estimation using synth},
Stata Journal 17(4): 834-849.{p_end}
{marker mediation}{...}
{title:Mediation analysis}
{p 4 4 2}
A useful set of tools for parcelling out an observed association among several
channels by which it might be mediated is provided by the user written program
{help medsens}. A 2013 presentation describing the technique can be found
{browse "https://www.stata.com/meeting/italy13/abstracts/materials/it13_grotta.pdf":here}.
{p 4 4 2}
But Michael Clemens suggests that the command {search b1x2} written by
Jonah Gelbach is a superior approach. Gelbach's exposition of this command is
{browse "http://www.journals.uchicago.edu/doi/abs/10.1086/683668":here}.
{marker decomposition}{...}
{title:Decomposition Analysis (Blinder-Oaxaca & extensions)}
{p 4 4 2}
An important tool for analyzing the policy implications of various right-hand-side
variables is called
{browse "https://en.wikipedia.org/wiki/Blinder%E2%80%93Oaxaca_decomposition":Blinder-Oaxaca decomposition analysis}.
The authoritative overview of decomposition techniques is Fortin, Lemieux and Firpo's Chapter 1 from Volume 4A of the Handbook of Labor Economics:
{browse "https://www.oreilly.com/library/view/handbook-of-labor/9780444534507/OEBPS/S0169721811004072.htm":[Link to the gated reference]}
{browse "https://www.nber.org/papers/w16045":[Link to the working paper version downloadable from NBER]}
{p 4 4 2}
The 2008 Stata Journal
{browse "https://www.stata-journal.com/sjpdf.html?articlenum=st0151":article}
on the Blinder-Oaxaca decomposition is written by Ben Jann,
one of Stata's most productive and cleverest community contributors.
(See the SJ editors' description of his contributions when they awarded him the
{browse "https://journals.sagepub.com/doi/10.1177/1536867X1801700401":Stata Journal Editors' Prize 2017}.)
The article documents Jann's Stata program {cmd:oaxaca}.
While the technique was originally developed for linear regression,
Yun (2004) extended it to non-linear models.
(Jann's article gives the reference.)
Following Yun (2004), Jann has included options to perform the decomposition
for logit or probit models. Instead of installing {cmd:oaxaca} from the
Stata Journal site, I suggest installing it from SSC,
because Jann has updated the program through 2011 on SSC.
The Stata command for installing from inside Stata is:
{stata ssc install oaxaca}.
Note that you can also retrieve the auxiliary files in order to replicate
the examples in the help file and in the SJ article.
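{p 4 4 2}
A minimal sketch of that workflow follows. It assumes an internet connection and, as an
unverified assumption, that the auxiliary files include a data set named oaxaca.dta containing
the variables lnwage, educ, exper, tenure and female; check with {help describe}
before relying on those names.
The {cmd:all} option of {help ssc:ssc install} copies the auxiliary files to the current working directory:{p_end}
{p 8 12 2}
{cmd:ssc install oaxaca, all replace}{p_end}
{p 8 12 2}
{cmd:use oaxaca, clear}{p_end}
{p 8 12 2}
{cmd:describe}{p_end}
{p 8 12 2}
{cmd:oaxaca lnwage educ exper tenure, by(female)}{p_end}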
{p 4 4 2}
Sinning, Hahn and Bauer (2008)
{browse "https://journals.sagepub.com/doi/pdf/10.1177/1536867X0800800402":provide}
an alternative Stata implementation of the Blinder-Oaxaca decomposition method,
extended to apply to any non-linear regression model, including the logit model.
The first author, Mathias Sinning, presented their {help nldecompose} Stata
program at a Stata Users Group meeting in 2007. Here is the PowerPoint presentation
version of their 2008 SJ
{browse "https://www.stata.com/meeting/5german/SINNING_stata_presentation.pdf":article}.
The installation command from inside Stata for the updated May 2009 version of {help nldecompose} is:
{stata view net sj 9-2 st0152_1}.
This command is also accompanied by auxiliary files.
{marker plausexog}{...}
{title:Plausible exogeneity}
{p 4 4 2}
New-in-2017, the program {help plausexog} tests the sensitivity of an estimated coefficient
to plausible bounds on the exogeneity of an instrumental variable.
A 2017 paper describing the technique can be found
{browse "https://ideas.repec.org/c/boc/bocode/s457832.html":here}. Earlier references
are by
{browse "http://mayoral.iae-csic.org/IV_2015/conley_etal.pdf":Conley, Hansen & Rossi (2012)}
and
{browse "http://documents.worldbank.org/curated/en/976041468315833446/pdf/wps4632.pdf":Kraay (2008)}.
{marker dataaccess}{...}
{title:Data Access Utilities}
{p 4 4 2}
To check the version of Stata that can read a given Stata {help dta} file,
use the Stata utility {help dtaversion}.
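{p 4 4 2}
For example, the following sketch saves a copy of the auto data under the hypothetical file name
myauto.dta and then reports that file's format version, which {cmd:dtaversion} also leaves behind
in r(version):{p_end}
{p 8 12 2}
{cmd:sysuse auto, clear}{p_end}
{p 8 12 2}
{cmd:save myauto, replace}{p_end}
{p 8 12 2}
{cmd:dtaversion myauto.dta}{p_end}
{p 8 12 2}
{cmd:display r(version)}{p_end}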
{p 4 4 2}
New-in-2018, the program {help sdmxuse} facilitates
downloading data from a variety of international organizations into a Stata formatted
data file. Institutions include not only the World Bank, but also the IMF,
the OECD, the United Nations, Eurostat and the European Central Bank.
The article can be accessed
{browse "https://www.stata-journal.com/article.html?article=dm0097&utm_source=MailingList&utm_medium=email&utm_content=20181217_sj_18_4_+fname":here}.
However, some users have reported difficulties using {help sdmxuse}.
Look for comments on Statalist.
{p 4 4 2}
The 2016 program {help wbopendata} is a more carefully and completely documented
utility which is expressly designed to download World Bank data. An abstract
describing the program is available on the IDEAS/RePEc site
{browse "https://ideas.repec.org/c/boc/bocode/s457234.html":here}.
An older program with the same objective is {search wdireshape}.
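{p 4 4 2}
As an illustration, the following sketch downloads one World Bank indicator in long format.
It assumes an internet connection; the indicator code NY.GDP.PCAP.CD (GDP per capita in current US dollars)
is chosen only as an example, and the options should be checked against {help wbopendata} after installation:{p_end}
{p 8 12 2}
{cmd:ssc install wbopendata}{p_end}
{p 8 12 2}
{cmd:wbopendata, indicator(NY.GDP.PCAP.CD) clear long}{p_end}
{p 8 12 2}
{cmd:describe}{p_end}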
{p 4 4 2}
The World Inequality Database provides data that can be converted to Stata
formatted {help dta} files. The website is {browse "https://wid.world/":here}.
{browse "https://www.wm.edu/as/economics/faculty-directory/parman_j.php":John Parman}
provides a useful tutorial on accessing this data
{browse "https://jmparman.people.wm.edu/stata-tutorials/historical-income-and-wealth-distributions.html":here}.
He advises installing a 2020 utility named {stata view net describe wid:wid}
written by {browse "https://thomasblanchet.fr/":Thomas Blanchet}.
Also see Blanchet's GitHub site on inequality {browse "https://github.com/WIDworld/wid-stata-tool":here}.
{marker makeado}{...}
{title:Programming an ADO file in Stata ({it:Under construction})}
{p 4 4 2}
A PowerPoint to introduce ADO-file programming can be found here:
{browse "http://fmwww.bc.edu/GStat/docs/StataProg.pdf": Introduction to Programming in Stata}.{p_end}
{p 4 4 2}
The following series of blog posts by Stata's David Drukker gives
a clear and complete presentation on how to make a user-friendly Stata ADO file:{p_end}
{p 8 4 2}
{browse "https://blog.stata.com/2016/01/15/programming-an-estimation-command-in-stata-a-map-to-posted-entries/"}{p_end}
{p 4 4 2}
Or search for all blog entries on programming here:{p_end}
{p 8 4 2}
{browse "https://blog.stata.com/category/programming/"}{p_end}
{p 4 4 2}
A useful utility for use with Stata's syntax command is the undocumented
{help opts_exclusive}. Documentation for several other "undocumented" commands
is available by typing {help undocumented:help undocumented}.
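{p 4 4 2}
A minimal sketch of how {cmd:opts_exclusive} might be used after {help syntax} follows.
The program name {cmd:mystat} and its two options are hypothetical, and the calling form shown
(a single quoted string holding the options that must not be combined) should be checked against
{help opts_exclusive}:{p_end}
{p 8 12 2}
{cmd:capture program drop mystat}{p_end}
{p 8 12 2}
{cmd:program define mystat}{p_end}
{p 12 16 2}
{cmd:syntax varname [, MEAN MEDian]}{p_end}
{p 12 16 2}
{cmd:opts_exclusive "`mean' `median'"}{p_end}
{p 12 16 2}
{cmd:quietly summarize `varlist', detail}{p_end}
{p 12 16 2}
{cmd:if "`median'" != "" display r(p50)}{p_end}
{p 12 16 2}
{cmd:else display r(mean)}{p_end}
{p 8 12 2}
{cmd:end}{p_end}
{p 4 4 2}
After {cmd:sysuse auto, clear}, typing {cmd:mystat price, median} displays the median of price,
while {cmd:mystat price, mean median} should exit with an error because the two options are specified together.{p_end}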
{p 4 4 2}
The benefits from writing one's own Stata ado files are greatly increased by using {help macros:macros}.
There are many resources for learning how to use Stata's macros. Among them is entry #3
in David Drukker's blog:{p_end}
{p 8 4 2}
{browse "https://blog.stata.com/2015/11/03/programming-an-estimation-command-in-stata-global-macros-versus-local-macros/"}{p_end}
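{p 4 4 2}
The distinction can be seen in a few lines. In this sketch the macro names n and G are arbitrary;
a local macro is referenced as `n' and a global macro as $G, and the final display shows their sum, 7:{p_end}
{p 8 12 2}
{cmd:local n 3}{p_end}
{p 8 12 2}
{cmd:global G 4}{p_end}
{p 8 12 2}
{cmd:display `n' + $G}{p_end}
{p 4 4 2}
Note that a local macro exists only within the do-file, program or interactive session that defines it,
so these lines should be run together.{p_end}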
{p 4 4 2}
Macros work together with lists, both of which are managed with the help of {help quotes:quotation marks}.
One important application of macros is their use in "loops" as described in Nick Cox's
{browse "https://journals.sagepub.com/doi/10.1177/1536867X20976340":"Speaking Stata: Loops, again and again", Stata Journal, 2020}.{p_end}
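{p 4 4 2}
As a small illustration of a macro used in a loop, the following sketch displays the mean of three variables
in the auto data. The choice of variables is arbitrary; inside the loop the local macro v holds the name of the current variable:{p_end}
{p 8 12 2}
{cmd:sysuse auto, clear}{p_end}
{p 8 12 2}
{cmd:foreach v of varlist price mpg weight {c -(}}{p_end}
{p 12 16 2}
{cmd:summarize `v', meanonly}{p_end}
{p 12 16 2}
{cmd:display "`v': mean = " %9.2f r(mean)}{p_end}
{p 8 12 2}
{cmd:{c )-}}{p_end}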
{p 4 4 2}
Often it is useful to assemble or manipulate a list of character strings.
These might be strings without spaces which follow Stata's
{mansection U 11.3Namingconventions:naming conventions}, such as the names of variables, matrices, scalars or macros.
Or they might be the names of e.g. countries, some of which have spaces.
A further complication arises if some of the character strings contain single, double or compound quotes.
Stata includes a set of {help macrolists:utilities for manipulating lists}.
Useful discussions on Statalist include the following two threads started by
{browse "https://www.statalist.org/forums/forum/general-stata-discussion/general/1384687-list-and-compound-double-quotes":Belinda Foster}
and by
{browse "https://www.statalist.org/forums/forum/general-stata-discussion/general/1575339-quotes-and-local-macros":Jasmine Davidson}.
The latter thread contains the comment by William Lisowski that:
"[The Stata command] -macro list {it:macname}- is the only reliable way of seeing exactly what is stored in a macro."
To refer to a local {it:macname} after -macro list-, prefix it by an underscore character.
Otherwise -macro list- assumes it is a global macro.
Post
{browse "https://www.statalist.org/forums/forum/general-stata-discussion/general/1575339-quotes-and-local-macros?p=1575408#post1575408":#5 by Bjarte Aagnes}
is helpful in demonstrating how to construct a list of strings, each of which is enclosed in quotes. As shown in that post,
the line in the following code that defines the local macro -newvlist- does not work,
because the double quotes are stripped from the first element of the macro -newvlist-.
But the line of code defining -newvlist2- does work, because of the compound double quotes.
The command -macro list _newvlist2-
shows that the first element of the list is -"price"- as it should be to work properly
in the function {help inlist()}.{p_end}
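sysuse auto, clear
* Build two comma-separated lists of quoted variable names
foreach v of varlist price length mpg turn make {
    * Simple double quotes: the quotes around the first element are stripped
    local newvlist `newvlist' `comma' "`v'"
    * Compound double quotes: the quotes around every element are kept
    local newvlist2 `"`newvlist2' `comma' "`v'""'
    local comma ,
}
macro list _newvlist
* Fails because the first element of newvlist is unquoted
cap noi di inlist("mpg",`newvlist')
macro list _newvlist2
* Works because every element of newvlist2 is a quoted string
di inlist("mpg",`newvlist2')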
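{p 4 4 2}
The list utilities mentioned above can be sketched with a simple list of names without spaces.
In this example the macro names a, b, c, n and p are arbitrary. After these lines, c contains x z,
n contains 3 and p contains 3, as -macro list- confirms:{p_end}
{p 8 12 2}
{cmd:local a x y z}{p_end}
{p 8 12 2}
{cmd:local b y}{p_end}
{p 8 12 2}
{cmd:local c : list a - b}{p_end}
{p 8 12 2}
{cmd:local n : list sizeof a}{p_end}
{p 8 12 2}
{cmd:local p : list posof "z" in a}{p_end}
{p 8 12 2}
{cmd:macro list _c _n _p}{p_end}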
{marker usersites}{...}
{title:Other repositories of user written programs}
{p 4 4 2}
For a clickable list of all user sites that Stata can locate with its webcrawler,
consult "SJ and other community-contributed commands" under
the pull-down Help dialogue at the top of the Stata Results window or
click {net "from http://www.stata.com/users/":here}.
{p 4 4 2}
To find all programs contributed by current or former CGD-affiliated Stata contributors,
from inside Stata type {stata search CGD}. Or
click {net "from http://digital.cgdev.org/doc/stata":here}.
{p 4 4 2}
Ben Jann is a major contributor of user-written programs, some of which are referenced
in this help file.
(See the SJ editors' description of his contributions when they awarded him the
{browse "https://journals.sagepub.com/doi/10.1177/1536867X1801700401":Stata Journal Editors' Prize 2017}.)
His site, http://repec.sowi.unibe.ch/stata/index.html, can be reached by clicking
{browse "http://repec.sowi.unibe.ch/stata/index.html":here}.
Stata's own help facility identifies many of Ben Jann's programs,
and gives links to the Stata Journal articles documenting many of those programs,
if you type:
{search jann, a:search jann, a}.
{p 4 4 2}
J. Scott Long's two books presenting his Stata programs are
{browse "https://www.stata.com/bookstore/workflow-data-analysis-stata/":"The Workflow of Data Analysis Using Stata"} and
{browse "https://www.stata-press.com/books/regression-models-categorical-dependent-variables/":"Regression Models for Categorical Dependent Variables Using Stata"}.
To download any of his programs from these two books, open Stata's viewer and type:
{net "from http://www.indiana.edu/~jslsoc/stata":net from "http://www.indiana.edu/~jslsoc/stata"}.
{p 4 4 2}
Roger Newson has moved to
{browse "https://www.statalist.org/forums/forum/general-stata-discussion/general/1617814-new-version-of-dolog-on-ssc":King's College London}.
His Stata repository is
{net "from http://www.rogernewsonresources.org.uk":here}.
{p 4 4 2}
{browse "https://github.com/friosavila":Fernando Rios-Avila}
has several GitHub repositories, including one devoted to extensions of Stata's visualization tools
{browse "https://github.com/friosavila/stataviz":here}.
He usually posts his programs on {help ssc:Boston College's SSC} site.
{p 4 4 2}
Researchers at the World Bank use Stata for their own research and also for
collaboration with researchers in client countries. The World Bank's
Poverty Analysis Toolkit and other useful Stata programs are described
{browse "http://web.worldbank.org/WBSITE/EXTERNAL/TOPICS/EXTPOVERTY/EXTPA/0,,contentMDK:20202162~menuPK:430402~pagePK:148956~piPK:216618~theSitePK:430367,00.html":here (Accessed 19Aug2019)}.
Available tools include
{browse "http://web.worldbank.org/archive/website01411/WEB/0__CONTE.HTM":ADePT}
which can be downloaded
{browse "http://web.worldbank.org/archive/website01411/WEB/0__CON-3.HTM":here}.
Stata impact evaluation resources from World Bank authors are on
{browse "https://worldbank.github.io/stata/":GitHub}. Also see the book
{browse "https://www.worldbank.org/en/programs/sief-trust-fund/publication/impact-evaluation-in-practice":"Impact Evaluation in Practice"}.
One of the World Bank contributors is Misha Lokshin.
Look for commands he has written or co-authored by clicking {search lokshin, a:Lokshin}.
For an introduction to the DIME approach to causal analysis and a link to its GitHub repository go
{browse "https://dimewiki.worldbank.org/Stata_Coding_Practices#ietoolkit":here}.
A former DIME contributor who has moved to Georgetown University is Ben Daniels.
{browse "https://github.com/bbdaniels/":Daniels' GitHub site}
is a repository for many useful Stata programs, some of which he has not yet posted to SSC.
The easiest way to install a subset of his packages is from his
{stata `"view net from "https://raw.githubusercontent.com/bbdaniels/stata/main""':Stata compatible download site}.
{p_end}
{p 4 4 2}
Thanks to E. F. Haghish, the many Stata packages on GitHub can now be more easily downloaded and installed.
See his journal article:{p_end}
{p 8 12 2}
Haghish, E. F. (2020). Developing, maintaining, and hosting Stata statistical software on GitHub.
{browse "https://journals.sagepub.com/doi/abs/10.1177/1536867X20976323":The Stata Journal}, 20(4), 931-951.
{p 4 4 2}
Also see his {browse "https://github.com/haghish":GitHub site} for details. Or install his program
{stata `"view net describe github, from(https://raw.githubusercontent.com/haghish/github/master)"':github here}.{p_end}
{p 4 4 2}
If you know the name of a Stata ADO file available on GitHub,
finding and installing it is simplified
by the command {cmd:github search}. For example, the program {search btable} by
{browse "https://github.com/aghaynes":Alan Haynes}
cannot be found with Stata's search command. But by typing:
{p 8 4 2}
{stata github search btable}
{p 4 4 2}
we see the following display.
{col 4}{hline 82}
{col 4} {bf:Repository}{col 19}{bf:Username}{col 31}{bf:Install}{col 40}{bf:Description}
{col 4}{hline 82}
{col 4} {bf:{browse "http://github.com/CTU-Bern/btable":btable}}{col 19}{browse "http://github.com/CTU-Bern":CTU-Bern}{col 31}{stata github search btable:Install}{col 40}Stata package for baseline tables
{col 31}{it:132k}{col 40}updated on 2022-10-10
{col 40}{bf:Fork:}2{col 50}{bf:Star:}0{col 60}{bf:Lang:}Stata{col 77}
{col 4}{hline 82}
{p 4 4 2}
Clicking on {browse "http://github.com/CTU-Bern":CTU-Bern} under {cmd:Username} takes one to the GitHub user site.
Clicking on {stata github search btable:Install} adds the program to one's own Stata installation.
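{p 4 4 2}
The same installation can be requested by command rather than by clicking.
This sketch assumes that the {cmd:github} package is already installed and takes the username and repository
name from the display above:{p_end}
{p 8 12 2}
{cmd:github install CTU-Bern/btable}{p_end}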
{p 4 4 2}
If a GitHub Stata site has a -stata.toc- file which lists Stata -pkg- files,
it can be accessed from inside Stata with a {help net:net from} command.
For example, Keith Kranker's archived adofile site on GitHub is most easily available
like this:{p_end}
{p 4 4 2}
{stata `"view net from "https://raw.githubusercontent.com/kkranker/kk-adofiles/master""'}
{p 4 4 2}
Brian Quistorff's two GitHub Stata repositories are accessible from Stata by the following two lines of code.
(The programs themselves are located in the alphabetical subdirectories.){p_end}
{p 4 4 2}
{stata `"view net from "https://raw.githubusercontent.com/bquistorff/Stata-modules/master""'}{p_end}
{p 4 4 2}
{stata `"view net from "https://raw.githubusercontent.com/bquistorff/stata-reproducible/master""'}{p_end}
{p 4 4 2}
Thomas Grund's suite of commands for network analysis is available here:{p_end}
{p 4 4 2}
{stata `"view net from "https://raw.githubusercontent.com/ThomasGrund/nwcommands/master""'}{p_end}
{marker authors}{...}
{title:Author}
{phang}Contact {browse "http://www.cgdev.org/expert/mead-over/":Mead Over} at:
{browse "mailto:mover@cgdev.org":MOver@CGDev.Org} with problems or suggestions.{p_end}
{* ver. 2.1 27Jan2015 Add reference to grc1leg }{...}
{* ver. 2.2 4Feb2015 Enhance references to introductory material, remove references to N: }{...}
{* ver. 2.3 27Feb2015 Update sections on power calculations and contour plotting.}{...}
{* ver. 2.4 20Apr2015 Update section on making tables.}{...}
{* ver. 2.5 19May2015 Add superscatter program, jump to capability.}{...}
{* ver. 2.6 19Jan2016 Additional table commands, edits thanks to Bill Lisowski}{...}
{* ver. 2.7 11Feb2016 Properly credit Luke Gallup and outreg}{...}
{* ver. 2.8 20Aug2017 Fix references to "http://digital.cgdev.org" }{...}
{* ver. 2.9 6Sep2017 Add links to spost13}{...}
{* ver. 2.10 27Sep2017 Add references to Stata 15 and to mediation and plausexog.}{...}
{* ver. 2.11 23Nov2017 Add references to putdocx and putpdf.}{...}
{* ver. 2.12 21Feb2018 Add reference to Winter's -fastcd-.}{...}
{* ver. 2.13 3Mar2018 Add discussion of schemes and pointer to scheme_s2clr_on_white.scheme.}{...}
{* ver. 2.14 9Jan2019 Add Data Access Utilities.}{...}
{* ver. 2.15 18Feb2019 Add links to using special characters in graphs, help graph_text.}{...}
{* ver. 2.16 24Feb2019 Fix link to PovCal. Add link to Ben Jann's "Overview" web page}{...}
{* ver. 2.17 25Feb2019 Add link to Nick Cox's 2009 rank comparison SJ article}{...}
{* ver. 2.18 21Jun2019 Add links to labutil, labutil2}{...}
{* ver. 2.20 30Jul2019 Add links to oaxaca, nldecompose}{...}
{* ver. 2.21 19Aug2019 Edit -user sites-}{...}
{* ver. 2.22 16Jan2020 Delete reference to -tojpg- which is now part of -gr export-} {...}
{* ver. 2.23 3Mar2020 Add section on matrix utilities, including -mlu-}{...}
{* ver. 2.24 12Nov2020 Add to section on databases}{...}
{* ver. 2.25 8Feb2021 Revise and set a marker for "Comparing data sets and variables"} {...}
{* ver. 2.26 8Mar2021 Add the replicability section at the beginning and incorporate the -repshow- and -genl- commands.}{...}
{* ver. 2.27 14Mar2021 Add references to polar graphs and to Asjad Naqvi's tutorials}{...}
{* ver. 2.28 15Mar2021 Added link to egen newvar = cut() & Chris Baum's 2015 PPT on ado programming}{...}
{* Have deleted reference to David Roodman's -collapse2- which is now part of Stata}{...}
{* ver. 2.29 19Mar2021 Added discussion of -egen- capabilities and of -egenmore- under transformations and expressions}{...}
{* ver. 2.30 29Mar2021 Added discussion of -tabplot-}{...}
{* ver. 2.31 7Apr2021 Added reference to "Matrices as look-up tables" and to _gwmean()}{...}
{* ver. 2.32 16Apr2021 paragraph on string functions, especially -moss- and -do2screen-}{...}
{* ver. 2.33 28Apr2021 Add mentions of -wdireshape- and -egen_inequal-}{...}
{* ver. 2.34 13May2021 Add mention of labmask }{...}
{* ver. 2.35 25May2021 Add mention of labgen2 as update to genl. Remove all reference to -retainlbl-}{...}
{* ver. 2.36 27Jun2021 Add link to FAQ on factor variable support}{...}
{* ver. 2.37 4Jul2021 Improve discussion of table-making commands and add -asdoc-, -asdocx-.}{...}
{* ver. 2.38 7Jul2021 Add reference to Roger Newson's Stata site.}{...}
{* ver. 2.39 19Jul2021 Add Ben Jann's suggested strategy for landscape output in MS Word.}
{* ver. 2.40 6Sep2021 Add Jann's -coefplot- and update discussion of Roystan's -marginscontplot2-.}
{* ver. 2.41 1Oct2021 Add section on ADO programing at {marker makeado} UNDER CONSTRUCTION}
{* ver. 2.42 8Dec2021 Add reference to Richard Williams note on using Royston's (2013) -mcp2-}
{* ver. 2.43 30Dec2021 Add reference to Stata's new -etable- command, superseding -est table- and -esttab-}
{* ver. 2.44 29Apr2022 Add to the resources for "Causal Treatment Analysis"}
{* ver. 2.45 30Apr2022 Add GitHub utilities and references and links to -sdid-}
{* ver. 2.46 11Jul2022 Add resources on macros, lists and adornment.}
{* ver. 2.47 7Aug2022 Add resources on using margins with transformed variables.}
{* ver. 2.48 19Oct2022 Revise section on -table-making- to include -tabout-}
{* TO DO: Add section on -sem-, -gsem-, MIMIC and -cmp-.}
{* ver. 2.49 25Oct2022 Include mention of -egenodd-}
{* ver. 2.50 1Nov2022 Include mention of -gph2xl-, -gph2csv-, Daniels' GitHub site}
{* ver. 2.51 4Nov2022 Fuller description of accessing GitHub programs using -github-}
{* ver. 2.52 9Dec2022 Add references to Cox's -nicelabels- and -mylabels-}
{* ver. 2.53 22Mar2023 Add RiosAvila's DID commands -drdid- and -csdid- to Causal section}
{* ver. 2.54 6May2023 Add Cox's -adjacent-, -egen ... adjl()-, -egen ... adju()-}