Stata makes the calculation of robust standard errors easy via the vce(robust) option. Replicating the results in R is not exactly trivial, but Stack Exchange provides a solution; see replicating Stata’s robust option in R. So here’s our final model for the program effort data, using the vce(robust) option in Stata:
. use https://grodri.github.io/datasets/effort, clear
(Family Planning Effort Data)

. recode effort (min/4=1 "Weak") (5/14=2 "Moderate") (15/max=3 "Strong"), gen(effortg)
(20 differences between effort and effortg)

. regress change setting i.effortg, vce(robust)

Linear regression                               Number of obs     =         20
                                                F(3, 16)          =      28.29
                                                Prob > F          =     0.0000
                                                R-squared         =     0.8016
                                                Root MSE          =      5.732

─────────────┬────────────────────────────────────────────────────────────────
             │               Robust
      change │ Coefficient  std. err.      t    P>|t|     [95% conf. interval]
─────────────┼────────────────────────────────────────────────────────────────
     setting │   .1692677   .0454318     3.73   0.002     .0729565    .2655788
             │
     effortg │
   Moderate  │   4.143915    3.19122     1.30   0.213     -2.62117      10.909
     Strong  │   19.44761   2.567472     7.57   0.000     14.00481    24.89041
             │
       _cons │  -5.954036   2.707697    -2.20   0.043     -11.6941   -.2139743
─────────────┴────────────────────────────────────────────────────────────────
Here’s how to get the same result in R. Basically you need the sandwich package, which computes robust covariance matrix estimators, and some way to use that covariance matrix when testing the coefficients of a linear model, which is what the lmtest package provides via coeftest(). You will not get the same results as Stata, however, unless you use the HC1 estimator; the default is HC3, for reasons explained in ?vcovHC.
> library(haven)
> fpe <- read_dta("https://grodri.github.io/datasets/effort.dta")
> fpe$effortg = cut(fpe$effort, breaks=c(min(fpe$effort),5,15,max(fpe$effort)),
+   right=FALSE, include.lowest=TRUE, labels=c("Weak","Moderate","Strong"))
> m <- lm(change ~ setting + effortg, data = fpe)
> library(lmtest)
> library(sandwich)
> coeftest(m, vcov = vcovHC(m, type="HC1"))

t test of coefficients:

                  Estimate Std. Error t value  Pr(>|t|)    
(Intercept)     -5.954036   2.707697 -2.1989   0.04294 *  
setting          0.169268   0.045432  3.7258   0.00184 ** 
effortgModerate  4.143915   3.191220  1.2985   0.21251    
effortgStrong   19.447609   2.567472  7.5746 1.118e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
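To make clear what the HC1 option does, here is a minimal sketch that computes the sandwich estimator by hand, assuming the model m fitted above. HC1 is the basic HC0 sandwich, (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}, scaled by the small-sample factor n/(n - k), which is essentially the degrees-of-freedom correction Stata applies with vce(robust). The names X, e, and V_hc1 below are just illustrative.

X <- model.matrix(m)          # design matrix: constant, setting, effort dummies
e <- residuals(m)             # OLS residuals
n <- nrow(X); k <- ncol(X)
bread <- solve(crossprod(X))  # (X'X)^{-1}
meat  <- crossprod(X * e)     # X' diag(e^2) X
V_hc1 <- (n / (n - k)) * bread %*% meat %*% bread
sqrt(diag(V_hc1))             # should reproduce the robust standard errors above

The square roots of the diagonal should agree with the standard errors reported by coeftest() with vcovHC(m, type = "HC1").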
The main point is that the results are exactly the same. Interestingly, some of the robust standard errors are smaller than the conventional model-based ones, and the effect of setting, which did not reach significance with model-based standard errors, is now significant.
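To see the comparison for yourself, here is a small check, reusing the model m and the packages loaded above: put the two sets of standard errors side by side and compute robust confidence intervals. The names se.model and se.robust are just illustrative; coefci() from lmtest uses the residual degrees of freedom by default, so the intervals should match Stata’s.

se.model  <- sqrt(diag(vcov(m)))                    # conventional OLS standard errors
se.robust <- sqrt(diag(vcovHC(m, type = "HC1")))    # HC1 robust standard errors
round(cbind(model = se.model, robust = se.robust), 4)
coefci(m, vcov = vcovHC(m, type = "HC1"))           # robust 95% confidence intervals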