Germán Rodríguez
Generalized Linear Models Princeton University

2.7 Two-Way Analysis of Variance

We start by reading the data and recreating setting in groups. This time I’ll make the copying, recoding and labeling in just one command.

. use https://grodri.github.io/datasets/effort, clear
(Family Planning Effort Data)

. recode setting (min/69=1 "Low") (70/79=2 "Medium") (80/max=3 "High"), ///
>   gen(setting_g) label(setting_g)
(20 differences between setting and setting_g)

Let us now create a copy of program effort and group it into categories 0-4, 5-14 and 15+. We’ll call the variable effort_g for effort in groups.

. recode effort (0/4=1 "Weak") (5/14=2 "Moderate") ///
>   (15/max=3 "Strong"), gen(effort_g) label(effort_g)
(20 differences between effort and effort_g)

Here’s a table showing steeper declines in countries with strong programs, with a smaller difference between weak and moderate:

. tabulate effort_g, summarize(change)

  RECODE of │
     effort │ Summary of % Change in CBR between
   (Program │            1965 and 1975
    Effort) │        Mean   Std. dev.       Freq.
────────────┼────────────────────────────────────
       Weak │           5           4           7
   Moderate │   9.3333333    7.393691           6
     Strong │   27.857143   6.3358391           7
────────────┼────────────────────────────────────
      Total │        14.3   11.810343          20

An Additive Model

Let us fit a model treating both setting and effort as factor variables, with weak programs in low settings as the reference cell

. reg change i.setting_g i.effort_g

      Source │       SS           df       MS      Number of obs   =        20
─────────────┼──────────────────────────────────   F(4, 15)        =     13.55
       Model │  2075.80829         4  518.952073   Prob > F        =    0.0001
    Residual │   574.39171        15  38.2927807   R-squared       =    0.7833
─────────────┼──────────────────────────────────   Adj R-squared   =    0.7255
       Total │      2650.2        19  139.484211   Root MSE        =    6.1881

─────────────┬────────────────────────────────────────────────────────────────
      change │ Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
─────────────┼────────────────────────────────────────────────────────────────
   setting_g │
     Medium  │  -1.680829   3.854967    -0.44   0.669    -9.897497    6.535839
       High  │   2.387565    4.45653     0.54   0.600    -7.111304    11.88643
             │
    effort_g │
   Moderate  │   3.836269   3.574561     1.07   0.300    -3.782727    11.45527
     Strong  │    20.6715   4.339232     4.76   0.000     11.42265    29.92036
             │
       _cons │   5.379275    3.10526     1.73   0.104     -1.23943    11.99798
─────────────┴────────────────────────────────────────────────────────────────

Compare these estimates with the results in Table 2.15 and the anova with Table 2.16 in the lecture notes.

Countries with strong family planning programs show steeper declines than countries with weak programs at the same level of social setting, on average 21 percentage points more.

This statement is based on the assumption of additivity, namely that the difference in outcomes across categories of program effort is the same at every level of setting. We will test this assumption below.

As you can see the differences in fertility change by level of effort, among countries with the same level of setting are significant, with an F-ratio of 11.5 on 2 and 15 d.f.

Fitted Values

Let us reproduce Table 2.17 in the notes, showing fitted means by grouped setting and effort. I will use predict to generate fitted values, and then tabulate them by the two relevant factors

. predict anovafit
(option xb assumed; fitted values)

. label var anovafit "Two-way anova fit"

. tabulate setting_g effort_g , summarize(anovafit) means

                        Means of Two-way anova fit

 RECODE of │
   setting │   RECODE of effort (Program
   (Social │            Effort)
  Setting) │      Weak   Moderate     Strong │     Total
───────────┼─────────────────────────────────┼──────────
       Low │ 5.3792748  9.2155437          . │ 7.5714285
    Medium │ 3.6984456  7.5347152  24.369947 │ 8.5999999
      High │ 7.7668395  11.603108  28.438341 │ 23.749999
───────────┼─────────────────────────────────┼──────────
     Total │ 5.0000001  9.3333331  27.857142 │      14.3

Can you get the missing cell in the upper right corner? How about the margins, can you get those values?

A Two-Factor Interaction

Let us now consider a model with an interaction between social setting and program effort, so differences in fertility decline by effort will vary by setting.

Stata understands a single or double hash, typed without spaces after a factor specification, to define models with interactions.

Let us fit the model using the double hash notation:

. regress change i.setting_g##i.effort_g
note: 1b.setting_g#3.effort_g identifies no observations in the sample.
note: 3.setting_g#3.effort_g omitted because of collinearity.

      Source │       SS           df       MS      Number of obs   =        20
─────────────┼──────────────────────────────────   F(7, 12)        =      8.15
       Model │     2189.45         7  312.778571   Prob > F        =    0.0009
    Residual │      460.75        12  38.3958333   R-squared       =    0.8261
─────────────┼──────────────────────────────────   Adj R-squared   =    0.7247
       Total │      2650.2        19  139.484211   Root MSE        =    6.1964

───────────────────┬────────────────────────────────────────────────────────────────
            change │ Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
───────────────────┼────────────────────────────────────────────────────────────────
         setting_g │
           Medium  │   3.333333    5.05937     0.66   0.522    -7.690086    14.35675
             High  │   6.333333   7.155029     0.89   0.393    -9.256136     21.9228
                   │
          effort_g │
         Moderate  │   8.583333   4.732607     1.81   0.095    -1.728132     18.8948
           Strong  │   19.33333   6.692917     2.89   0.014      4.75072    33.91595
                   │
setting_g#effort_g │
       Low#Strong  │          0  (empty)
  Medium#Moderate  │  -14.58333   8.578579    -1.70   0.115    -33.27445    4.107784
    Medium#Strong  │  -.3333333   9.797427    -0.03   0.973    -21.68009    21.01343
    High#Moderate  │  -6.583333   9.959379    -0.66   0.521    -28.28296    15.11629
      High#Strong  │          0  (omitted)
                   │
             _cons │   2.666667   3.577515     0.75   0.470    -5.128068     10.4614
───────────────────┴────────────────────────────────────────────────────────────────

Oops, our software dropped a term. Why? Because there are no countries with strong programs in low settings, so we have only eight groups, but are trying to represent their means using nine parameters, which is obviously one too many. Fortunately this doesn’t affect testing.

We can use the sum of squares to build the hierarchical anova in Table 2.21, and Stata can test the interaction for us, dropping automatically the redundant term:

. testparm i.setting_g#i.effort_g

 ( 1)  2.setting_g#2.effort_g = 0
 ( 2)  2.setting_g#3.effort_g = 0
 ( 3)  3.setting_g#2.effort_g = 0

       F(  3,    12) =    0.99
            Prob > F =    0.4318

We have no evidence that the differences by effort vary with social setting, with an F just below 1 on 3 and 12 d.f.

This makes the issue of interpreting parameters moot, but it may be worth noting briefly the problems caused by the empty cell. As things stand, the coefficient of moderate effort compares moderate with weak at low setting, but the coefficient of strong effort compares strong with weak at high setting. (Table 2.20 in the notes may help see this point. When the term for high and strong is dropped, the only difference between weak and strong programs at high setting is the coefficient of strong.)

The parametrization I like best for this problem combines the main effects of effort with the interactions, so we obtain differences between strong and weak, and between moderate and weak programs, at each level of setting. This allows us to omit the difference between strong and weak programs at low setting, which is the one we can’t identify. Try the specification below

reg change i.setting_g i.effort_g#i.setting_g

Dummy Variables

You could have obtained the same results in this unit using dummy variables. For the record, this is how you might build the dummies and fit the models. For the additive model we need just four indicators, two for each factor.

gen setting_med   = setting_g==2 // setting >= 70 & setting < 80
gen setting_high  = setting_g==3 // setting >= 80 & !missing(setting)
gen effort_mod    = effort_g ==2 // effort >= 5 & effort < 15
gen effort_strong = effort_g ==3 // effort >= 15 & !missing(effort)
regress change setting_med setting_high effort_mod effort_str

We need a total of four dummies to represent the interactions, which can be computed simply as the product of the indicators for the main effects. Here it is hard to come up with reasonable names in 12 characters or less.

gen se_med_mod = setting_med  * effort_mod
gen se_med_str = setting_med  * effort_str
gen se_hi_mod  = setting_high * effort_mod
gen se_hi_str  = setting_high * effort_str
regress change setting_med setting_high effort_mod effort_str ///
    se_med_mod se_med_str se_hi_mod se_hi_str

Our software will again omit a variable, but you have more control on what to drop. Can you figure out which dummies you would need to show the effects of effort at each level of setting?

Updated fall 2022