Germán Rodríguez
Generalized Linear Models Princeton University

3.6 Multi-factor Models: Model Selection

We now move to an analysis using all three predictors: age, desire for no more children, and education, which is grouped into two categories: lower primary or less, and upper primary or more. The data can be read from the datasets section. We will use the wider format, that has 16 rows; one for each combination of the three predictors, with columns for users and nonusers. (The last section shows how one can obtain the same results using the longer format, that has 32 rows: one for each combination of the three predictors and response, with a column for the frequency of that combination.)

. use https://grodri.github.io/datasets/cusew, clear    
(Contraceptive Use Data (Fiji, 1976))

. gen n = users + nonusers

. list, sep(4)

     ┌────────────────────────────────────────────────────────────────────────┐
     │   age                    educ          nomore   nonusers   users     n │
     ├────────────────────────────────────────────────────────────────────────┤
  1. │   <25   Lower primary or less      Wants more         53       6    59 │
  2. │   <25   Lower primary or less   Wants no more         10       4    14 │
  3. │   <25   Upper primary or more      Wants more        212      52   264 │
  4. │   <25   Upper primary or more   Wants no more         50      10    60 │
     ├────────────────────────────────────────────────────────────────────────┤
  5. │ 25-29   Lower primary or less      Wants more         60      14    74 │
  6. │ 25-29   Lower primary or less   Wants no more         19      10    29 │
  7. │ 25-29   Upper primary or more      Wants more        155      54   209 │
  8. │ 25-29   Upper primary or more   Wants no more         65      27    92 │
     ├────────────────────────────────────────────────────────────────────────┤
  9. │ 30-39   Lower primary or less      Wants more        112      33   145 │
 10. │ 30-39   Lower primary or less   Wants no more         77      80   157 │
 11. │ 30-39   Upper primary or more      Wants more        118      46   164 │
 12. │ 30-39   Upper primary or more   Wants no more         68      78   146 │
     ├────────────────────────────────────────────────────────────────────────┤
 13. │ 40-49   Lower primary or less      Wants more         35       6    41 │
 14. │ 40-49   Lower primary or less   Wants no more         46      48    94 │
 15. │ 40-49   Upper primary or more      Wants more          8       8    16 │
 16. │ 40-49   Upper primary or more   Wants no more         12      31    43 │
     └────────────────────────────────────────────────────────────────────────┘

We start by considering models that treat age as a factor with four categories, fertility preferences using an indicator for wanting no more, and educational level using an indicator for upper primary or more. Because we have only three variables we are able to fit all possible models, which provides a nice check on the usual model building strategies using forward selection and backward elimination.

The Deviance Table

Let us reproduce Table 3.13, which compares all possible one, two and three-factor models.

The first step will be to fit the model with the three-factor interaction, which is saturated for the 2x4x2x2 table of contraceptive use by age, education, and desire for more children. We will store the log-likelihood and d.f. in two scalars, and save the fitted values for later plotting.

. quietly glm users i.age##c.nomore##c.educ, family(binomial n)

. scalar slogL = e(ll)

. scalar sdf = e(df_m)

. predict obs3, xb   // 3-way model

::: Next we are going to fit 16 different models. Given the repetitive nature of the calculations it pays to plan in advance. I will create three variables to store the name, deviance and d.f. of each model, using a string of up to 36 characters for the model name. :::

. set obs 17
Number of observations (_N) was 16, now 17.

. gen str36 model = ""
(17 missing values generated)

. gen deviance = .
(17 missing values generated)

. gen df = .
(17 missing values generated)

We then write a simple command that takes as arguments the name and specification of the model, fits it, and stores the name, deviance and d.f. in the three variables just defined, using a global macro n to keep track of the number of the row where the results will be stored.

. capture program drop mfit

. program mfit
  1. version 11
  2. args model formula
  3. quietly glm users `formula', family(binomial n)
  4. global n = $n + 1
  5. quietly replace model = "`model'" in $n
  6. quietly replace deviance = e(deviance) in $n
  7. quietly replace df = sdf - e(df_m) in $n
  8. end

Finally I initialize the row number to 0 and fit the models, taking care to enclose the model name and formula in quotes to ensure that they are treated as just two arguments rather than split into words

. set obs 17
Number of observations (_N) was 17, now 17.

. global n = 0

. // one-factor models
. mfit Age i.age

. mfit Education educ

. mfit "NoMore" nomore

. // two-factor additive models
. mfit "Age + Education" "i.age educ"

. mfit "Age + NoMore"  "i.age nomore"

. mfit "Education + NoMore" "educ nomore"

. // two-factor interactions
. mfit "Age * Education" "i.age##c.educ"

. mfit "Age * NoMore" "i.age##c.nomore"

. mfit "Education * NoMore" "c.educ##c.nomore"

. // three-factor additive model
. mfit "Age + Education + NoMore" "i.age c.educ c.nomore"

. // one interaction
. mfit "Age * Education + NoMore" "i.age##c.educ nomore"

. mfit "Age * NoMore + Education" "i.age##c.nomore educ"

. mfit "Age + Education * NoMore" "i.age c.nomore##c.educ"

. // two interactions
. mfit "Age * (Education + NoMore)" "i.age##c.educ i.age##c.nomore"

. mfit "Education * (Age + NoMore)" "i.age##c.educ c.educ##c.nomore"

. mfit "NoMore * (Age + Education)" "i.age##c.nomore c.educ##c.nomore"

. // three interactions
. mfit "Age*Educ+Age*NoMore+Educ*NoMore" ///
>   "i.age##c.nomore c.educ i.age#c.educ c.educ#c.nomore"

Done, let’s print the results, using only two decimal places for the deviances

. format deviance %6.2f

. list model deviance df if !missing(deviance), clean

                                 model   deviance   df  
  1.                               Age      86.58   12  
  2.                         Education     165.07   14  
  3.                            NoMore      74.10   14  
  4.                   Age + Education      80.42   11  
  5.                      Age + NoMore      36.89   11  
  6.                Education + NoMore      73.87   13  
  7.                   Age * Education      73.03    8  
  8.                      Age * NoMore      20.10    8  
  9.                Education * NoMore      67.64   12  
 10.          Age + Education + NoMore      29.92   10  
 11.          Age * Education + NoMore      23.15    7  
 12.          Age * NoMore + Education      12.63    7  
 13.          Age + Education * NoMore      23.02    9  
 14.        Age * (Education + NoMore)       5.80    4  
 15.        Education * (Age + NoMore)      13.76    6  
 16.        NoMore * (Age + Education)      10.82    6  
 17.   Age*Educ+Age*NoMore+Educ*NoMore       2.44    3  

Please refer to the notes for various tests based on these models. You should be able to test for net effects of each factor given the other two, test each of the interactions, and test the goodness of fit of each model. We now examine three models of interest.

The Three-factor Additive Model

We fit again the three-factor additive model, so we can show the parameter estimates reflecting the net effect of each factor. The gross effects of age and desire or no more children have been shown in earlier sections.

. glm users i.age educ nomore, family(binomial n)

Iteration 0:   log likelihood = -50.793156  
Iteration 1:   log likelihood = -50.712573  
Iteration 2:   log likelihood = -50.712565  
Iteration 3:   log likelihood = -50.712565  

Generalized linear models                         Number of obs   =         16
Optimization     : ML                             Residual df     =         10
                                                  Scale parameter =          1
Deviance         =  29.91722173                   (1/df) Deviance =   2.991722
Pearson          =  28.28833641                   (1/df) Pearson  =   2.828834

Variance function: V(u) = u*(1-u/n)               [Binomial]
Link function    : g(u) = ln(u/(n-u))             [Logit]

                                                  AIC             =   7.089071
Log likelihood   =  -50.7125647                   BIC             =   2.191335

─────────────┬────────────────────────────────────────────────────────────────
             │                 OIM
       users │ Coefficient  std. err.      z    P>|z|     [95% conf. interval]
─────────────┼────────────────────────────────────────────────────────────────
         age │
      25-29  │   .3893816   .1758501     2.21   0.027     .0447219    .7340414
      30-39  │   .9086135   .1646211     5.52   0.000     .5859621    1.231265
      40-49  │   1.189239     .21443     5.55   0.000     .7689639    1.609514
             │
        educ │   .3249947   .1240355     2.62   0.009     .0818894    .5680999
      nomore │   .8329548   .1174705     7.09   0.000     .6027169    1.063193
       _cons │  -1.966169   .1720307   -11.43   0.000    -2.303343   -1.628995
─────────────┴────────────────────────────────────────────────────────────────

. di exp(_b[educ])
1.3840232

Contraceptive use differs by each of these factors, even when we compare women who fall in the same categories of the other two. For example the odds of using contraception are 38% higher among women with upper primary or more, compared to women with lower primary or less, in the same age group and category of desire for more children.

The deviance of 29.92 on 10 d.f. tells us that this model does not fit the data, so the assumption that logit differences by one variable are the same across categories of the other two is suspect.

The Model with One Interaction Effect

Of the three models with one interaction term, the one that achieves the largest improvement in fit compared to the additive model is the model with an age by no more interaction, where the difference in logits between women who want no more children and those who do varies by age.

The standard reference-cell parametrization can easily be obtained using factor variables:

. glm users educ i.age##c.nomore, family(binomial n)

Iteration 0:   log likelihood = -42.121099  
Iteration 1:   log likelihood = -42.068742  
Iteration 2:   log likelihood =  -42.06873  
Iteration 3:   log likelihood =  -42.06873  

Generalized linear models                         Number of obs   =         16
Optimization     : ML                             Residual df     =          7
                                                  Scale parameter =          1
Deviance         =  12.62955292                   (1/df) Deviance =   1.804222
Pearson          =  13.06280922                   (1/df) Pearson  =   1.866116

Variance function: V(u) = u*(1-u/n)               [Binomial]
Link function    : g(u) = ln(u/(n-u))             [Logit]

                                                  AIC             =   6.383591
Log likelihood   = -42.06873029                   BIC             =  -6.778568

─────────────┬────────────────────────────────────────────────────────────────
             │                 OIM
       users │ Coefficient  std. err.      z    P>|z|     [95% conf. interval]
─────────────┼────────────────────────────────────────────────────────────────
        educ │   .3406479   .1257653     2.71   0.007     .0941525    .5871432
             │
         age │
      25-29  │   .3946039   .2014504     1.96   0.050    -.0002315    .7894394
      30-39  │   .5466635   .1984206     2.76   0.006     .1577663    .9355607
      40-49  │   .5795235   .3474172     1.67   0.095    -.1014017    1.260449
             │
      nomore │   .0662197   .3307064     0.20   0.841    -.5819529    .7143922
             │
age#c.nomore │
      25-29  │     .25918   .4097504     0.63   0.527    -.5439161    1.062276
      30-39  │   1.112662   .3740433     2.97   0.003     .3795507    1.845773
      40-49  │   1.361674   .4843256     2.81   0.005     .4124134    2.310935
             │
       _cons │  -1.803172   .1801786   -10.01   0.000    -2.156315   -1.450028
─────────────┴────────────────────────────────────────────────────────────────

. di exp(_b[nomore]), exp(_b[4.age#c.nomore]), ///
>   exp(_b[nomore] + _b[4.age#c.nomore]) 
1.0684614 3.902721 4.1699068

Make sure you know how to interpret all of these coefficients. For example the ratio of the odds of using contraception among women who want no more children relative to those who want more in the same category of education is 1.07 among women under age 25, but 3.9 times more (giving an odds ratio of 4.1) among women in their forties.

To aid in interpretation and model criticism we can plot the observed and fitted logits, effectively reproducing Figure 3.4. Because we will need more than one plot, I will encapsulate the calculations in a command pof, for plot observed and fitted.

. capture program drop pof

. program pof
  1. args obs fit more
  2. // build labels for the curves using fit at last age
. local labels `""nomore/upper" "nomore/lower" "more/upper" "more/lower""'
  3. local text 
  4. forvalues i=1/16 {
  5.     if age[`i'] != 4 continue
  6.     local k =  4 - 2*nomore[`i'] - educ[`i'] 
  7.     local label: word `k' of `labels'
  8.     local value = round(`fit'[`i'], 0.01)
  9.     local text `text' `value' 45.5 "`label'"
 10. }
 11. // plot observed and fitted logits
. twoway ///
>   (scatter `obs' agem if educ==1 & nomore==1, ms(S) mc(red)) ///
>   (scatter `obs' agem if educ==0 & nomore==1, ms(T) mc(red)) ///
>   (scatter `obs' agem if educ==1 & nomore==0, ms(O) mc(green)) ///
>   (scatter `obs' agem if educ==0 & nomore==0, ms(D) mc(green) ) ///      
>   (line `fit' agem if educ==1 & nomore==1, lp(solid) lc(red))  ///
>   (line `fit' agem if educ==0 & nomore==1, lp(dash) lc(red) ) ///
>   (line `fit' agem if educ==1 & nomore==0, lp(solid) lc(green)) ///
>   (line `fit' agem if educ==0 & nomore==0, lp(dash) lc(green)) ///
>   , title("Contraceptive Use by Age, Education, and Preferences") ///      
>     xtitle(age) ytitle(logit)  legend(off) `more' ///
>     xsc(range(15 50)) text(`text', placement(r) size(vsmall)) 
 12. end

The plot combines four scatterplots and four line plots, one for each subgroup defined by education and desire for more children. The command takes as arguments the names of variables with the observed and fitted value and an optional string to be passed along as an option to the graph twoway command. I use the same markers as in the notes, but with what I hope is a better legend.

So here’s our first plot

. recode age (1=20) (2=27.5) (3=35) (4=45), gen(agem)
(16 differences between age and agem)

. gen obs = log(users/nonusers)
(1 missing value generated)

. predict lfit31, xb
(1 missing value generated)

. pof obs lfit31 "subtitle(Model with Age by Preferences Interaction)"

. graph export fig34.png, width(500) replace
file fig34.png saved as PNG format

I often find that interpretation of the interactions is more direct if I combine them with the main effects. Here is the same model showing the difference in logits by desire for more children in each age group, reproducing the results in Table 3.15

. glm users i.age i.age#c.nomore educ, family(binomial n)

Iteration 0:   log likelihood = -42.121099  
Iteration 1:   log likelihood = -42.068742  
Iteration 2:   log likelihood =  -42.06873  
Iteration 3:   log likelihood =  -42.06873  

Generalized linear models                         Number of obs   =         16
Optimization     : ML                             Residual df     =          7
                                                  Scale parameter =          1
Deviance         =  12.62955292                   (1/df) Deviance =   1.804222
Pearson          =  13.06280922                   (1/df) Pearson  =   1.866116

Variance function: V(u) = u*(1-u/n)               [Binomial]
Link function    : g(u) = ln(u/(n-u))             [Logit]

                                                  AIC             =   6.383591
Log likelihood   = -42.06873029                   BIC             =  -6.778568

─────────────┬────────────────────────────────────────────────────────────────
             │                 OIM
       users │ Coefficient  std. err.      z    P>|z|     [95% conf. interval]
─────────────┼────────────────────────────────────────────────────────────────
         age │
      25-29  │   .3946039   .2014504     1.96   0.050    -.0002315    .7894394
      30-39  │   .5466635   .1984206     2.76   0.006     .1577663    .9355607
      40-49  │   .5795235   .3474172     1.67   0.095    -.1014017    1.260449
             │
age#c.nomore │
        <25  │   .0662197   .3307064     0.20   0.841    -.5819529    .7143922
      25-29  │   .3253996   .2419217     1.35   0.179    -.1487581    .7995574
      30-39  │   1.178882   .1748169     6.74   0.000      .836247    1.521517
      40-49  │   1.427894   .3538467     4.04   0.000     .7343668    2.121421
             │
        educ │   .3406479   .1257653     2.71   0.007     .0941525    .5871432
       _cons │  -1.803172   .1801786   -10.01   0.000    -2.156315   -1.450028
─────────────┴────────────────────────────────────────────────────────────────

. estimates store mint

. di exp(_b[educ])
1.4058581

. mata exp(st_matrix("e(b)")[5..8])
                 1             2             3             4
    ┌─────────────────────────────────────────────────────────┐
  1 │  1.068461402   1.384583879   3.250737129   4.169906768  │
    └─────────────────────────────────────────────────────────┘

.     

We find 34% higher odds of using contraception among women with some education, compared to women with no education in the same age group and category of desire. We also see that the odds of using contraception among women who want no more children are higher than among women who want more children in the same age and category of education, 7% higher under age 25, 38% higher at age 25-29, three times as high for women in their thirties, and four times as high among women in their forties.

This model passes the conventional goodness of fit test and therefore provides a reasonable description of contraceptive use by age, education, and desire for more children.

All Three Two-Factor Interactions

As explained in the notes, there is some evidence that education may interact with the other two variables. The model with all three two-factor interactions provides the best fit, with a deviance of 2.44 on three d.f., but is substantially more complex.

Rather than present parameter estimates, I will reproduce Figure 3.5, which provides some hints on how the model could be simplified. Thanks to our pof command this is now an easy task:

. quietly glm users i.age educ nomore ///
> i.age#c.educ i.age#c.nomore c.educ#c.nomore, family(binomial n)

. predict lfit32, xb 
(1 missing value generated)

. pof obs lfit32 "subtitle(All Two-Factor Interactions)"

. graph export fig35.png, width(500) replace        
file fig35.png saved as PNG format

A picture is indeed worth a thousand words. We see that among women who want no more children, contraceptive use increases almost linearly (in the logit scale) with age, with no differences by education except in the oldest age group, where use flattens for women with no education. Among women who do want more children, contraceptive use is generally lower, and increases more slowly with age; there are some differences by education, and these are higher among older women. There’s also a hint of curvature by age for women with no education who want more children.

A Parsimonious Model

These observations suggest ways to simplify the model. The age interactions are quite simple: the increase with age is steeper among women who want no more children, and the difference by education is larger among women in their forties. Similarly, the educational difference is larger in use for spacing and among older women.

One way to capture these features is to use a quadratic on age, allow the slope (but not the curvature) to vary by desire for more children, and introduce effects of education only for spacing and after age 40 (and thus not for limiting before age 40). To facilitate interpretation of the resulting parameters I center age around 30:

. gen agemc   = agem - 30
(1 missing value generated)

. gen agemcsq = agemc^2
(1 missing value generated)

. gen educ_spacers = educ * (1-nomore)
(1 missing value generated)

. gen educ_forties = educ * (age==4)
(1 missing value generated)

So here is a more parsimonious model

. glm users c.agemc##c.nomore agemcsq c.educ_spacers educ_forties, ///
>   family(binomial n)    

Iteration 0:   log likelihood = -38.692269  
Iteration 1:   log likelihood = -38.686338  
Iteration 2:   log likelihood = -38.686338  

Generalized linear models                         Number of obs   =         16
Optimization     : ML                             Residual df     =          9
                                                  Scale parameter =          1
Deviance         =  5.864768279                   (1/df) Deviance =   .6516409
Pearson          =  5.950841275                   (1/df) Pearson  =   .6612046

Variance function: V(u) = u*(1-u/n)               [Binomial]
Link function    : g(u) = ln(u/(n-u))             [Logit]

                                                  AIC             =   5.710792
Log likelihood   = -38.68633797                   BIC             =  -19.08853

─────────────┬────────────────────────────────────────────────────────────────
             │                 OIM
       users │ Coefficient  std. err.      z    P>|z|     [95% conf. interval]
─────────────┼────────────────────────────────────────────────────────────────
       agemc │   .0247552   .0118884     2.08   0.037     .0014543    .0480561
      nomore │   .9804174   .1790475     5.48   0.000     .6294907    1.331344
             │
     c.agemc#│
    c.nomore │    .058961   .0183799     3.21   0.001     .0229371    .0949849
             │
     agemcsq │  -.0034306   .0010318    -3.32   0.001    -.0054529   -.0014083
educ_spacers │    .432112   .1808991     2.39   0.017     .0775563    .7866677
educ_forties │   .9798156   .3462926     2.83   0.005     .3010945    1.658537
       _cons │  -1.339265   .1578254    -8.49   0.000    -1.648597   -1.029933
─────────────┴────────────────────────────────────────────────────────────────

This model has only seven parameters and a deviance of 5.9 on 9 d.f., so it is much simpler than the previous model and fits pretty well. Obviously we can’t take the test seriously because we didn’t specify these terms in advance, but the exercise shows how one can simplify a model by capturing its essential features. Before we interpret the coefficients let us check the fitted values

. predict lfit33, xb
(1 missing value generated)

. pof obs lfit33 "subtitle(A Simplified Model)"

. graph export fig35b.png, width(500) replace
file fig35b.png saved as PNG format

We see that the model provides almost the same fit as the much more complex model of the previous subsection.

. di exp(_b[agemc])
1.0250641

. di exp(_b[nomore]), exp(_b[c.agemc#c.nomore])
2.6655686 1.0607339

. di exp(_b[educ_spacers]), exp(_b[educ_forties])
1.5405076 2.6639649

Returning to the parameter estimates, we see that contraceptive use generally increases with age, with an increment in the odds of about 2.5 percent per year at age 30 (less at younger and older ages, with differences noted below after age 40). Use is much higher among women who want no more children, with an odds ratio of 2.7 at age 30, increasing about six percent per year of age. Women with some education are more likely to use contraception for spacing purposes, with an odds ratio of 1.5, and are also more likely to use for either spacing or limiting after age 40, with an odds ratio of 2.7 (which makes the odds ratio by education for spacers after age 40 just above four).

Alternative model simplifications are given in the notes.

Weighted Individual Data

As promised, we show briefly how one can obtain the same results using a dataset with one row for each combination of predictors and response, with a column indicating the frequency of that combination, effectively simulating individual data. This lets us use the logit or logistic commands in Stata.

We will illustrate the equivalence using the model with a main effect of education and an interaction between age and wanting no more children, which we kept as mint.

. //    use https://grodri.github.io/datasets/cuse, clear
. use cuse, clear
(Contraceptive Use Data (Fiji, 1976))

. gen nomore = desire==1

. list in 1/4

     ┌──────────────────────────────────────────────────────────────────┐
     │ age                    educ          desire   cuse    n   nomore │
     ├──────────────────────────────────────────────────────────────────┤
  1. │ <25   Lower primary or less      Wants more     No   53        0 │
  2. │ <25   Lower primary or less      Wants more    Yes    6        0 │
  3. │ <25   Lower primary or less   Wants no more     No   10        1 │
  4. │ <25   Lower primary or less   Wants no more    Yes    4        1 │
     └──────────────────────────────────────────────────────────────────┘

. logit cuse educ i.age i.age#c.nomore [fw=n]

Iteration 0:   log likelihood = -1001.8468  
Iteration 1:   log likelihood = -926.33767  
Iteration 2:   log likelihood = -925.27593  
Iteration 3:   log likelihood = -925.27536  
Iteration 4:   log likelihood = -925.27536  

Logistic regression                                     Number of obs =  1,607
                                                        LR chi2(8)    = 153.14
                                                        Prob > chi2   = 0.0000
Log likelihood = -925.27536                             Pseudo R2     = 0.0764

─────────────┬────────────────────────────────────────────────────────────────
        cuse │ Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
─────────────┼────────────────────────────────────────────────────────────────
        educ │   .3406479   .1257653     2.71   0.007     .0941525    .5871432
             │
         age │
      25-29  │   .3946039   .2014504     1.96   0.050    -.0002315    .7894394
      30-39  │   .5466635   .1984206     2.76   0.006     .1577663    .9355607
      40-49  │   .5795235   .3474172     1.67   0.095    -.1014017    1.260449
             │
age#c.nomore │
        <25  │   .0662197   .3307064     0.20   0.841    -.5819529    .7143922
      25-29  │   .3253996   .2419217     1.35   0.179    -.1487581    .7995574
      30-39  │   1.178882   .1748169     6.74   0.000      .836247    1.521517
      40-49  │   1.427894   .3538467     4.04   0.000     .7343668    2.121421
             │
       _cons │  -1.803172   .1801786   -10.01   0.000    -2.156315   -1.450028
─────────────┴────────────────────────────────────────────────────────────────

. estimates store mintlong

. estimates table mint mintlong, equation(1) se 

─────────────┬──────────────────────────
    Variable │    mint       mintlong   
─────────────┼──────────────────────────
         age │
      25-29  │  .39460393    .39460393  
             │  .20145036    .20145036  
      30-39  │  .54666349    .54666349  
             │  .19842057    .19842057  
      40-49  │  .57952354    .57952354  
             │  .34741724    .34741724  
             │
age#c.nomore │
        <25  │  .06621967    .06621967  
             │  .33070636    .33070636  
      25-29  │  .32539965    .32539965  
             │  .24192168    .24192168  
      30-39  │  1.1788818    1.1788818  
             │  .17481688    .17481688  
      40-49  │  1.4278937    1.4278937  
             │  .35384674    .35384674  
             │
        educ │  .34064785    .34064785  
             │  .12576526    .12576526  
       _cons │ -1.8031718   -1.8031718  
             │  .18017863    .18017863  
─────────────┴──────────────────────────
                            Legend: b/se

As you can see, the estimates and standard errors are exactly the same as before. The deviance is different because in this dataset the saturated model would have a separate parameter for each woman. We can reproduce the deviance of 12.63 on 7 d.f. given earlier, by computing the difference in deviances (or twice the difference in log-likelihoods) between the model with a three factor interaction and this model:

. quietly logit cuse i.educ##i.age##c.nomore [fw=n]

. lrtest mintlong .

Likelihood-ratio test
Assumption: mintlong nested within .

 LR chi2(7) =  12.63
Prob > chi2 = 0.0817

Thus, working with grouped data gives exactly the same estimates as working with individual data, except of course for the deviances. Recall that deviances can be interpreted as goodness of fit tests only with grouped data, but differences in deviances between nested models can always be interpreted as likelihood ratio tests. We discuss how to test goodness of fit with individual data in Section 3.8.

Updated fall 2022