Germán Rodríguez
Generalized Linear Models Princeton University

8.3 Longitudinal Logits

We will work with a dataset on union membership used in the Stata manuals and in my own paper on intra-class correlation for binary data. This is a subsample of the National Longitudinal Survey of Youth (NLSY) and has union membership information from 1970-88 for 4,434 women aged 14-26 in 1968. The data are available in the Stata website

. use https://www.stata-press.com/data/r14/union, clear
(NLS Women 14-24 in 1968)

Logits

Here is a logit model

. gen southXt = south*year

. logit union age grade not_smsa south##c.year

Iteration 0:   log likelihood =  -13864.23  
Iteration 1:   log likelihood = -13547.326  
Iteration 2:   log likelihood = -13542.493  
Iteration 3:   log likelihood =  -13542.49  
Iteration 4:   log likelihood =  -13542.49  

Logistic regression                                     Number of obs = 26,200
                                                        LR chi2(6)    = 643.48
                                                        Prob > chi2   = 0.0000
Log likelihood = -13542.49                              Pseudo R2     = 0.0232

─────────────┬────────────────────────────────────────────────────────────────
       union │ Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
─────────────┼────────────────────────────────────────────────────────────────
         age │   .0207656   .0050001     4.15   0.000     .0109655    .0305658
       grade │   .0489121   .0064308     7.61   0.000      .036308    .0615162
    not_smsa │  -.2195163   .0355977    -6.17   0.000    -.2892865   -.1497461
     1.south │  -1.530743   .4387964    -3.49   0.000    -2.390768   -.6707176
        year │  -.0145607   .0057128    -2.55   0.011    -.0257576   -.0033638
             │
south#c.year │
          1  │   .0110577   .0054775     2.02   0.044      .000322    .0217933
             │
       _cons │  -1.066526   .3414275    -3.12   0.002    -1.735711   -.3973401
─────────────┴────────────────────────────────────────────────────────────────

. estimates store logit

Fixed-Effects

Let us try a fixed-effects model first.

. xtlogit union age grade not_smsa south##c.year, i(id) fe
note: multiple positive outcomes within groups encountered.
note: 2,744 groups (14,165 obs) omitted because of all positive or
      all negative outcomes.

Iteration 0:   log likelihood = -4516.5881  
Iteration 1:   log likelihood = -4510.8906  
Iteration 2:   log likelihood =  -4510.888  
Iteration 3:   log likelihood =  -4510.888  

Conditional fixed-effects logistic regression        Number of obs    = 12,035
Group variable: idcode                               Number of groups =  1,690

                                                     Obs per group:
                                                                  min =      2
                                                                  avg =    7.1
                                                                  max =     12

                                                     LR chi2(6)       =  78.60
Log likelihood = -4510.888                           Prob > chi2      = 0.0000

─────────────┬────────────────────────────────────────────────────────────────
       union │ Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
─────────────┼────────────────────────────────────────────────────────────────
         age │   .0710973   .0960536     0.74   0.459    -.1171643    .2593589
       grade │   .0816111   .0419074     1.95   0.051    -.0005259     .163748
    not_smsa │   .0224809   .1131786     0.20   0.843     -.199345    .2443069
     1.south │  -2.856488   .6765694    -4.22   0.000    -4.182539   -1.530436
        year │  -.0636853   .0967747    -0.66   0.510    -.2533602    .1259896
             │
south#c.year │
          1  │   .0264136   .0083216     3.17   0.002     .0101036    .0427235
─────────────┴────────────────────────────────────────────────────────────────

. estimates store fixed

Note how we lost 62% of our sample, down from 4434 to 1690 women. The 2744 drop outs are women who didn’t have variation in union membership over time. We will compare the estimates later.

Random-Effects

Now we fit a random-effects model:

. xtlogit union age grade not_smsa south##c.year, i(id) re nolog

Random-effects logistic regression                   Number of obs    = 26,200
Group variable: idcode                               Number of groups =  4,434

Random effects u_i ~ Gaussian                        Obs per group:
                                                                  min =      1
                                                                  avg =    5.9
                                                                  max =     12

Integration method: mvaghermite                      Integration pts. =     12

                                                     Wald chi2(6)     = 227.46
Log likelihood = -10540.274                          Prob > chi2      = 0.0000

─────────────┬────────────────────────────────────────────────────────────────
       union │ Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
─────────────┼────────────────────────────────────────────────────────────────
         age │   .0156732   .0149895     1.05   0.296    -.0137056     .045052
       grade │   .0870851   .0176476     4.93   0.000     .0524965    .1216738
    not_smsa │  -.2511884   .0823508    -3.05   0.002    -.4125929   -.0897839
     1.south │  -2.839112   .6413116    -4.43   0.000    -4.096059   -1.582164
        year │  -.0068604   .0156575    -0.44   0.661    -.0375486    .0238277
             │
south#c.year │
          1  │   .0238506   .0079732     2.99   0.003     .0082235    .0394777
             │
       _cons │  -3.009365   .8414963    -3.58   0.000    -4.658667   -1.360062
─────────────┼────────────────────────────────────────────────────────────────
    /lnsig2u │   1.749366   .0470017                      1.657245    1.841488
─────────────┼────────────────────────────────────────────────────────────────
     sigma_u │   2.398116   .0563577                      2.290162    2.511158
         rho │   .6361098   .0108797                      .6145307    .6571548
─────────────┴────────────────────────────────────────────────────────────────
LR test of rho=0: chibar2(01) = 6004.43                Prob >= chibar2 = 0.000

. estimates store random

Comparisons

Here’s a table comparing the estimates for logit, random and fixed-effects models. We use the equation option so Stata can match the estimates.

. estimates table logit random fixed, equation(1)

─────────────┬───────────────────────────────────────
    Variable │   logit        random       fixed     
─────────────┼───────────────────────────────────────
#1           │
         age │  .02076564    .01567318    .07109727  
       grade │   .0489121    .08708515    .08161106  
    not_smsa │ -.21951628   -.25118839    .02248093  
             │
       south │
          1  │ -1.5307427   -2.8391118   -2.8564878  
             │
        year │ -.01456067   -.00686044   -.06368531  
             │
south#c.year │
          1  │  .01105767    .02385056    .02641356  
             │
       _cons │ -1.0665258   -3.0093648               
─────────────┼───────────────────────────────────────
    /lnsig2u │               1.7493664               
─────────────┴───────────────────────────────────────

The main change is in the coefficient of not_smsa. You might think this indicates something wrong with the logit and random-effects models, but note that only women who have moved between standard metropolitan statistical areas and other places contribute to the fixed-effects estimate. It seems reasonable to believe that these women differ from the rest.

The random-effect coefficients are larger in magnitude than the ordinary logit coefficients. This is almost always the case. Omission of the random effect biases the coefficients towards zero.

Intra-class Correlation

The intra-class correlation of 0.636 in the random-effects model indicates a high correlation between a woman’s propensity to be a union member in different years, after controlling for education and residence.

My paper with Elo in the Stata journal shows how this coefficient can be interpreted in terms of an odds ratio, and translated into measures of manifest correlation such as Pearson’s r and Yule’s Q.

These measures can be calculated using xtrho, a post-estimation command available from the Stata journal website. In Stata type search xtrho, or net describe st0031, from(https://www.stata-journal.com/software/sj3-1). For the average woman the correlation between actual union membership in any two years is 0.42 using Pearson’s r and 0.78 using Yule’s Q.

Reference

Rodríguez, G. and Elo, I. (2003). “Intra-class correlation in random-effects models for binary data”. The Stata Journal,3(1):32-46. https://journals.sagepub.com/doi/pdf/10.1177/1536867X0300300102

Updated fall 2022