Statistics and Population

Lecture Notes


6. Multinomial Response Models

We now turn our attention to regression models for the analysis of categorical dependent variables with more than two response categories. Several of the models that we will study may be considered generalizations of logistic regression analysis to polychotomous data. We first consider models that may be used with purely qualitative or nominal data, and then move on to models for ordinal data, where the response categories are ordered.

6.1 The Nature of Multinomial Data

We start by introducing a simple dataset that we will use to illustrate the multinomial distribution and multinomial response models.

6.1.1 The Contraceptive Use Data

Table 6.1 was reconstructed from weighted percents found in Table 4.7 of the final report of the Demographic and Health Survey conducted in El Salvador in 1985 (FESAL-1985). The table shows 3165 currently married women classified by age, grouped in five-year intervals, and current use of contraception, classified as sterilization, other methods, and no method.

Table 6.1. Current Use of Contraception By Age
Currently Married Women. El Salvador, 1985

Age	Sterilization	Other method	No method	All
15–19	3	61	232	296
20–24	80	137	400	617
25–29	216	131	301	648
30–34	268	76	203	547
35–39	197	50	188	435
40–44	150	24	164	338
45–49	91	10	183	284
All	1005	489	1671	3165

A fairly standard approach to the analysis of data of this type could treat the two variables as responses and proceed to investigate the question of independence. For these data the hypothesis of independence is soundly rejected, with a likelihood ratio \( \chi^2 \) of 521.1 on 12 d.f.
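The likelihood ratio statistic quoted above can be checked directly from the counts in Table 6.1. The following is a minimal sketch in plain Python, assuming the usual form of the statistic, \( 2\sum o \log(o/e) \), with expected counts \( e \) computed from the row and column totals. The function name `lr_independence` is illustrative and not part of the original notes.

```python
import math

# Counts from Table 6.1: rows are age groups 15-19 ... 45-49,
# columns are sterilization, other method, no method.
counts = [
    [3, 61, 232],
    [80, 137, 400],
    [216, 131, 301],
    [268, 76, 203],
    [197, 50, 188],
    [150, 24, 164],
    [91, 10, 183],
]

def lr_independence(table):
    """Return (G2, df) for the likelihood ratio test of independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    g2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / n
            expected = row_totals[i] * col_totals[j] / n
            if observed > 0:  # a zero cell contributes nothing to the sum
                g2 += 2 * observed * math.log(observed / expected)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return g2, df

g2, df = lr_independence(counts)
print(f"G2 = {g2:.1f} on {df} d.f.")
```

With seven age groups and three methods the degrees of freedom are \( (7-1)(3-1) = 12 \), and the statistic should agree with the value reported in the text up to rounding.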

In this chapter we will view contraceptive use as the response and age as a predictor. Instead of looking at the joint distribution of the two variables, we will look at the conditional distribution of the response, contraceptive use, given the predictor, age. As it turns out, the two approaches are intimately related.
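To make the conditional view concrete, the sketch below divides each row of Table 6.1 by its total, giving the observed distribution of contraceptive use within each age group. It is only an illustration in plain Python. The helper name `conditional_proportions` is an assumption of this example, not something defined in the notes.

```python
age_groups = ["15-19", "20-24", "25-29", "30-34", "35-39", "40-44", "45-49"]
methods = ["sterilization", "other method", "no method"]

# Counts from Table 6.1, one row per age group, one column per method.
counts = [
    [3, 61, 232],
    [80, 137, 400],
    [216, 131, 301],
    [268, 76, 203],
    [197, 50, 188],
    [150, 24, 164],
    [91, 10, 183],
]

def conditional_proportions(table):
    """Divide each row by its total to get proportions within each group."""
    return [[y / sum(row) for y in row] for row in table]

proportions = conditional_proportions(counts)

print("age    " + "  ".join(methods))
for age, row in zip(age_groups, proportions):
    print(age, "  ".join(f"{p:.3f}" for p in row))
```

Each row of the result adds up to one. The proportion sterilized, for example, is about 1% among women aged 15–19 and about 49% among those aged 30–34.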

6.1.2 The Multinomial Distribution

Let us review briefly the multinomial distribution that we first encountered in Chapter 5. Consider a random variable \( Y_i \) that may take one of several discrete values, which we index \( 1, 2, \ldots, J \). In the example the response is contraceptive use and it takes the values ‘sterilization’, ‘other method’ and ‘no method’, which we index 1, 2 and 3. Let

\[\tag{6.1}\pi_{ij} = \mbox{Pr}\{ Y_i = j \}\]

denote the probability that the \( i \)-th response falls in the \( j \)-th category. In the example \( \pi_{i1} \) is the probability that the \( i \)-th respondent is ‘sterilized’.

Assuming that the response categories are mutually exclusive and exhaustive, we have \( \sum_{j=1}^J \pi_{ij} = 1 \) for each \( i \), i.e. the probabilities add up to one for each individual, so only \( J-1 \) of them are free parameters. In the example, once we know the probabilities of ‘sterilized’ and of ‘other method’, we automatically know by subtraction the probability of ‘no method’.

For grouped data it will be convenient to introduce auxiliary random variables representing counts of responses in the various categories. Let \( n_i \) denote the number of cases in the \( i \)-th group and let \( Y_{ij} \) denote the number of responses from the \( i \)-th group that fall in the \( j \)-th category, with observed value \( y_{ij} \).

In our example \( i \) represents age groups, \( n_i \) is the number of women in the \( i \)-th age group, and \( y_{i1}, y_{i2}, \) and \( y_{i3} \) are the numbers of women sterilized, using another method, and using no method, respectively, in the \( i \)-th age group. Note that \( \sum_j y_{ij} = n_i \), i.e. the counts in the various response categories add up to the number of cases in each age group.

For individual data \( n_i=1 \) and \( Y_{ij} \) becomes an indicator (or dummy) variable that takes the value \( 1 \) if the \( i \)-th response falls in the \( j \)-th category and \( 0 \) otherwise, and \( \sum_j y_{ij} = 1 \), since one and only one of the indicators \( y_{ij} \) can be ‘on’ for each case. In our example we could work with the 3165 records in the individual data file and let \( y_{i1} \) be one if the \( i \)-th woman is sterilized and 0 otherwise.
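The link between the grouped and individual representations can be sketched by expanding the counts of Table 6.1 into one record per woman. This is a hedged illustration rather than the actual survey file: the function `expand_to_indicators` is hypothetical, and the code uses zero-based indices for groups and categories where the text uses \( 1, 2, \ldots \).

```python
# Counts from Table 6.1: rows are age groups, columns are
# sterilization, other method, no method.
counts = [
    [3, 61, 232],
    [80, 137, 400],
    [216, 131, 301],
    [268, 76, 203],
    [197, 50, 188],
    [150, 24, 164],
    [91, 10, 183],
]

def expand_to_indicators(table):
    """Return one (group, indicators) record per case in a grouped table."""
    n_categories = len(table[0])
    records = []
    for group, row in enumerate(table):
        for j, count in enumerate(row):
            # Indicator vector with a single 1 in position j
            indicators = tuple(1 if k == j else 0 for k in range(n_categories))
            records.extend([(group, indicators)] * count)
    return records

records = expand_to_indicators(counts)

print(len(records))                           # number of individual records
print(sum(ind[0] for _, ind in records))      # number of women sterilized
print(all(sum(ind) == 1 for _, ind in records))
```

Summing the indicators within an age group recovers the grouped counts \( y_{ij} \), so the two representations carry the same information about the response.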

The probability distribution of the counts \( Y_{ij} \) given the total \( n_i \) is given by the multinomial distribution

\[\tag{6.2}\mbox{Pr}\{Y_{i1}=y_{i1}, \ldots, Y_{iJ}=y_{iJ} \} = {n_i \choose y_{i1}, \ldots, y_{iJ} } \pi_{i1}^{y_{i1}} \cdots \pi_{iJ}^{y_{iJ}}\]
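As a numerical illustration of Equation 6.2, the sketch below evaluates the multinomial probability for a single group using only the standard library. The function name `multinomial_pmf` and the probabilities in the example are hypothetical, chosen purely for illustration and not estimated from the data.

```python
import math

def multinomial_pmf(counts, probs):
    """Probability of the given category counts under Equation 6.2."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    n = sum(counts)
    # Multinomial coefficient n! / (y_1! ... y_J!), computed exactly
    coef = math.factorial(n)
    for y in counts:
        coef //= math.factorial(y)
    # Product of pi_j ** y_j over the categories
    prob = 1.0
    for y, pi in zip(counts, probs):
        prob *= pi ** y
    return coef * prob

# Hypothetical example: n = 3 cases, one in each of J = 3 categories.
p = multinomial_pmf([1, 1, 1], [0.5, 0.3, 0.2])
print(p)
```

In this example the coefficient is \( 3!/(1!\,1!\,1!) = 6 \) and the product of probabilities is \( 0.5 \times 0.3 \times 0.2 = 0.03 \), so the result is 0.18 up to floating-point rounding.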

The special case where \( J=2 \), so that there are only two response categories, is the binomial distribution of Chapter 3. To verify this, set \( y_{i1}=y_i \), \( y_{i2}=n_i-y_i \), \( \pi_{i1}=\pi_i \), and \( \pi_{i2} = 1-\pi_i \).
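Carrying out the substitution explicitly in Equation 6.2, and writing the multinomial coefficient in terms of factorials, gives

\[\mbox{Pr}\{Y_{i1}=y_i, Y_{i2}=n_i-y_i \} = \frac{n_i!}{y_i!\,(n_i-y_i)!}\, \pi_i^{y_i} (1-\pi_i)^{n_i-y_i} = {n_i \choose y_i} \pi_i^{y_i} (1-\pi_i)^{n_i-y_i},\]

which is the binomial probability of \( y_i \) successes in \( n_i \) trials.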
