6. Multinomial Response Models

This section deals with regression models for discrete data with more than two response categories, where the assumption of a multinomial distribution is appropriate. We will consider multinomial logits for nominal data, and ordered logit models for ordinal data, with a brief mention of alternative-specific conditional logit models. We will also consider sequential logit models. (In line with the current syllabus we are skipping log-linear models for contingency tables, and thus their relationship with multinomial logit models.)

6.1 The Nature of Multinomial Data

We start by reading the data on contraceptive choice by age for currently married women in El Salvador, 1985, found in Table 6.1 of the lecture notes. The data are in “long” format with one row for each combination of predictor and response, showing the age group, method choice, and number of cases. In R we reshape the data so each method choice is a column, a layout that works better with the functions we will use.

> library(haven)
> library(tidyr)
> cuselong <- read_dta("https://grodri.github.io/datasets/elsalvador1985.dta")
> cuse <- pivot_wider(cuselong, names_from=cuse, values_from=cases)
> names(cuse)[2:4] <- c("ster", "other", "none")
> cuse

# A tibble: 7 × 4
       ageg  ster other  none
  <dbl+lbl> <dbl> <dbl> <dbl>
1 1 [15-19]     3    61   232
2 2 [20-24]    80   137   400
3 3 [25-29]   216   131   301
4 4 [30-34]   268    76   203
5 5 [35-39]   197    50   188
6 6 [40-44]   150    24   164
7 7 [45-49]    91    10   183

With only one predictor, this example affords limited opportunities for interpreting coefficients, but will allow us to focus on the outcome and the comparisons underlying each type of model.

Updated fall 2022