2. Linear Models for Continuous Data
2.1. Introduction to Linear Models
2.2. Estimation of the Parameters
2.3. Tests of Hypotheses
2.4. Simple Linear Regression
2.5. Multiple Linear Regression
2.6. One-Way Analysis of Variance
2.7. Two-Way Analysis of Variance
2.8. Analysis of Covariance Models
2.9. Regression Diagnostics
2.10. Transforming the Data
3. Logit Models for Binary Data
3.1. Introduction to Logistic Regression
3.2. Estimation and Hypothesis Testing
3.3. The Comparison of Two Groups
3.4. The Comparison of Several Groups
3.5. Models With Two Predictors
3.6. Multi-factor Models: Model Selection
3.7. Other Choices of Link
3.8. Regression Diagnostics for Binary Data
4. Poisson Models for Count Data
4.1. Introduction to Poisson Regression
4.2. Estimation and Testing
4.3. A Model for Heteroscedastic Counts
4.A. Models for Over-Dispersed Count Data
6. Multinomial Response Models
6.1. The Nature of Multinomial Data
6.2. The Multinomial Logit Model
6.3. The Conditional Logit Model
6.4. The Hierarchical Logit Model
6.5. Models for Ordinal Response Data
7. Survival Models
7.1. The Hazard and Survival Functions
7.2. Censoring and The Likelihood Function
7.3. Approaches to Survival Modeling
7.4. The Piece-Wise Exponential Model
7.5. Infant and Child Mortality in Colombia
7.6. Discrete Time Models
A. Review of Likelihood Theory
A.1. Maximum Likelihood Estimation
A.2. Tests of Hypotheses
B. Generalized Linear Model Theory
B.1. The Model
B.2. Maximum Likelihood Estimation
B.3. Tests of Hypotheses
B.4. Binomial Errors and Link Logit
B.5. Poisson Errors and Link Log
The log-linear Poisson model described in the previous section is a generalized linear model with Poisson error and link log. Maximum likelihood estimation and testing follow immediately from the general results in Appendix B. In this section we review a few key results.
The likelihood function for $n$ independent Poisson observations is a member of the exponential family given by

$$\log L(\boldsymbol\beta) = \sum_{i=1}^{n} \{\, y_i \log(\mu_i) - \mu_i - \log(y_i!) \,\},$$

where the means $\mu_i$ depend on a vector of covariates $x_i$ and a vector of $p$ parameters $\boldsymbol\beta$ through the log link $\log(\mu_i) = x_i'\boldsymbol\beta$.
It is interesting to note that the log is the canonical link
for the Poisson distribution. Taking derivatives of the log-likelihood
function with respect to the elements of $\boldsymbol\beta$, and setting the
derivatives to zero, it turns out that the maximum likelihood estimates
satisfy the estimating equations

$$X'y = X'\hat{\boldsymbol\mu}, \qquad (4.3)$$

where $X$ is the model matrix, with one row for each observation and one column for each predictor, $y$ is the response vector, and $\hat{\boldsymbol\mu}$ is the vector of fitted values, calculated from the m.l.e.'s $\hat{\boldsymbol\beta}$ by exponentiating the linear predictor $\hat\eta = X\hat{\boldsymbol\beta}$.
To understand equation 4.3 it helps to consider a couple of
special cases. If the model includes a constant, then one of
the columns of the model matrix $X$ is a column of ones. Multiplying this column by the response vector $y$ produces the sum of the observations, and multiplying it by the fitted values $\hat{\boldsymbol\mu}$ produces the sum of the fitted values, so in models with a constant the m.l.e.'s reproduce the total count: $\sum y_i = \sum \hat\mu_i$.
As a second example, suppose the model includes a discrete factor
represented by a series of dummy variables taking the value one
for observations at a given level of the factor and zero otherwise.
Multiplying one of these dummy variables by the response vector $y$ produces the total count at the corresponding level of the factor, and multiplying it by $\hat{\boldsymbol\mu}$ produces the fitted total, so the estimating equations require the observed and fitted marginal totals to agree at every level of the factor.
This result generalizes to higher-order terms. Suppose we
entertain models with two discrete factors, say $A$ and $B$. The additive model $A + B$ reproduces the marginal totals of $A$ and of $B$, whereas a model with the interaction $A \cdot B$ reproduces the observed total in every cell of the cross-tabulation of $A$ by $B$.
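As a quick numerical check of this property, consider a one-factor model, which has a closed-form m.l.e.: the fitted value for every observation at level $j$ is the level mean $\bar y_j$, so the fitted and observed marginal totals match exactly. A minimal sketch in NumPy (the counts below are made up for illustration):

```python
import numpy as np

# Counts classified by a single factor with three levels (illustrative data).
y = np.array([12., 15., 7., 9., 11., 20., 18.])
level = np.array([0, 0, 1, 1, 1, 2, 2])

# M.l.e. under the one-factor model: each fitted value is its level mean.
mu_hat = np.array([y[level == j].mean() for j in level])

# The estimating equations X'y = X'mu reduce to matching marginal totals.
for j in np.unique(level):
    assert np.isclose(y[level == j].sum(), mu_hat[level == j].sum())

# The grand total is matched too (the model implicitly contains a constant).
assert np.isclose(y.sum(), mu_hat.sum())
```

The same check, applied to the fitted values of any log-linear model with a constant or a factor, verifies equation 4.3 term by term.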
In general, however, we will use the iteratively-reweighted
least squares (IRLS) algorithm discussed in Appendix B.
For Poisson data with link log, the working dependent variable $z$ has elements

$$z_i = \hat\eta_i + \frac{y_i - \hat\mu_i}{\hat\mu_i},$$

and the diagonal matrix $W$ of iterative weights has elements

$$w_{ii} = \hat\mu_i,$$

where $\hat\mu_i$ denotes the fitted values based on the current estimates of the parameters.
Initial values can be obtained by applying the link to the data, that is taking the log of the response, and regressing it on the predictors using OLS. To avoid problems with counts of 0, one can add a small constant to all responses. The procedure usually converges in a few iterations.
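The full procedure can be sketched in a few lines of NumPy. This is a minimal illustration of the IRLS algorithm for the log link, not a production implementation; the function name `poisson_irls` and the convergence settings are our own choices, and the model matrix is assumed to be of full rank.

```python
import numpy as np

def poisson_irls(X, y, tol=1e-8, max_iter=25):
    """Fit a log-linear Poisson model by IRLS.

    X : (n, p) model matrix; y : (n,) vector of counts.
    Returns the estimates beta and the fitted values mu.
    """
    # Initial values: regress log(y + 0.5) on X by OLS, as suggested above
    # (the 0.5 avoids taking the log of zero counts).
    beta = np.linalg.lstsq(X, np.log(y + 0.5), rcond=None)[0]
    for _ in range(max_iter):
        eta = X @ beta                 # linear predictor
        mu = np.exp(eta)               # fitted means under the log link
        z = eta + (y - mu) / mu        # working dependent variable
        XtW = X.T * mu                 # X'W with iterative weights w_ii = mu_i
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)  # weighted LS step
        if np.max(np.abs(beta_new - beta)) < tol:
            beta = beta_new
            break
        beta = beta_new
    return beta, np.exp(X @ beta)
```

On convergence the estimating equations 4.3 hold, so `X.T @ y` and `X.T @ mu` agree to numerical precision.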
A measure of discrepancy between observed and fitted values is the deviance. In Appendix B we show that for Poisson responses the deviance takes the form

$$D = 2 \sum \left\{ y_i \log\!\left(\frac{y_i}{\hat\mu_i}\right) - (y_i - \hat\mu_i) \right\}.$$
The first term is identical to the binomial deviance, representing ‘twice a sum of observed times log of observed over fitted’. The second term, a sum of differences between observed and fitted values, is usually zero, because m.l.e.’s in Poisson models have the property of reproducing marginal totals, as noted above.
For large samples the distribution of the deviance is approximately
a chi-squared with $n - p$ degrees of freedom, where $n$ is the number of observations and $p$ the number of parameters, so the deviance can be used directly to test the goodness of fit of the model.
An alternative measure of goodness of fit is Pearson’s chi-squared statistic, which is defined as

$$\chi^2_P = \sum \frac{(y_i - \hat\mu_i)^2}{\hat\mu_i}.$$
The numerator is the squared difference between observed and fitted values, and the denominator is the variance of the observed value. The Pearson statistic has the same form for Poisson and binomial data, namely a ‘sum of squared observed minus expected over expected’.
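Both statistics are straightforward to compute from the observed and fitted values. A small sketch (the function names are ours; the first deviance term is taken to be zero when $y_i = 0$, its limiting value):

```python
import numpy as np

def poisson_deviance(y, mu):
    """D = 2 * sum{ y*log(y/mu) - (y - mu) }; y*log(y/mu) -> 0 as y -> 0."""
    ratio = np.where(y > 0, y / mu, 1.0)     # placeholder 1.0 gives log = 0
    term = np.where(y > 0, y * np.log(ratio), 0.0)
    return 2.0 * np.sum(term - (y - mu))

def pearson_chi2(y, mu):
    """Sum of squared (observed - expected) over expected."""
    return np.sum((y - mu) ** 2 / mu)
```

Both statistics are zero when the fitted values reproduce the data exactly, and each would be compared with the chi-squared distribution on $n - p$ d.f.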
In large samples the distribution of Pearson’s statistic is
also approximately chi-squared with $n - p$ d.f. One advantage of the deviance over Pearson’s chi-squared is that it can be used to compare nested models, as described below.
Likelihood ratio tests for log-linear models can easily be constructed in terms of deviances, just as we did in logistic regression models. In general, the difference in deviances between two nested models has approximately in large samples a chi-squared distribution with degrees of freedom equal to the difference in the number of parameters between the models, under the assumption that the smaller model is correct.
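The mechanics of the test are simple enough to show inline. The deviances below are illustrative numbers, assumed to come from two nested fits; 3.84 is the 5% critical value of the chi-squared distribution with one degree of freedom.

```python
# Likelihood ratio test from the deviances of two nested Poisson models.
# The deviances and parameter counts below are illustrative, not real fits.
dev_small, p_small = 42.7, 3   # smaller (restricted) model
dev_large, p_large = 36.1, 4   # larger model, adding one parameter

lr_stat = dev_small - dev_large   # difference in deviances
df = p_large - p_small            # difference in number of parameters

# Compare with the chi-squared critical value: for 1 d.f. at the 5% level
# the critical value is 3.84, so here the extra parameter is significant.
significant = lr_stat > 3.84
```

In practice a chi-squared tail probability would be reported (e.g. via `scipy.stats.chi2.sf(lr_stat, df)`), but the comparison with the critical value conveys the idea.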
One can also construct Wald tests as we have done before,
based on the fact that in large samples the maximum likelihood estimator $\hat{\boldsymbol\beta}$ has approximately a multivariate normal distribution with mean equal to the true parameter value $\boldsymbol\beta$ and variance-covariance matrix $\operatorname{var}(\hat{\boldsymbol\beta}) = (X'WX)^{-1}$.
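Given the fitted values from an IRLS fit, the variance-covariance matrix and the resulting Wald z-statistics follow directly. A self-contained sketch under the usual large-sample assumptions (the data and the fixed iteration count are illustrative):

```python
import numpy as np

# Illustrative data: a constant and one continuous predictor.
X = np.column_stack([np.ones(8), np.arange(8.0)])
y = np.array([2., 3., 6., 7., 8., 9., 12., 15.])

# A few IRLS iterations with the log link, enough to converge here.
beta = np.linalg.lstsq(X, np.log(y + 0.5), rcond=None)[0]
for _ in range(25):
    mu = np.exp(X @ beta)
    z = X @ beta + (y - mu) / mu        # working dependent variable
    XtW = X.T * mu                      # iterative weights w_ii = mu_i
    beta = np.linalg.solve(XtW @ X, XtW @ z)

mu = np.exp(X @ beta)
cov = np.linalg.inv((X.T * mu) @ X)     # var(beta_hat) = (X'WX)^{-1}
se = np.sqrt(np.diag(cov))              # large-sample standard errors
wald_z = beta / se                      # z-statistic for H0: beta_j = 0
```

Each `wald_z[j]` would be referred to the standard normal distribution, exactly as in the logistic regression case.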