Interpreting Binary Outcome Models
Abadie (2003) is interested in the effect of being eligible for a 401(k) plan on an individual’s decision to save through an Individual Retirement Account (IRA). The idea is to see if eligibility for a workplace retirement plan “crowds out” other forms of retirement savings.
Estimates of the Effect of 401(k) Eligibility on Saving through an IRA
| | LPM | Logit | Probit |
|---|---|---|---|
| 401(k) Eligibility | 0.057*** | 0.355*** | 0.206*** |
| | (0.010) | (0.058) | (0.034) |
| Income (000s) | 0.006*** | 0.033*** | 0.019*** |
| | (0.000) | (0.001) | (0.001) |
| Married | -0.017* | -0.072 | -0.042 |
| | (0.009) | (0.065) | (0.037) |
| Male | 0.007 | 0.035 | 0.021 |
| | (0.011) | (0.074) | (0.042) |
| Age | 0.009*** | 0.054*** | 0.031*** |
| | (0.000) | (0.003) | (0.002) |
| Num.Obs. | 9275 | 9275 | 9275 |
| R2 | 0.177 | 0.156 | 0.157 |

Note: standard errors in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01
Compare the estimated coefficients on p401k across the three models. Do they agree on the direction and statistical significance of the effect of 401(k) eligibility on IRA participation?
Compare the estimated coefficients on inc across the three models. Do they agree on the direction and statistical significance of the effect of income on IRA participation?
Interpret the coefficient on p401k from the LPM. What does this coefficient tell you about the effect of 401(k) eligibility on the probability of IRA participation?
Are the R-squared values comparable across the three models? Why or why not?
Model Estimation
For this question, you will use the Bertrand and Mullainathan (2004) dataset, lakisha_aer.dta, which was used for the examples in the lecture.
You are interested in the effect of perceived race on the probability of receiving a callback for a job interview. The key variables are call (1 if callback, 0 otherwise) and race (‘b’ for black-sounding name, ‘w’ for white-sounding name).
- Estimate a Linear Probability Model (LPM) regressing call on race.
- Estimate a Logit model with the same variables.
- Estimate a Probit model with the same variables.
For each model, report the estimated coefficient on the race variable and its p-value. Do the models agree on the direction and statistical significance of the effect?
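As a sanity check on whatever software you use, note that with a single binary regressor all three models have closed-form slopes: the LPM slope is the difference in callback rates, the Logit slope is the log odds ratio, and the Probit slope is the difference in inverse-normal-CDF values of the two rates. The sketch below illustrates this with made-up counts; the function name `single_dummy_fits` and the numbers are hypothetical and are not taken from lakisha_aer.dta.

```python
import math
from statistics import NormalDist


def single_dummy_fits(n0, k0, n1, k1):
    """Closed-form slopes for a binary outcome regressed on one 0/1 regressor.

    n0, k0: observations and successes when the regressor equals 0.
    n1, k1: observations and successes when the regressor equals 1.
    """
    p0, p1 = k0 / n0, k1 / n1

    def logit(p):
        return math.log(p / (1 - p))

    probit = NormalDist().inv_cdf  # inverse standard normal CDF
    return {
        "lpm": p1 - p0,                   # difference in proportions
        "logit": logit(p1) - logit(p0),   # log odds ratio
        "probit": probit(p1) - probit(p0),
    }


# Hypothetical counts for illustration only.
fits = single_dummy_fits(n0=1000, k0=100, n1=1000, k1=60)
for name, slope in fits.items():
    print(f"{name:>6}: slope = {slope:+.4f}")
```

The three slopes share a sign but sit on different scales, which is why they should not be compared directly.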
Marginal Effects vs. Coefficients
The coefficients from Logit and Probit models are not directly interpretable as marginal effects.
- Calculate the Average Marginal Effect (AME) of the race variable from your Logit and Probit models in the previous question.
- Interpret the AME from the Probit model. What does this number tell you about the difference in callback probabilities between resumes with white-sounding and black-sounding names?
- How do the AMEs from the Logit and Probit models compare to each other? How do they compare to the coefficient on race from the LPM in Question 4? Discuss your findings.
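For a 0/1 regressor, one common way to define the AME is the average, over observations, of the change in predicted probability when the regressor switches from 0 to 1. The following is a minimal sketch of that calculation; the helper `average_marginal_effect` and the coefficient values are hypothetical and chosen only for illustration, not estimates from the callback data.

```python
import math
from statistics import NormalDist


def logistic_cdf(z):
    """Logistic CDF, the link used by the Logit model."""
    return 1.0 / (1.0 + math.exp(-z))


def average_marginal_effect(base_indices, slope, cdf):
    """Average discrete change in P(y=1) when a 0/1 regressor goes from 0 to 1.

    base_indices: each observation's linear index with the regressor set to 0.
    slope: coefficient on the 0/1 regressor.
    cdf: link function (logistic CDF for Logit, normal CDF for Probit).
    """
    changes = [cdf(b + slope) - cdf(b) for b in base_indices]
    return sum(changes) / len(changes)


# Hypothetical coefficients; with no other regressors every observation
# shares the same base index (the intercept).
logit_ame = average_marginal_effect([-2.20], -0.55, logistic_cdf)
probit_ame = average_marginal_effect([-1.28], -0.27, NormalDist().cdf)
print(f"Logit AME:  {logit_ame:+.4f}")
print(f"Probit AME: {probit_ame:+.4f}")
```

Although the two sets of coefficients differ in scale, the implied changes in probability are close to each other in this example.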
The Pros and Cons of Simplicity
The lecture introduced the Linear Probability Model (LPM) as a straightforward way to handle binary outcomes using OLS.
- What are the two main advantages of the LPM, particularly concerning estimation and interpretation?
- What are its two primary shortcomings, as discussed in the lecture? Explain why each is a problem.
- One of these shortcomings can be partially addressed using robust standard errors. Which one is it, and why does this fix work?
The Latent Variable Framework
Both Probit and Logit models are motivated by an underlying latent variable, \(y^*\).
- In your own words, explain the concept of a latent variable \(y^*\) and how it relates to the observed binary outcome \(y\).
- The lecture states that \(P(y_i=1) = P(\epsilon_i > -X_i'\beta)\). What is the final step needed to get from this expression to the specific functional forms for the Probit and Logit models? What key assumption distinguishes the two models?
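The link between \(y^*\) and \(y\) can be explored by simulation. The sketch below assumes, for illustration, that \(\epsilon_i\) is standard normal and fixes a hypothetical value of \(X_i'\beta\) (the variable `index`); it then compares the simulated share of \(y_i = 1\) with the standard normal CDF evaluated at that index.

```python
import random
from statistics import NormalDist

random.seed(0)

index = 0.5   # hypothetical value of X_i'beta
n = 50_000    # number of simulated draws

# y = 1 whenever the latent variable y* = index + epsilon is positive.
hits = sum(1 for _ in range(n) if index + random.gauss(0.0, 1.0) > 0)
share = hits / n

print(f"Simulated share of y = 1: {share:.4f}")
print(f"Normal CDF at the index:  {NormalDist().cdf(index):.4f}")
```

Swapping the normal draw for a different symmetric error distribution would change the CDF that the simulated share matches.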
Understanding Maximum Likelihood
Probit and Logit models are estimated using Maximum Likelihood Estimation (MLE), not OLS.
- Explain the fundamental goal of MLE. How does its objective differ from the objective of OLS (which minimizes the sum of squared residuals)?
- Using the “biased coin” example from the lecture (observing 7 heads in 10 flips), explain how you would construct the likelihood function. What value of p (the probability of heads) does MLE tell us is the best estimate, and why is this intuitive?
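The coin example can also be checked numerically. This is a minimal sketch that evaluates the log-likelihood of 7 heads in 10 flips on a grid of candidate values of p and picks the largest; the binomial coefficient is omitted because it does not depend on p, and the grid resolution is an arbitrary choice.

```python
import math

heads, flips = 7, 10


def log_likelihood(p):
    """Log-likelihood of the observed flips, up to a constant that does not depend on p."""
    return heads * math.log(p) + (flips - heads) * math.log(1 - p)


# Candidate values 0.01, 0.02, ..., 0.99.
grid = [i / 100 for i in range(1, 100)]
best_p = max(grid, key=log_likelihood)
print(f"Grid value with the highest likelihood: {best_p}")
```

A grid search is used here only for transparency; Probit and Logit software typically relies on iterative numerical optimizers instead.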
The Complexity of Interaction Terms
Suppose you were to estimate a Probit model:
\[
P(\text{call}=1 | X) = \Phi(\beta_0 + \beta_1 \text{race}_i + \beta_2 \text{female}_i + \beta_3 (\text{race}_i \times \text{female}_i))
\]
Explain why you cannot simply look at the sign and significance of \(\hat{\beta}_3\) to determine the sign and significance of the interaction effect on the probability of receiving a callback. How does this differ fundamentally from the interpretation of an interaction term in an LPM?
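A numerical experiment may help when thinking about this question. The sketch below computes the cross-difference in predicted probabilities for two 0/1 regressors in a Probit model, using hypothetical coefficients in which \(\beta_3\) is set exactly to zero; the function name `probit_interaction_effect` is illustrative.

```python
from statistics import NormalDist

Phi = NormalDist().cdf  # standard normal CDF


def probit_interaction_effect(b0, b1, b2, b3):
    """Cross-difference in P(y=1) for two 0/1 regressors in a Probit model.

    This is the effect of the first regressor when the second equals 1,
    minus the effect of the first regressor when the second equals 0.
    """
    effect_when_second_is_1 = Phi(b0 + b1 + b2 + b3) - Phi(b0 + b2)
    effect_when_second_is_0 = Phi(b0 + b1) - Phi(b0)
    return effect_when_second_is_1 - effect_when_second_is_0


# Hypothetical coefficients with the interaction coefficient set to zero.
effect = probit_interaction_effect(b0=-1.5, b1=0.5, b2=0.5, b3=0.0)
print(f"Interaction effect on the probability scale: {effect:.4f}")
```

Try changing `b0` to see how the result moves even though `b3` stays fixed.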
Censoring and the Tobit Model
A health economist is studying individual annual out-of-pocket spending on prescription medication. A significant fraction of the sample (30%) has zero expenditure because they did not purchase any medication.
- The researcher considers dropping the zero-expenditure observations and running an OLS regression on the remaining positive values. Why would this lead to biased estimates?
- The researcher then considers running OLS on the full sample, including the zeros. Why is this also problematic for estimating the relationship between income and medication spending?
- Explain why the Tobit model is designed to handle this specific type of data problem.
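The first two bullets can be explored with simulated data. The sketch below assumes a hypothetical latent spending equation with a known slope, censors the outcome at zero, and then runs OLS three ways: on the uncensored latent variable, on the full censored sample, and on the positive observations only. All parameter values and variable names are made up for illustration.

```python
import random

random.seed(1)


def ols_slope(xs, ys):
    """Slope from a simple OLS regression of ys on xs (with an intercept)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


true_slope = 1.0
income = [random.uniform(0, 10) for _ in range(20_000)]
# Latent spending can be negative; observed spending is censored at zero.
latent = [-3.0 + true_slope * x + random.gauss(0, 3) for x in income]
observed = [max(0.0, v) for v in latent]

positive = [(x, y) for x, y in zip(income, observed) if y > 0]
slope_latent = ols_slope(income, latent)
slope_full = ols_slope(income, observed)
slope_pos = ols_slope([x for x, _ in positive], [y for _, y in positive])

print(f"Share of zeros:              {1 - len(positive) / len(observed):.3f}")
print(f"OLS slope, latent outcome:   {slope_latent:.3f}")
print(f"OLS slope, full sample:      {slope_full:.3f}")
print(f"OLS slope, positives only:   {slope_pos:.3f}")
```

Comparing the last two slopes with the first gives a concrete sense of the direction of the bias in each approach.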
Estimating a Tobit Model
This question uses the Mroz dataset, which contains data on the labor force participation of married women in 1975.
The primary variable of interest is hours, representing the number of hours the woman worked in 1975. A significant number of women in the sample did not work, resulting in hours = 0. We want to model the factors that determine hours worked.
Load the Mroz dataset. Create a histogram or frequency table for the hours variable. Based on the distribution you observe, explain precisely why a Tobit model is a more appropriate choice than a standard OLS regression for this research question.
Estimate a Tobit model where hours is the dependent variable. Use the following as independent variables: educ (years of education), exper (actual labor market experience), nwifeinc (non-wife household income, in thousands), and kidslt6 (number of children less than 6 years old). Report the estimated coefficients and their statistical significance.
Focus on the estimated coefficient for kidslt6. Provide a careful interpretation of this coefficient. What does this coefficient tell you about the relationship between having young children and the latent propensity or desired hours of work?
Now estimate an OLS model with the same dependent and independent variables. Compare the OLS coefficients with those from the Tobit model. Is this a valid comparison?
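When comparing the two sets of estimates, it may help to see what the Tobit estimator actually maximizes. The following is a minimal sketch of the log-likelihood for a Tobit model censored from below at zero with a single regressor; the function `tobit_loglik`, the toy data, and the parameter values are hypothetical, and a real analysis would use an established Tobit routine with all four regressors.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def tobit_loglik(beta0, beta1, sigma, data):
    """Log-likelihood of a Tobit model censored at zero, with one regressor.

    data: iterable of (x, y) pairs, where y = 0 marks a censored observation.
    """
    total = 0.0
    for x, y in data:
        mu = beta0 + beta1 * x  # mean of the latent variable
        if y <= 0:
            # Censored observation: probability that the latent variable is <= 0.
            total += math.log(STD_NORMAL.cdf(-mu / sigma))
        else:
            # Uncensored observation: normal density of y around mu.
            total += math.log(STD_NORMAL.pdf((y - mu) / sigma) / sigma)
    return total


# Toy (x, y) data for illustration only, not drawn from the Mroz dataset.
toy = [(1.0, 0.0), (2.0, 0.0), (3.0, 1.5), (4.0, 2.0), (5.0, 4.5)]
ll_a = tobit_loglik(-2.0, 1.0, 1.0, toy)
ll_b = tobit_loglik(0.0, 0.0, 1.0, toy)
print(f"Log-likelihood with slope 1: {ll_a:.3f}")
print(f"Log-likelihood with slope 0: {ll_b:.3f}")
```

The zeros enter through a probability term rather than a squared residual, so the coefficients describe the latent variable rather than observed hours.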