Microeconometrics 440.618.51
Problem Set 1
This problem set covers ordinary least squares, heteroskedasticity, and instrumental variables. You will need to read in the following ASCII data sets (.raw or .asc) from the Data Sets for problem sets module in Canvas: gpa2, qreg0902, and MROZ.
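As a starting point for reading whitespace-delimited ASCII files of this kind, here is a minimal sketch using NumPy. The sample rows, their values, and the number of columns are invented for illustration and do not come from any of the course data sets. It also assumes that missing values are coded as a single period.

```python
import io
import numpy as np

# Hypothetical stand-in for a few rows of a whitespace-delimited .raw file.
# "." marks a missing value. The real files have different columns and values.
sample = io.StringIO(
    "1 34 9.21 6.10\n"
    "2 51 8.75 .\n"
    "1 27 9.02 5.48\n"
)

# genfromtxt splits on whitespace by default; missing fields become NaN.
data = np.genfromtxt(sample, missing_values=".", filling_values=np.nan)

# Keep only rows with no missing values, e.g. before exponentiating a log.
complete = data[~np.isnan(data).any(axis=1)]
print(data.shape, complete.shape)
```

To read an actual file, pass its path in place of the `sample` object, and check the column order against the description file for that data set.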
1. You are interested in studying the extent to which factors observable at the point of application affect college academic performance. Assume $E(y \mid X) = X\beta$, where
$y_i$ = student i’s college GPA
$x_{1i}$ = size of student i’s high school graduating class (in hundreds of students)
$x_{2i}$ = student i’s academic percentile in high school class
$x_{3i}$ = student i’s SAT score
$x_{4i}$ = indicator equal to 1 if student i is female
$x_{5i}$ = indicator equal to 1 if student i is an athlete
(a) Interpret $\beta_1$.
(b) Interpret $\beta_5$.
(c) Using the data set gpa2 (on the Canvas site), estimate $\hat{\beta}$ by ordinary least squares.
(d) Interpret $\hat{\beta}_1$.
(e) Interpret $\hat{\beta}_5$.
(f) Test the null that the mean college academic performance of athletes is equivalent to that of non-athletes, conditional on other factors, at the 5% significance level.
(g) Drop the SAT variable from the model specification and retest the null of equivalence. What does this suggest, if anything?
(h) Do you think it is likely that the error is homoskedastic, that is, that the variance of GPA conditional on X is constant? If not, how would you suggest adjusting your estimators?
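The mechanics behind parts (c), (f), and (h) can be sketched on simulated data. The snippet below is only an illustration: the data-generating process, the variable names `sat_hundreds` and `athlete`, and all coefficient values are assumptions, not features of gpa2. The robust covariance shown is the HC1 variant, one of several common choices.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients, classical SEs, and HC1 robust SEs."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # Classical covariance: s^2 (X'X)^{-1}
    sigma2 = resid @ resid / (n - k)
    se_classical = np.sqrt(np.diag(sigma2 * XtX_inv))
    # Sandwich covariance with the HC1 degrees-of-freedom correction
    meat = (X * resid[:, None] ** 2).T @ X
    cov_hc1 = n / (n - k) * XtX_inv @ meat @ XtX_inv
    se_robust = np.sqrt(np.diag(cov_hc1))
    return beta, se_classical, se_robust

# Simulated data (hypothetical values, not the gpa2 data)
rng = np.random.default_rng(0)
n = 2000
sat_hundreds = rng.normal(10.0, 1.5, size=n)
athlete = (rng.random(n) < 0.2).astype(float)
gpa = 1.0 + 0.2 * sat_hundreds + 0.2 * athlete + rng.normal(0.0, 0.5, size=n)

X = np.column_stack([np.ones(n), sat_hundreds, athlete])
beta, se_classical, se_robust = ols(X, gpa)

# t statistic for H0: athlete coefficient = 0; compare |t| with 1.96
# (the large-sample 5% two-sided critical value).
t_athlete = beta[2] / se_classical[2]
print(beta, se_classical, se_robust, t_athlete)
```

Dropping a column from `X` and calling `ols` again mirrors the re-estimation asked for in part (g).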
2. Read in the qreg0902 data set (on the Canvas site in the Dataset Module). The variables are sex (1: M, 2: F), age, educyr98, farm (1 if farm, 0 otherwise), urban98 (1 if urban, 0 otherwise), hhsize (household size), lhhexp1 (ln(total expenditure)), lhhex12m (ln(medical expenditure) if positive, "." otherwise), and lnrlfood.
(a) Generate level variables for expenditure: total=exp(lhhexp1) and med=exp(lhhex12m). Perform ordinary least squares regression of medical expenditure on a constant and total expenditure. You should obtain a slope estimate of 0.0938.
(b) In theory, would you expect the errors in this regression to be homoskedastic or heteroskedastic? Explain.
(c) Perform a statistical test of the null of homoskedasticity (e.g., the Breusch-Pagan test).
(d) Plot the squared o.l.s. residuals against total expenditure to check visually for heteroskedasticity.
(e) Estimate the heteroskedasticity-robust standard errors of $\hat{\beta}_{ols}$.
(f) Perform Weighted Least Squares regression of med on a constant and total under the assumption that the error has variance $\sigma_i^2 = \sigma^2 \, total_i^2$.
(g) Compare the default o.l.s. standard errors with the heteroskedasticity-robust standard errors and the w.l.s. standard errors.
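The steps in parts (c) and (f) can be sketched on simulated data as follows. Everything about the data-generating process here is an assumption made for illustration and is unrelated to qreg0902. The test shown is the n·R² (Koenker) form of the Breusch-Pagan test, and the weighting assumes the error variance is proportional to total squared, as stated in part (f).

```python
import numpy as np

def fit(X, y):
    """Least-squares coefficients and residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

# Simulated data with error s.d. proportional to total (hypothetical values)
rng = np.random.default_rng(1)
n = 1000
total = rng.uniform(1.0, 10.0, size=n)
med = 0.5 + 0.1 * total + total * rng.normal(0.0, 0.2, size=n)
X = np.column_stack([np.ones(n), total])

beta_ols, resid = fit(X, med)

# Breusch-Pagan, n*R^2 form: regress squared residuals on the regressors.
u2 = resid ** 2
_, aux_resid = fit(X, u2)
r2 = 1.0 - aux_resid @ aux_resid / np.sum((u2 - u2.mean()) ** 2)
lm_stat = n * r2
# Under H0 the statistic is asymptotically chi-squared with 1 degree of
# freedom here (one non-constant regressor); the 5% critical value is 3.841.
reject_homoskedasticity = lm_stat > 3.841

# WLS under Var(u_i) = sigma^2 * total_i^2: divide every variable,
# including the constant, by total_i and run OLS on the transformed data.
w = 1.0 / total
beta_wls, _ = fit(X * w[:, None], med * w)

print(beta_ols, lm_stat, reject_homoskedasticity, beta_wls)
```

After the transformation the coefficient order is unchanged: `beta_wls[0]` is still the intercept of the original model and `beta_wls[1]` the slope on total.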
3. Kleck and Patterson (1993) studied the effect of gun control laws on city-level violent crime rates. They have data on gun control laws, unemployment rate, population, percent of the population that reports as black, number of people aged 18 to 21 years old, etc., and they start with the model
$\text{violent} = \beta_0 + \beta_1 \text{guncontrol} + \beta_2 \text{unemp} + \beta_3 \text{pop} + \beta_4 \text{percblack} + \beta_5 \text{age18to21} + \dots + u$
(a) Explain whether you think it is justified to assume that Cov(guncontrol, u) = 0, that is, that guncontrol is what the text would call an "exogenous" variable.
(b) Researchers have used variables such as $z_1$ = number of National Rifle Association members in the city, $z_2$ = number of subscribers to gun magazines, and $z_3$ = state hunting license rate as instrumental variables for guncontrol. Referring to the two necessary conditions, do you believe any of these to be valid instruments? Explain.
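For reference, the two conditions are commonly written as below for a single candidate instrument z. The first-stage notation ($\pi_0$, $\pi_z$, $v$) is an assumption made here and may differ from the notation in the text.

```latex
\begin{align*}
\text{(exogeneity)}\quad & \operatorname{Cov}(z, u) = 0, \\
\text{(relevance)}\quad  & \pi_z \neq 0 \ \text{in}\
  \text{guncontrol} = \pi_0 + \pi_z z + \pi_1 \text{unemp} + \dots + v .
\end{align*}
```

Only the relevance condition can be checked directly with data; the exogeneity condition has to be argued.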
4. Read in the MROZ data set (again available for download in Canvas). The MROZ.des file does not contain data; it describes the variables in the data set. Restrict the data set to working women only (lwage not missing).
(a) Perform the ordinary least squares regression of ln(wage) on a constant, experience, experience squared, and education.
(b) Interpret the o.l.s. coefficient estimate on education.
(c) For what theoretical reason(s) might education be correlated with the disturbance? How would that correlation affect the properties of $\hat{\beta}_{ols}$?
(d) Compute the heteroskedasticity-robust standard errors. Are they much different from the o.l.s. standard errors?
(e) Under which conditions would mother’s education, father’s education, and husband’s education be valid instruments for the woman’s education? Do you think those conditions hold?
(f) Re-estimate the $\beta$s using Two Stage Least Squares with those 3 instruments. How do $\hat{\beta}_{ols}$ and $\hat{\beta}_{2sls}$ compare?
(g) Test for the "endogeneity" of education.
(h) Use the first-stage regression results to test whether your instruments are partially correlated with the variable they are instrumenting for.
(i) Using the Sargan test, do you reject or fail to reject the null of valid instruments?
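The calculations in parts (f) through (i) can be sketched on simulated data as follows. The data-generating process, the coefficient values, and the use of a constant as the only exogenous regressor are all assumptions made to keep the example short; they are not taken from MROZ. The endogeneity test shown is the regression-based (control function) version, which is one of several equivalent-in-spirit approaches.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients, residuals, and classical standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    se = np.sqrt(np.diag(resid @ resid / (n - k) * XtX_inv))
    return beta, resid, se

# Simulated data: educ is endogenous because u and v are correlated.
rng = np.random.default_rng(2)
n = 2000
Z_excl = rng.normal(size=(n, 3))            # three excluded instruments
v = rng.normal(size=n)
u = 0.5 * v + rng.normal(size=n)
educ = Z_excl @ np.array([0.5, 0.5, 0.5]) + v
lwage = 1.0 + 0.1 * educ + u

const = np.ones(n)
X = np.column_stack([const, educ])          # structural regressors
Z = np.column_stack([const, Z_excl])        # all exogenous variables

beta_ols, _, _ = ols(X, lwage)

# First stage, then 2SLS: beta = (Xhat'X)^{-1} Xhat'y
pi_hat, v_hat, _ = ols(Z, educ)
X_hat = np.column_stack([const, Z @ pi_hat])
beta_2sls = np.linalg.solve(X_hat.T @ X, X_hat.T @ lwage)
u_hat = lwage - X @ beta_2sls               # residuals use actual educ

# (h) First-stage F test that the excluded instruments are jointly zero
ssr_u = v_hat @ v_hat
ssr_r = np.sum((educ - educ.mean()) ** 2)   # restricted model: constant only
q = Z_excl.shape[1]
f_stat = ((ssr_r - ssr_u) / q) / (ssr_u / (n - Z.shape[1]))

# (g) Endogeneity: add the first-stage residual and t-test its coefficient
b_cf, _, se_cf = ols(np.column_stack([X, v_hat]), lwage)
t_endog = b_cf[2] / se_cf[2]

# (i) Sargan: n*R^2 from regressing 2SLS residuals on all instruments.
# u_hat has mean zero because a constant is included, so the uncentered
# and centered R^2 coincide. Degrees of freedom: 3 instruments - 1 = 2,
# so the 5% chi-squared critical value is 5.991.
_, e_aux, _ = ols(Z, u_hat)
sargan = n * (1 - e_aux @ e_aux / (u_hat @ u_hat))

print(beta_ols, beta_2sls, f_stat, t_endog, sargan)
```

Standard errors for 2SLS are deliberately left out: those from a manual second-stage regression on fitted values are not valid, so in practice they should be computed from `u_hat` or taken from a dedicated IV routine.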