ASSIGNMENT 2: DIFFERENCE-IN-DIFFERENCES WITH HETEROGENEOUS TREATMENT EFFECTS
For any question where Stata has been used, please include all the generated figures and tables, and attach the corresponding Stata log file to the end of your assignment.
1 “Ban the Box” Policies and Statistical Discrimination
Many jurisdictions across the U.S. have adopted “ban the box” (BTB) policies that prevent employers from asking about job applicants’ criminal records in the early hiring process. Doleac and Hansen (2020) study the effect of BTB policies on employment for young low-skilled men. Read the paper carefully and answer the following questions.
1. Briefly describe the concept of statistical discrimination, using at most 5 sentences.
2. Discuss how BTB policies could exacerbate statistical discrimination against ex-offenders. Do the results of the paper support this hypothesis and why?
3. What are the key assumptions of using the difference-in-differences method (more specifically, the two-way fixed effects estimator) to estimate the causal effect of BTB policies on employment?
4. What is the main concern about the validity of the identifying assumption? How do Doleac and Hansen (2020) attempt to deal with the concern?
5. Suppose you now have access to the dataset used by Doleac and Hansen (2020). How would you improve upon the paper in terms of methodology? In fact, the Current Population Survey (CPS) data used in this paper are publicly available. If you are interested, you can explore and download the data from the IPUMS CPS website (https://cps.ipums.org/cps/).
2 The Effect of Unilateral Divorce on Family Violence
Stevenson and Wolfers (2006) study the effect of unilateral divorce laws on suicide and spousal homicide. In this question, you are asked to carefully read the paper and use the dataset suicide.dta to reproduce some of the results included in the paper and to conduct additional analysis, using alternative estimators.
Please note that this dataset has less information than the original data used by Stevenson and Wolfers (2006). Thus, your results will not be identical to those presented in the paper. However, the lack of additional control variables should not be an issue since Stevenson and Wolfers (2006) show that the inclusion of control variables has little effect on the parameter of interest. Specifically, the dataset (suicide.dta) provides information on state code, year of observation, year of unilateral divorce adoption, suicide mortality rate (suicides per million people, not by gender), per-capita income (as a business cycle indicator), and AFDC cases (as a measure of welfare generosity).
1. Write a concise summary of Stevenson and Wolfers (2006), covering the research question, main contribution(s), data, empirical method, and the key findings. The summary should not be longer than 1 page using a font size of 12 and single spacing.
2. Since unilateral divorce laws were introduced across states in different years, we may not want to use a standard 2 × 2 difference-in-differences (DID) in order to incorporate more groups and years of observation. In Lecture 9, you learnt that in such a setting, the two-way fixed effects (TWFE) regression model is often used. Write down the TWFE regression to estimate the effect of the introduction of unilateral divorce laws on suicide rates. Make sure to clearly specify the meaning of each variable used in the regression equation.
3. Report your TWFE estimates in a well-formatted table. Are your results sensitive to the inclusion of state-level control variables? What is the estimated effect in terms of the percentage change in the suicide mortality rate, compared with the mean before the introduction of unilateral divorce?
Note: (i) You can use the Stata package reghdfe to estimate your regression. The package supports multiple levels of fixed effects and is faster than areg or xtreg. To install the package, open Stata and run ssc install reghdfe. (ii) Please ensure that standard errors are clustered at the state level. This can be done by adding cluster(stfips) as an option when using the reghdfe command. (iii) You can refer to the empirical papers you have read to learn how to format a result table. Ensure that your table includes the key information, such as the dependent variable and independent variables, estimated coefficients and their standard errors, R-squared, the number of observations, etc.
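The commands described in the note above could be put together roughly as follows. This is a minimal sketch, not a model answer: apart from stfips, which the note names, every variable name (year, adopt_year, suicide_rate, pc_income, afdc_cases, unilateral) is a hypothetical placeholder, and the coding of never-adopting states is an assumption. Check the actual names and coding with describe before running anything.

```stata
* Sketch only. Apart from stfips, all variable names below are hypothetical;
* replace them with the names actually stored in suicide.dta.
* reghdfe depends on the ftools package, so install both.
ssc install ftools, replace
ssc install reghdfe, replace

use suicide.dta, clear

* Treatment indicator: 1 from the adoption year onward, 0 otherwise.
* Assumption: states that never adopted have a missing adoption year.
generate byte unilateral = (year >= adopt_year) & !missing(adopt_year)

* TWFE with state and year fixed effects, standard errors clustered by state.
reghdfe suicide_rate unilateral, absorb(stfips year) cluster(stfips)

* Same specification with the state-level controls added.
reghdfe suicide_rate unilateral pc_income afdc_cases, absorb(stfips year) cluster(stfips)
```

Comparing the coefficient on the treatment indicator across the two regressions is one way to judge sensitivity to the controls.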
4. Estimate the effect of unilateral divorce on suicide rates over time since the introduction of unilateral divorce using a TWFE regression. Report your results in a well-formatted table. This is similar to the analysis conducted in Table I of Stevenson and Wolfers (2006). (You do not have to translate your results into percent changes.)
5. In Lecture 9, you learnt that an event study analysis can be used to test for causality. Write down the event study regression to estimate treatment effects in each pre- and post-treatment period, using the year before the introduction of unilateral divorce as the omitted group. Make sure to clearly specify the meaning of each variable used in the regression equation.
6. Present your event study results in a well-labelled figure, where the x-axis represents years relative to the introduction of unilateral divorce and the y-axis represents the TWFE estimates of the effect of unilateral divorce on suicide rates together with their 95% confidence intervals. You can refer to Figure I of Stevenson and Wolfers (2006). (Stata command serrbar can help you graph point estimates and confidence intervals.)
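As a hint on the plotting step, serrbar expects three variables in this order: the point estimate, its standard error, and the x-axis variable. The sketch below assumes the event-study coefficients have already been collected into a dataset with one row per relative year; the variable names coef, se, and rel_year are hypothetical.

```stata
* Sketch only: coef, se and rel_year are hypothetical variable names holding
* the point estimates, their standard errors, and years relative to adoption.
* scale(1.96) draws bars of +/- 1.96 standard errors (a 95% interval).
serrbar coef se rel_year, scale(1.96) yline(0) ///
    xtitle("Years relative to introduction of unilateral divorce") ///
    ytitle("Estimated effect on suicide rate")
```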
7. In Lecture 14, you learnt that while the standard 2 × 2 DID approach estimates the average treatment effect on the treated (ATT) under the common trends assumption, the TWFE estimator often provides a biased estimate of the ATT in the presence of heterogeneous treatment effects. Discuss the intuition for why the TWFE estimates are often biased.
8. De Chaisemartin and d’Haultfoeuille (2020) show that TWFE estimators estimate weighted sums of the average treatment effects (ATE) in each group and period, with weights that may be negative. The negative weights are an issue in the presence of heterogeneous treatment effects. Are there many negative weights in your TWFE estimator in part 2? You can obtain these weights using the Stata package twowayfeweights, which can be installed by typing ssc install twowayfeweights.
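A call to twowayfeweights might look like the sketch below. The command takes the outcome, group, time, and treatment variables in that order. Here suicide_rate, year, and unilateral (a 0/1 treatment indicator) are hypothetical names; only stfips is named in this assignment.

```stata
* Sketch only: suicide_rate, year and unilateral are hypothetical names.
ssc install twowayfeweights, replace

* Argument order: outcome, group, time, treatment.
* type(feTR) requests the weights attached to the fixed-effects regression
* under the common trends assumption.
twowayfeweights suicide_rate stfips year unilateral, type(feTR)
```

The output reports how many weights are positive and negative and what they sum to, which is the information part 8 asks about.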
9. De Chaisemartin and d’Haultfoeuille (2020) propose a new estimator, DID_M. In our context, the estimator provides an unbiased estimate of the average of the treatment effect at the time when a state introduced unilateral divorce, across all states that introduced unilateral divorce at some point in time during our period of analysis. What is the DID_M estimate of the effect of unilateral divorce on suicide rates? Is the estimate different from your TWFE estimate obtained in part 3, and why? You can compute the DID_M estimator using the Stata package did_multiplegt, which can be installed by typing ssc install did_multiplegt.
10. These authors also propose another estimator, DID_l, which estimates the effect of having been treated (rather than untreated) for l periods. Compute the dynamic effects of unilateral divorce on suicide rates over time using the DID_l estimator and compare your results with the finding in part 4. You can compute the DID_l estimator using did_multiplegt.
11. Both the DID_M and DID_l estimators rely on the assumption of common trends between switchers and not-yet switchers. Evaluate the validity of the assumption.
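For parts 9 to 11, the calls to did_multiplegt could be sketched as follows. The option names follow the package's earlier SSC release as best recalled; newer releases may use a different syntax, so confirm with help did_multiplegt before relying on this. The variable names suicide_rate, year, and unilateral are hypothetical, and the numbers of bootstrap replications, dynamic effects, and placebos are arbitrary illustrative choices.

```stata
* Sketch only: suicide_rate, year and unilateral are hypothetical names, and
* the option syntax should be checked against "help did_multiplegt".
ssc install did_multiplegt, replace

* Part 9: instantaneous effect at the time of switching, with bootstrap
* standard errors clustered by state.
did_multiplegt suicide_rate stfips year unilateral, breps(100) cluster(stfips)

* Parts 10 and 11: dynamic effects for 5 periods after switching, plus
* 3 placebo estimates that compare pre-switch trends.
did_multiplegt suicide_rate stfips year unilateral, robust_dynamic ///
    dynamic(5) placebo(3) breps(100) cluster(stfips)
```

Placebo estimates close to zero are consistent with common trends between switchers and not-yet switchers, though they do not prove that the assumption holds.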