Problem Set 1
R Basics and Causality
(Due date: 25 Feb 2025, 8:30 PM)
Instructions:
· Submit one of the following sets of documents via Canvas:
1. R script and/or Word document
- You can use comments in the R script to answer the questions.
- If you want to use a .docx document, make sure you also attach the R script you used to conduct the analysis.
2. R Markdown and PDF document (we will cover R Markdown during the week of the 19th)
1. Bias in Self-reported Turnout
Surveys are frequently used to measure political behavior such as voter turnout, but some researchers are concerned about the accuracy of self-reports. In particular, they worry about possible social desirability bias: in post-election surveys, respondents who did not vote may lie about not having voted because they feel that they should have voted. Is such a bias present in the American National Election Studies (ANES)? The ANES is a nation-wide survey that has been conducted for every election since 1948. The ANES conducts face-to-face interviews with a nationally representative sample of adults. The table below displays the names and descriptions of variables in the turnout.csv data file.
Variable | Description
year | Election year
ANES | ANES estimated turnout (percentage)
VEP | Voting Eligible Population (in thousands)
VAP | Voting Age Population (in thousands)
total | Total ballots cast for highest office (in thousands)
felons | Total ineligible felons (in thousands)
noncitizens | Total non-citizens (in thousands)
overseas | Total eligible overseas voters (in thousands)
osvoters | Total ballots counted by overseas voters (in thousands)
Question 1
Load the data into R and check the dimensions of the data. Also, obtain a summary of the data. How many observations are there? What is the range of years covered in this data set?
Question 2
Calculate the turnout rate based on the voting age population, or VAP. Note that for this data set, we must add the total number of eligible overseas voters, since the VAP variable does not include these individuals in the count. Next, calculate the turnout rate using the voting eligible population, or VEP. Finally, the data also include ANES estimates of the turnout rate (note that these may be on a different scale from the VAP and VEP turnout rates you just calculated). Compare the three turnout rates. What differences do you observe?
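As a minimal sketch of the calculation described in Question 2, the snippet below builds a tiny data frame with made-up numbers (not values from turnout.csv) that reuses the column names from the variable table. With the real data, one would presumably start from read.csv("turnout.csv") instead; the file location and the new column names VAPtr and VEPtr are assumptions of this sketch.

```r
# Toy data with made-up values; column names follow the variable table.
# With the real file one might use: turnout <- read.csv("turnout.csv")
turnout <- data.frame(
  year     = c(2000, 2004),
  ANES     = c(70, 75),          # already a percentage
  VEP      = c(180000, 200000),  # in thousands
  VAP      = c(199000, 219000),  # in thousands
  total    = c(100000, 120000),  # in thousands
  overseas = c(1000, 1000)       # in thousands
)

dim(turnout)      # number of rows (observations) and columns (variables)
summary(turnout)  # per-column summaries, including the range of year

# VAP-based turnout: add overseas voters to the denominator, convert to percent
turnout$VAPtr <- turnout$total / (turnout$VAP + turnout$overseas) * 100

# VEP-based turnout, on the same percentage scale as ANES
turnout$VEPtr <- turnout$total / turnout$VEP * 100

turnout[, c("year", "VAPtr", "VEPtr", "ANES")]
```

Once all three rates are on the same percentage scale, they can be compared directly, for example by applying mean() and range() to their differences.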
Question 3
Compute the difference between the VAP and ANES estimates of turnout rate. How big is the difference on average? What is the range of the difference? Conduct the same comparison for the VEP and ANES estimates of voter turnout. Briefly comment on the results.
Question 4
Compare the VEP turnout rate with the ANES turnout rate separately for presidential elections and midterm elections. Note that the data set excludes the year 2006. Does the bias of the ANES vary across election types?
Question 5
Divide the data in half by election year, so that you subset the data into two periods. Calculate the difference between the VEP turnout rate and the ANES turnout rate separately for each period. Has the bias of the ANES increased over time?
Question 6*
(Bonus question: This problem is optional. Any points earned on this problem can be applied to lost points on other parts of the problem set. You cannot earn more than the maximum score on the problem set.)
The ANES does not interview overseas voters and prisoners. Calculate an adjustment to the 2008 VAP turnout rate. Begin by subtracting the total number of ineligible felons and non-citizens from the VAP to calculate an adjusted VAP. Next, calculate an adjusted VAP turnout rate, taking care to subtract the number of overseas ballots counted from the total ballots in 2008. Compare the adjusted VAP turnout with the unadjusted VAP, VEP, and the ANES turnout rate. Briefly discuss the results.
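The adjustment described above can be sketched as follows, again with made-up numbers rather than the actual 2008 row. The object names row2008, adjVAP, and adjVAPtr are hypothetical, and the sketch follows the steps in the question literally: it is one reading of the adjustment, not a definitive solution.

```r
# Toy single-row example with made-up values (in thousands), not the real 2008 data
row2008 <- data.frame(
  VAP         = 230000,
  total       = 131000,
  felons      = 3000,
  noncitizens = 19000,
  osvoters    = 1000
)

# Adjusted VAP: remove ineligible felons and non-citizens from the denominator
adjVAP <- row2008$VAP - row2008$felons - row2008$noncitizens

# Adjusted turnout: remove overseas ballots from the numerator, express as percent
adjVAPtr <- (row2008$total - row2008$osvoters) / adjVAP * 100
adjVAPtr
```

With the real data, the 2008 row could be selected with something like turnout[turnout$year == 2008, ] before applying the same two steps.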
2. Effect of Demographic Change on Exclusionary Attitudes
A researcher conducted a randomized field experiment assessing the extent to which the views of individuals living in suburban communities around Boston, Massachusetts, were affected by exposure to demographic change.
This exercise is based on: Enos, R. D. 2014. “Causal Effect of Intergroup Contact on Exclusionary Attitudes.” Proceedings of the National Academy of Sciences 111(10): 3699-3704.
Subjects in the experiment were individuals riding on the commuter rail line and were overwhelmingly white. Every morning, multiple trains pass through various stations in the suburban communities that were used for this study. For pairs of trains leaving the same station at roughly the same time, one was randomly assigned to receive the treatment and the other was designated as a control. As a result, all the benefits of randomization apply to this data set.
The treatment in this experiment was the presence of two native Spanish-speaking ‘confederates’ (a term used in experiments to indicate that these individuals worked for the researcher, unbeknownst to the subjects) on the platform each morning prior to the train's arrival. The presence of these confederates, who would appear as Hispanic foreigners to the subjects, was intended to simulate the kind of demographic change anticipated for the United States in coming years. For those individuals in the control group, no such confederates were present on the platform. The treatment was administered for 10 days. Participants were asked questions related to immigration policy both before the experiment started and after the experiment had ended. The names and descriptions of variables in the data set boston.csv are:
Variable | Description
age | Age of individual at time of experiment
male | Sex of individual, male (1) or female (0)
income | Income group in dollars (not exact income)
Question 1
The benefit of randomly assigning individuals to the treatment or control groups is that the two groups should be similar, on average, in terms of their covariates. This is referred to as ‘covariate balance.’ Show that the treatment and control groups are balanced with respect to the income variable (income) by comparing its distribution between those in the treatment group and those in the control group. Also, compare the proportion of males (male) in the treatment and control groups. Interpret these two numbers.
Question 2
Individuals in the experiment were asked a series of questions both at the beginning and the end of the experiment. One such question was “Do you think the number of immigrants from Mexico who are permitted to come to the United States to live should be increased, left the same, or decreased?” The response to this question prior to the experiment is in the variable numberim.pre. The response to this question after the experiment is in the variable numberim.post. In both cases the variable is coded on a 1-5 scale. Responses with values of 1 are inclusionary (‘pro-immigration’) and responses with values of 5 are exclusionary (‘anti-immigration’). Compute the average treatment effect on the change in attitudes about immigration. That is, how does the mean change in attitudes about immigration policy for those in the control group compare to that for those in the treatment group? Interpret the result.
Question 3
Does having attended college influence the effect of being exposed to ‘outsiders’ on exclusionary attitudes? Another way to ask the same question is this: is there evidence of a differential impact of treatment, conditional on attending college versus not attending college? Calculate the necessary quantities to answer this question and interpret the results. Consider the average treatment effect for those who attended college and then those who did not.
Question 4*
(Bonus question: This problem is optional. Any points earned on this problem can be applied to lost points on other parts of the problem set. You cannot earn more than the maximum score on the problem set.)
Repeat the same analysis as in the previous question, but this time with respect to age and ideology. For age, divide the data based on its quartiles and compute the average treatment effect within each of the resulting four groups. For ideology, compute the average treatment effect within each value. What patterns do you observe? Give a brief substantive interpretation of the results.
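The subgroup comparison described above can be sketched on toy data as follows. All values are made up, the helper ate() is hypothetical, and treatment (1 = treated, 0 = control) is an assumed column name, since the variable table above does not list the treatment indicator. The sketch illustrates a difference in means within quartile groups and is not meant as the required solution.

```r
# Toy data with made-up values; `treatment` is an assumed column name.
boston <- data.frame(
  treatment     = c(1, 0, 1, 0, 1, 0, 1, 0),
  age           = c(22, 25, 34, 36, 45, 47, 60, 63),
  numberim.pre  = c(3, 3, 2, 4, 3, 3, 4, 2),
  numberim.post = c(4, 3, 3, 4, 3, 4, 5, 2)
)

# Outcome: change in attitude (positive values = more exclusionary)
boston$change <- boston$numberim.post - boston$numberim.pre

# Hypothetical helper: difference in mean change, treated minus control
ate <- function(d) {
  mean(d$change[d$treatment == 1], na.rm = TRUE) -
    mean(d$change[d$treatment == 0], na.rm = TRUE)
}

ate(boston)  # overall difference in means

# Split age into four groups at its quartiles, then compute the ATE per group
boston$age.group <- cut(boston$age,
                        breaks = quantile(boston$age, na.rm = TRUE),
                        include.lowest = TRUE)
ate.by.age <- sapply(split(boston, boston$age.group), ate)
ate.by.age
```

The same split-and-apply pattern would carry over to a discrete variable such as ideology by splitting on that column directly, assuming it is stored under that name in boston.csv.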