ECON 306 - Homework 3

Don’t lose points - read these instructions carefully!

· Carefully answer each of the questions below.

· You need to show your work, not just the final answer

· Please make sure your work is legible & each problem and step are properly numbered/lettered in order

· You can write or type your answers for the questions

· Please put your name and PSU ID number at the top of the 1st page

· You may use as many sheets of paper as you need.

· If you did parts by hand:

o Scan your entire document (including the parts you typed and the parts done by hand)

o Save the file on your computer (.PDF is the only acceptable format)

o Merge the separate PDFs into one file using a service such as PDFMerge!

§ Failure to submit a single file will result in points being deducted

o Upload it to the CANVAS dropbox for this homework assignment.

· If you typed everything:

o Save the file on your computer (.PDF is the only acceptable format) and then upload it to the CANVAS dropbox for this homework assignment.

o Be sure your assignment is all in one file. Failure to submit a single file will result in points being deducted.

· Pages must be scanned in proper order

The following two problems require many calculations in STATA (or however you opt to execute the calculations) and will generate many pages of output. Here is how you should organize it. The first pages should contain your answers to all the questions, along with any key algebraic equations or explanations you need along the way. After that, include a printout of the output from the regressions you ran in support of your answers. Highlight any numbers in this output that you used in the first section. (To save paper, you may print this section double-sided and/or in 2-up format.) Last, include a copy of the DO file that contains the commands you asked STATA to execute. Be sure to organize these in a way that will be clear to the reader.

1. (52 points total, 4 points each part) With this assignment you will find a STATA data file called boston.dta.  For reference, the variables in this file are:

nox = nitric oxides concentration (parts per 10 million)

rm = average number of rooms per dwelling

age = proportion of owner-occupied units built prior to 1940

dis = weighted distances to five Boston employment centers

ptratio = pupil-teacher ratio by town

lstat = percent lower status of population

medv = median value of owner-occupied homes (in thousands of dollars)

Open this dataset within STATA (only STATA can open it). Before you begin answering the following, it is a good idea to ask STATA to summarize the data using the command summarize. You should also start a log file to store your results.

a.) Run the following regression: 

b.) Hypothesize the sign of the bias, if any, resulting from excluding age from the regression.  Explain your reasoning.  (There is no wrong answer as long as you make a sensible story.)

c.) Use the data to verify (or not) your claim from b).  Break down the bias into its pieces.
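For part (c), one standard way to split the bias into pieces is the omitted-variable identity sketched below. The notation is an assumption made here for illustration and may differ from the lectures: a tilde marks estimates from the regression that excludes age, a hat marks estimates from the regression that includes it, and \tilde{\delta}_j is the coefficient on regressor x_j when age is regressed on the included regressors.

```latex
\tilde{\beta}_j = \hat{\beta}_j + \hat{\beta}_{age}\,\tilde{\delta}_j
\quad\Longrightarrow\quad
\tilde{\beta}_j - \hat{\beta}_j = \hat{\beta}_{age}\,\tilde{\delta}_j
```

The sign of the difference is then the product of two signs: the estimated effect of age on medv, and the partial association between age and x_j. The identity holds exactly when both regressions are estimated by OLS on the same sample.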

d.) Now, run the regression:

regress medv nox rm age dis ptratio lstat

e.) At a significance level of α=.05, for which coefficients βi, if any, would you reject the null hypothesis that βi=0?

f.) What is the predicted  medv with nox=0.5, rm=4, age=60, dis=3, ptratio=20, and lstat=10?

g.) Redo (f) but with nox=0.6.  What is the difference in predicted medv between these two communities?  Compare this with the coefficient of nox.

h.) Ceteris paribus, compared to (f), what is the impact of reducing the pupil-teacher ratio to 18?

i.) What percentage of the variation in medv is explained by the six X-variables?
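For parts (f) through (h) above, the prediction arithmetic can also be sketched outside STATA. The Python below is only an illustration: the coefficients are hypothetical placeholders, not estimates from boston.dta, and the helper name `predict_medv` is made up here. Substitute the values from your own regression output.

```python
# Hypothetical coefficients for illustration only, NOT estimates from boston.dta.
coefs = {"_cons": 30.0, "nox": -15.0, "rm": 4.0, "age": -0.02,
         "dis": -1.0, "ptratio": -1.0, "lstat": -0.5}

def predict_medv(coefs, x):
    """Fitted value: intercept plus the sum of coefficient * regressor value."""
    return coefs["_cons"] + sum(coefs[name] * value for name, value in x.items())

# Regressor values from part (f), and the same community with nox = 0.6 as in part (g).
community_f = {"nox": 0.5, "rm": 4, "age": 60, "dis": 3, "ptratio": 20, "lstat": 10}
community_g = dict(community_f, nox=0.6)

diff = predict_medv(coefs, community_g) - predict_medv(coefs, community_f)
print(diff)  # difference in predicted medv between the two communities
```

Because the model is linear, changing one regressor while holding the others fixed changes the prediction by that regressor's coefficient times the size of the change.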

Now change the measurement of nox. Use the ‘gen’ command:

gen noxppm=nox/10

and then use this in place of nox in the regression command

regress medv noxppm rm age dis ptratio lstat
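What dividing a regressor by 10 does to its estimate can be checked numerically on a toy example. The sketch below is in Python rather than STATA, uses a one-regressor model as a simplification of the six-regressor model in the assignment, and runs on made-up numbers (not boston.dta); the helper name `simple_ols` is hypothetical.

```python
import math

def simple_ols(x, y):
    """OLS of y on a constant and x; returns (slope, standard error, t-ratio)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(ssr / (n - 2) / sxx)  # sqrt(sigma^2 / Sxx)
    return slope, se, slope / se

# Made-up data for illustration only.
x = [4.0, 5.0, 5.5, 6.0, 7.0, 8.0]
y = [30.0, 27.0, 25.0, 26.0, 20.0, 18.0]

b, se, t = simple_ols(x, y)
b10, se10, t10 = simple_ols([xi / 10 for xi in x], y)  # regressor divided by 10

# Ratios of rescaled to original slope, standard error, and t-ratio.
print(b10 / b, se10 / se, t10 / t)
```

Running the same comparison on your own STATA output for nox and noxppm is a quick check on what you report in the parts that follow.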

j.) Compare the coefficient, standard error, and t-ratio for noxppm to that of nox.  Interpret the difference between this model and the previous.

k.) Also compare the  and remaining coefficients.  Interpret the difference between this model and the original regression model.

Now change the variable age to newage

gen newage=100-age

and then use this in place of age in the original regression command.  That is, execute:

regress medv nox rm newage dis ptratio lstat
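Substituting age = 100 − newage into the age term shows algebraically what this change of variable does. The symbols \beta_0 (intercept) and \beta_{age} are notation assumed for this sketch; the other regressors are unaffected by the substitution and are omitted.

```latex
\beta_0 + \beta_{age}\,age
= \beta_0 + \beta_{age}\,(100 - newage)
= (\beta_0 + 100\,\beta_{age}) - \beta_{age}\,newage
```

Comparing the two sets of STATA output against this identity is one way to organize the answers to the parts that follow.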

l.) Compare the coefficient, standard error, and t-ratio for newage to that of age.  Interpret the difference between this model and the previous.

m.) Also compare the  and remaining coefficients.  Interpret the difference between this model and the original regression model.

2. (48 points total.  5 points each part, +3 for free.)  For the following problem, use the STATA dataset called crime.dta.  This data set was compiled by Christopher Cornwell and William Trumbull to study factors that influence crime rates.  The data set contains observations for 90 counties in North Carolina for 1981.  The definitions of the variables represented in the data set are:

crmrte=crime rate

prbarr=probability of arrest

prbconv=probability of conviction

prbpris=probability of a prison sentence

avgsen=average sentence in days

polpc=number of police per capita

density=population density

pctmin80=percent minority in 1980

pctymle=percent young males

wmfg=average weekly wage in manufacturing

wcon=average weekly wage in construction

wtuc=average weekly wage in transportation, utilities, and communications

wtrd=average weekly wage in wholesale and retail trade

wfir=average weekly wage in finance, insurance, and real estate

wser=average weekly wage in services

wfed=average weekly wage in federal government

wsta=average weekly wage in state government

wloc=average weekly wage in local government

According to the economic model of crime rates, lower crime rates are associated with better labor markets (higher wages), more police presence and tougher sentences, and lower population density.  We will use this data set to examine these hypotheses. Use a significance level of α=.05 for all hypothesis tests.

a.) Run a regression of crmrte on all of the other variables. Call this Model 1.

b.) Do any t-statistics indicate a variable is not statistically significant?  Which?

c.) Interpret the F-statistic STATA has calculated for Model 1.

d.) Test the hypothesis that the coefficients on wfed and wsta are equal to each other.  Use the t-test method described in the lectures.  What transformation do you need to do here?  Be specific.
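One common way to set up such a test, which may or may not match the method from the lectures, is to reparameterize the model so that the hypothesis becomes a restriction on a single coefficient. The symbol \theta and the summed regressor (wfed + wsta) are notation assumed here for illustration.

```latex
\theta \equiv \beta_{wfed} - \beta_{wsta}
\quad\Longrightarrow\quad
\beta_{wfed}\,wfed + \beta_{wsta}\,wsta
= \theta\,wfed + \beta_{wsta}\,(wfed + wsta)
```

Under this setup, the hypothesis that the two coefficients are equal is the hypothesis that \theta = 0, so the usual t-statistic on wfed in the transformed regression tests it.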

e.) Test the hypothesis that the coefficients on wfed, wsta and wloc are all equal to each other.  Do this by writing down the formula for the relevant F-statistic.  Calculate it (by running the appropriate restricted regression) and test the hypothesis. Report these results.  This restricted version of the regression will be called Model 2.

f.) Return to Model 1.  Now test the hypothesis that the coefficients on pctmin80 and pctymle both equal zero.  Do this by writing down the formula for the relevant F-statistic.  Calculate it (by running the appropriate restricted regression) and test the hypothesis.  Report these results. This restricted version of the regression will be called Model 3.
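For parts (e) and (f), the F-statistic comparing a restricted and an unrestricted regression is usually written as below. The notation is assumed here and may differ from the lectures: SSR_r and SSR_{ur} are the sums of squared residuals of the restricted and unrestricted models, q is the number of restrictions, n is the number of observations, and k is the number of regressors in the unrestricted model.

```latex
F = \frac{(SSR_r - SSR_{ur})/q}{SSR_{ur}/(n - k - 1)}
\;\sim\; F_{q,\;n-k-1} \quad \text{under } H_0 \text{ and the classical assumptions}
```

Requiring three coefficients to be equal to each other imposes q = 2 restrictions, and requiring two coefficients to equal zero also imposes q = 2.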

The model could potentially be simplified by replacing all the wage variables with an average.  Specifically, let us define

avgwage = (wmfg + wcon + wtuc + wtrd + wfir + wser + wfed + wsta + wloc)/9

Generate this variable.

g.) Return to Model 1 and run using avgwage in place of the individual wage variables.  Check the validity of this restriction.  As before, do this by writing down the formula for the relevant F-statistic.  Calculate it (by running the appropriate restricted regression) and test the hypothesis.  Report these results. This restricted version of the regression will be called Model 4.

h.) Let’s focus our attention on the coefficient for the variable polpc.  How does the value of this coefficient, as well as its statistical significance, change as we move from model to model?  To answer this, write down a table containing the results for this coefficient for each of the four models.  In this table, include the coefficient values, the values of the t-statistic (for the hypothesis that the coefficient = 0), and whether you’d reject the hypothesis.

i.) What do your results in the last question imply about the relationship between the number of police and the crime rate?  Are you confident in these results based on the work you have done?  Why or why not?
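The F-tests in parts (e) through (g) above all use the same arithmetic once the two sums of squared residuals are in hand. The Python sketch below is a hypothetical helper for checking that arithmetic outside STATA: the name `f_statistic` and the example numbers are made up and are not results from crime.dta.

```python
def f_statistic(ssr_r, ssr_ur, q, df_ur):
    """Return F = ((SSR_r - SSR_ur) / q) / (SSR_ur / df_ur).

    ssr_r  : sum of squared residuals of the restricted model
    ssr_ur : sum of squared residuals of the unrestricted model
    q      : number of restrictions being tested
    df_ur  : residual degrees of freedom of the unrestricted model (n - k - 1)
    """
    if q <= 0 or df_ur <= 0:
        raise ValueError("q and df_ur must be positive")
    if ssr_r < ssr_ur:
        raise ValueError("restricted SSR cannot be below unrestricted SSR")
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / df_ur)

# Made-up numbers for illustration only.
example_f = f_statistic(ssr_r=120.0, ssr_ur=100.0, q=2, df_ur=50)
print(example_f)
```

The resulting statistic is compared with the critical value of the F distribution with (q, df_ur) degrees of freedom at α = .05.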




