5SSMN932 Introduction to Econometrics

Examination 2023/24

Module Code and Title: 5SSMN932 Introduction to Econometrics

Examination Period: Exam Period 1, January 2024

SECTION A

Answer ANY FOUR questions from this section. Section A carries 20 marks.

1. (5 marks) Consider the following model of household consumption, $C_i$:

$$C_i = \beta_0 + \beta_1 Y_i + \beta_2 Y_i^2 + u_i,$$

where $Y_i$ denotes the annual income of household $i$, and $u_i$ is the error term. Assume we have a random sample of 2,024 households.

What does the error term $u_i$ represent? (You can use examples to illustrate your explanation.) Why do we include the quadratic term $Y_i^2$ in this model?

2. (5 marks) Explain the difference(s) between random sampling and random treatment. In your answer, discuss the implications of these two settings for an OLS estimator.

3. (5 marks) Consider the following estimation of household consumption, $C$, measured in 10,000 US dollars, using data from 2,034 randomly selected households:

$$\hat{C} = 1.5 + 0.56\, Y - 0.12\, Y^2 + 0.013\, Y^3,$$

where $Y$ denotes household income, measured in 10,000 US dollars.

What is the marginal propensity to consume of a household that earns $100,000 per year? Is it large or small?
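As a rough numerical sketch of the calculation involved (not a model answer), the marginal propensity to consume is the derivative of the fitted cubic with respect to $Y$, evaluated at $Y = 10$ because income is measured in units of $10,000. The function name `marginal_propensity` is an illustrative choice and is not part of the exam.

```python
def marginal_propensity(y):
    """Derivative of 1.5 + 0.56*Y - 0.12*Y^2 + 0.013*Y^3 with respect to Y.

    Y is income in units of $10,000, as in the fitted equation.
    """
    return 0.56 - 2 * 0.12 * y + 3 * 0.013 * y ** 2


# $100,000 per year corresponds to Y = 10 in the units of the regression.
mpc_at_100k = marginal_propensity(10.0)
print(round(mpc_at_100k, 2))
```

Because consumption and income share the same units, the result reads directly as dollars of extra consumption per extra dollar of income, which is the natural benchmark for judging whether it is large or small.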

4. (5 marks) Explain what heteroskedasticity means and why we should be worried about heteroskedasticity in practice. You may want to refer to the consumption-income regression in either Question 1 or Question 3.

5. (5 marks) Discuss two popular statistical criteria that you would use to choose the preferred specification (model) from a selection of models.

SECTION B

Answer ANY TWO questions from this section. Section B carries 60 marks.

1. Consider a dataset containing the following variables: government expenditure on healthcare ($HealthE_i$), gross domestic product ($GDP_i$), population ($Pop_i$), and capital stock ($Capital_i$) for 153 countries in 2019. Economic theory suggests that government per capita expenditure on healthcare has a positive effect on per capita GDP.

You estimate the following regression:

$$\widehat{perGDP}_i = \underset{(0.022)}{0.445} + \underset{(0.252)}{0.443}\, perHealthE_i + \underset{(0.121)}{0.914}\, perCapital_i,$$

where $perGDP_i = GDP_i / Pop_i$, $perHealthE_i = HealthE_i / Pop_i$, and $perCapital_i = Capital_i / Pop_i$, and $i$ denotes a particular country.

(a) (10 marks) Discuss whether we can interpret the OLS estimate for $\beta_2$ causally. In your answer, clearly define what unbiasedness is and the direction of bias.

(b) (10 marks) The dataset includes a continuous variable capturing the percentage of the cabinet members in country $i$ holding a Ph.D. degree, $PhD_i$. A researcher wants to use this variable as an instrumental variable for $HealthE_i$. Explain the intuition behind this approach and the relevant conditions. Do you think this suggestion is sound? Explain.

(c) (10 marks) The dataset also contains a dummy variable $Democratic_i$ which equals 1 if country $i$ is democratic, and 0 otherwise. Assuming homoskedasticity, discuss how you could test whether the effect of government expenditure on healthcare per capita is statistically higher for democratic countries. Clearly indicate the regression you would run and the test you would use.

2. A Logit model is used to explain the outcomes of mortgage applications among a random sample of 2,000 young families in Britain, 15% of whom belong to the immigration group (the lead applicant is on a visa to remain in the UK). There are 600 successful mortgage applications and 1,400 denied applications, with an average salary of £33,000 and an average current debt of £5,100. The regression estimates are as follows:

$$\widehat{denial}_i = L\big(-0(0) - .0(1)\, salary_i + 4(2)\, immigration_i + 0(0)\, debt_i\big) \quad \text{(Logit)},$$

where L(z) = exp(z)/(1+exp(z)) is the logistic function. The usual standard errors for the Logit model are in parentheses.

The variable $denial_i$ equals 1 if the loan application was denied, and 0 otherwise. The variables $salary_i$ and $debt_i$ are the total annual salary and the current debt of family $i$, measured in units of £10,000. The variable $immigration_i$ equals 1 if the applicant belongs to the immigration group, and 0 otherwise.

(a) (10 marks) Describe the estimator underlying the logit model and discuss its properties. [Note: Clearly indicate the likelihood function used, but a detailed derivation of the estimator is not expected].

(b) (10 marks) Does immigration statistically significantly impact the success of the mortgage application? Do you think the sign of the coefficient is as expected?

(c) (10 marks) For an applicant with an average income and an average level of debt, what is the effect of being in the immigration group on the denial probability? What is the predicted probability of denial for an immigrant with £60,000 of income and £0 of debt?
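The mechanics behind part (c) can be sketched as follows, assuming the standard approach of comparing the logistic function $L(z)$ at two values of the index (dummy equal to 1 versus 0). The helper names `logistic` and `dummy_effect` and the numbers in the example are hypothetical and are not the estimates from this question.

```python
import math


def logistic(z):
    """L(z) = exp(z) / (1 + exp(z)), computed in a numerically stable way."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def dummy_effect(index_without_dummy, dummy_coef):
    """Change in predicted probability when a 0/1 regressor switches from 0 to 1.

    index_without_dummy is the linear index (intercept plus the other
    regressors times their coefficients) with the dummy set to 0.
    """
    return logistic(index_without_dummy + dummy_coef) - logistic(index_without_dummy)


# Hypothetical numbers for illustration only; they are NOT the exam's estimates.
example_index = -0.5
example_coef = 1.0
example_effect = dummy_effect(example_index, example_coef)
print(example_effect)
```

Because $L(z)$ is nonlinear, the effect of the dummy depends on where the index is evaluated, which is why the question fixes income and debt at their averages. Salary and debt must be entered in units of £10,000 to match the variable definitions.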

3. A team examines the determinants of restaurant reviews on Yelp, a popular website for table reservations and restaurant reviews in the US. They have annual data on 4,562 restaurants in 40 US states, from 2010 to 2020, with information on the percentage of positive reviews for restaurant $i$ in year $t$, $Review_{it}$; the total running expenses and overhead cost, $expenses_{it}$, in millions of dollars; and whether the head chef has previously worked in a Michelin-starred restaurant (an award given to a select few establishments with exceptional service), $Michelin_{it}$. They obtain the following estimation, using the logarithm of $expenses_{it}$.

$$\widehat{Review}_{it} = 15.23 + 1.72 \log(expenses_{it}) + 25.01\, Michelin_{it} + StateFE_j + TimeFE_t,$$

$$n = 4562, \quad R^2 = 0.628,$$

where the estimates for the state fixed effects ($StateFE_j$) and year fixed effects ($TimeFE_t$) are not reported here to save space. Clustered standard errors at the city level are reported in parentheses.

(a) (10 marks) Explain why it is necessary to include the state and year fixed effects. In your answer, clearly give one example of the factors that the time and state fixed effects are capturing and discuss the difference between the two fixed effects.

(b) (10 marks) Clearly interpret the coefficient on $\log(expenses_{it})$. Explain whether or not the researcher should further include restaurant fixed effects in the regression. Why did the team use clustered standard errors?

(c) (10 marks) Describe a statistical test to examine whether a Michelin-starred head chef is more able to utilise a restaurant’s expenses to improve its reviews. Describe any necessary new regressions or information.

SECTION C

Answer BOTH (two) questions from this section. Section C carries 20 marks.

1. (20 marks) Consider the following OLS estimation of a model to explain the labour demand of 500 large UK companies after the Covid pandemic, using financial data from 2023 (all variables are calculated at the end of the year):

$$\widehat{hires}_t = \underset{(1.02)}{0.96} + \underset{(0.27)}{0.74} \log(profit_t) - \underset{(0.32)}{0.79} \log(research_t),$$

$$n = 500, \quad R^2 = 0.39,$$

where $hires_t$ is the number of newly recruited employees in thousands (treated as a continuous variable), $profit_t$ is the profit before tax in millions of GBP, and $research_t$ is the expenditure on research and development in millions of GBP. The robust standard errors are reported in parentheses.

Describe a test to examine whether the effect of having more profit on labour demand is offset by the effect of spending more on research and development. Make clear any additional regression or information required.
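One common way to frame such a test, sketched here under the assumption that "offset" means the two slope coefficients sum to zero ($H_0: \beta_1 + \beta_2 = 0$), is a t-test on the linear combination. The covariance between the two estimates is not reported above, so `assumed_cov` below is a purely hypothetical placeholder, and the function name `t_stat_sum` is illustrative.

```python
import math


def t_stat_sum(b1, b2, se1, se2, cov12):
    """t-statistic for H0: beta1 + beta2 = 0.

    Var(b1 + b2) = Var(b1) + Var(b2) + 2 * Cov(b1, b2).
    """
    var_sum = se1 ** 2 + se2 ** 2 + 2.0 * cov12
    return (b1 + b2) / math.sqrt(var_sum)


# Estimates and robust standard errors are taken from the regression above.
# The covariance is NOT reported in the question; 0.0 is a hypothetical
# placeholder used only so that the sketch runs.
assumed_cov = 0.0
t_stat = t_stat_sum(0.74, -0.79, 0.27, 0.32, assumed_cov)

# Two-sided test at the 5% level with the large-sample normal critical value.
reject_at_5pct = abs(t_stat) > 1.96
print(t_stat, reject_at_5pct)
```

An alternative that avoids needing the covariance is to re-estimate the model with $\log(profit_t)$ and $\log(research_t) - \log(profit_t)$ as regressors. The coefficient on $\log(profit_t)$ then equals $\beta_1 + \beta_2$, and its robust standard error can be read directly from the output.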

2. A researcher estimated the following regression:

$$\widehat{hourlywage}_i = \underset{(0.12)}{0.49} - \underset{(2.57)}{2.14}\, minority_i + \underset{(1.32)}{1.49} \log(experience_i),$$

$$n = 1500, \quad R^2 = 0.49,$$

where $hourlywage$ is the hourly wage, measured in GBP per hour, $minority$ is a dummy variable indicating whether the worker is from a minority background, and $\log(experience)$ is the logarithm of the number of years of experience. The robust standard errors are reported in parentheses. The regression uses a representative sample from the 2000 UK census.

(a) (10 marks) Can we interpret the coefficient estimate on $minority$ as causal evidence of racial discrimination in the labour market? In your answer, clearly interpret the estimate and comment on its sign.

(b) (10 marks) A colleague suggests including a variable $OutofCollege$, which captures the number of years since the worker obtained their highest degree. Is this suggestion sound? Explain.
