ECMT1020 Introduction to Econometrics
Semester 1, 2024
Group Assignment
Due: 11.59PM Friday 17 May 2024
Instructions
1. This group assignment accounts for 15% of your final grade. You can sign up for a group yourself to complete this assignment. The maximum group size is 2. Marking is based on the group's final submission, and all group members will receive the same mark for this assignment.
2. There are 12 questions in this assignment and the full mark of the assignment is 55. The breakdown marks are indicated in the questions. Please attempt all questions.
3. This group assignment entails the use of econometric models and statistical tools in an economic application. You will use statistical software to analyze a cross-sectional data set containing annual household expenditure by category, as reported in 2013.
4. The data set your group will use is in the Excel spreadsheet CES#.xlsx, where # is the last digit of the sum of the last digits of the group members’ University of Sydney SIDs. For example, suppose student A and student B form a group. If the last digit of student A’s SID is 3 and the last digit of student B’s SID is 8, then 3 + 8 = 11, and the last digit of 11 is 1. This group should therefore use data set CES1.
5. Please use your assigned data set to answer the questions and write your data set number and the SIDs of group members on the front page of your work. Using the wrong data set will be reviewed as a potential case of Academic Dishonesty.
6. In your submitted work, please round all numerical answers to 2 decimal places if necessary. When you are asked to “perform a test”, you should write down explicitly the null hypothesis of the test, and state clearly how you make testing decisions and conclusions. Please carry out all tests using a 5% level of significance.
7. If you are asked to make a plot, please make sure you have a proper title, x-axis label and y-axis label on your figures.
8. You should include Stata procedures and outputs in your answers, and your own interpretations and explanations are necessary for earning marks. Please type your answer in a document. We do not accept handwritten solutions.
9. When answering the questions, please keep your statements concise as well as accurate. Excessively long responses indicate a lack of understanding and will be penalized accordingly.
10. Please submit a PDF file named CES# SID1 SID2.pdf where # is your assigned data set number, and SID1 and SID2 are the 9-digit SIDs of the group members. Do not put your names in your submission. Do not include a cover sheet.
11. Submit one PDF file through Turnitin under the Canvas module “Assignment”. Late submissions are subject to a penalty of 5% of the total 55 marks, which is 2.75 marks, per calendar day. Work submitted more than 10 calendar days after the due date will receive a mark of zero. These penalties are in accordance with 7A in the University Assessment Procedures 2011.
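The arithmetic in instructions 4 and 11 can be checked with a short script. The following is only an illustrative sketch in Python, not part of the assessed work (which uses Stata). The function names and the placeholder SIDs are made up for this example. It also assumes that lateness is counted in whole calendar days and that a penalized mark cannot fall below zero, neither of which is stated in the instructions.

```python
TOTAL_MARKS = 55
DAILY_PENALTY = 2.75   # 5% of the 55 total marks, per calendar day late
MAX_LATE_DAYS = 10     # later than this and the work receives zero


def dataset_number(sids):
    """Last digit of the sum of the last digits of the group members' SIDs."""
    last_digit_sum = sum(int(str(sid)[-1]) for sid in sids)
    return last_digit_sum % 10


def mark_after_penalty(raw_mark, days_late):
    """Apply the late penalty to a raw mark.

    Assumptions (not stated in the instructions): days_late is a whole
    number of calendar days, and the mark is floored at zero.
    """
    if days_late > MAX_LATE_DAYS:
        return 0.0
    return max(0.0, raw_mark - DAILY_PENALTY * days_late)


# Example from instruction 4: last digits 3 and 8 give 3 + 8 = 11, so CES1.
# The two SIDs below are made-up placeholders ending in 3 and 8.
print(f"CES{dataset_number(['500000003', '500000008'])}")  # CES1

# A raw mark of 40 submitted 2 calendar days late: 40 - 2 * 2.75 = 34.5
print(mark_after_penalty(40, 2))  # 34.5
```

A one-person group would pass a single SID, in which case the data set number is simply the last digit of that SID.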
Data Description
Your assigned data set is a subset of the Consumer Expenditure Survey (CES2013) data set. The description of the data set and contained variables can be found in Appendix B on pp. 570–572 of the textbook (also provided in a separate PDF file).
In the data set, there are 23 variables of categorical household expenditure, such as FDHO, indicating food and nonalcoholic beverages consumed at home, and HOUS, indicating housing expenditure.
The category of household expenditure you will be interested in for answering the following questions (variable Y in the questions) depends on which data set you use. Find your variable Y from the following table.
Data set | Expenditure variable (Y)
CES0 | DOM
CES1 | EDUC
CES2 | ELEC
CES3 | FURN
CES4 | GASO
CES5 | HEAL
CES6 | HOUS
CES7 | LIFE
CES8 | READ
CES9 | TOB
Questions
In the following questions, the variable Y is an expenditure variable that you will focus on in the analysis. Please first determine your variable Y based on your assigned data set and the above table, and then replace the variable Y in all the questions by the corresponding expenditure variable name.
1. (5pt) In your data set, the variable SIZE is the number of persons in the household. Make a scatter plot of Y (on the y-axis) and SIZE (on the x-axis), and fit a simple linear regression of Y on SIZE. Please write down the fitted regression and carefully interpret the regression intercept and slope.
2. (5pt) Is the slope coefficient in your fitted model in Question 1 significantly different from zero at the 5% level? Please explain how you could perform a hypothesis test to draw your conclusion. Be explicit about your null hypothesis and explain in detail how you would make your testing decision by reading either
(i) the test statistic, or
(ii) the p-value of the test, or
(iii) the confidence interval
in your regression output.
3. (5pt) Please explain why methods (i), (ii) and (iii) are equivalent for making your testing decision in Question 2. In your explanation, you should be explicit about how the p-value of the test is defined and how the confidence interval is constructed.
4. (5pt) Define a new variable LGY as the log transformation of Y. Make a scatter plot of LGY (on the y-axis) and SIZE (on the x-axis), and fit a simple linear regression of LGY on SIZE. Please write down the fitted regression and carefully interpret the regression coefficients. Please also explain why the interpretation of the regression coefficients here is different from that in Question 1.
5. (2pt) What is the fitted relationship between Y and SIZE based on the fitted regression in Question 4?
6. (5pt) Use the Box and Cox procedure (Steps 1–3) described on p. 211 of the textbook to select between the model in Question 1 and the model in Question 4.
7. (6pt) The variable EXP in your data set is the total household expenditure in US dollars. Make a scatter plot of Y (on the y-axis) and EXP (on the x-axis), and fit a multiple linear regression of Y on both SIZE and EXP. Please write down the fitted regression and carefully interpret the regression coefficients. Perform an F test of the joint significance of this regression model.
8. (4pt) Please compare the slope coefficient of SIZE in the fitted model in Question 1 with that in the fitted model in Question 7. Explain where the difference may come from. [Hint: What is the sample correlation of SIZE and EXP in your data set?]
9. (5pt) Please explain how you could obtain the same coefficient of EXP as in the multiple regression in Question 7 by using a simple “purged regression”. Implement your procedure and show that the results match.
10. (5pt) Define a new variable LGEXP as the log transformation of EXP. Make a scatter plot of LGY (on the y-axis) and LGEXP (on the x-axis), and fit a multiple linear regression of LGY on both SIZE and LGEXP. Please write down the fitted regression and carefully interpret the regression coefficients.
11. (3pt) What is the fitted relationship among Y, SIZE and EXP based on the fitted model in Question 10?
12. (5pt) The variable REFMS in your data set is coded as 1 if the reference person in the household (usually the head of the household) is married, and it is coded greater than 1 for any other marital status. Define a dummy variable MAR which takes value 1 if REFMS = 1, and takes value 0 if REFMS > 1. How would you modify the model in Question 10 to test whether the marital status (married or not) of the head of the household has an effect on the elasticity of expenditure on your category of interest with respect to total household expenditure? Please implement your method and interpret your result.