ECOM20001: Econometrics 1


Assignment 1

Student Information

To receive an assignment grade, you must fill out the information in this table and include this table on the front cover page for your assignment. Only students whose names and student ID numbers are included on the cover page will receive marks for the assignment. Groups of up to 3 students are allowed.

Name                    Student ID Number
Sally Probability       422552
Xiaosong Statistics     653223
Ipsa Regression         294480

Due Date and Weight

• Submit via LMS by 5:00 pm on 11 April 2025

•  See the subject syllabus on the LMS for late assignment penalties.

•  This assignment is worth 5% of your final mark in ECOM20001.

• There are 40 marks in total.

What You Must Submit via the LMS

• Assignment answers of no more than 8 A4 pages in 12-point font. 5 marks will be deducted if your answers exceed 8 A4 pages.

•  The R code that generates your results. Specifically, copy and paste your R code into an Appendix at the end of your assignment document (e.g., in the .docx file) so it can be viewed and tested by markers. The R code Appendix does not count toward your 8-page answer limit. You may shrink the R code font below 12 points so that the code is easier to read. 2 marks will be deducted if you do not include your R code.

Additional Instructions

•  You may submit this assignment in groups of up to 3. Students in a group are allowed to be from different tutorials. You must have registered your group by the group registration deadline of 4 April to submit as a group.

•  You must complete the assignment in no more than 8 A4 pages with 12-point Arial, Times New Roman, Helvetica, Cambria, or Calibri font. The assignment cover page does not count toward the 8 A4 page limit.

•  To save time, you may copy RStudio output directly into your answers when reporting empirical results. You are also free to create your own better-formatted tables based on your RStudio output, which is good practice in learning how to present empirical results.

•  Figures may also be copied and pasted directly into your assignment answers. They may be scaled down in size to meet the 8-page limit, but please ensure that your figures are readable. If they are not, marks will be deducted.

•  Marks will be deducted if interpretations of results are incorrect, imprecise, unclear, or not well-scaled. Similarly, marks will be deducted if figures or tables are incorrect, unclear, not properly labelled, not well-scaled, or missing legends.

•  When in doubt, work with 3 digits past the decimal throughout.

•  The R code in the Appendix at the end of your assignment (as discussed above) must be commented and easy for the subject tutors to follow. Commenting and code clarity must be at the level of tutorial code, or marks will be deducted.

•  Students with a genuine reason for not being able to submit the assignment on time can apply for special consideration to have the assignment mark transferred to the exam at the following link:

• https://students.unimelb.edu.au/admin/special/

Getting Started

Please create an Assignment1 folder on your computer, go to the LMS site for ECOM20001, and download the following data file into the Assignment1 folder:

•  movie_data.csv

This dataset contains the following three variables:

•  box_office_revenue: Total movie revenue at theaters (in millions of dollars)

•  movie_budget: Production budget for the movie (in millions of dollars)

•  audience_score: Average audience rating on Rotten Tomatoes (0-100 scale)
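A minimal sketch of loading the file and checking its three variables in R, assuming the working directory is the Assignment1 folder. The object name `movies` is an illustrative choice, not something this handout prescribes. So that the snippet runs even without the real file, it falls back to a tiny data frame whose values are made-up placeholders.

```r
# Sketch only: `movies` is a hypothetical object name.
# Assumes movie_data.csv sits in the current working directory.
if (file.exists("movie_data.csv")) {
  movies <- read.csv("movie_data.csv")
} else {
  # Placeholder values (not real data) so the sketch runs without the file
  movies <- data.frame(
    box_office_revenue = c(10, 250, 40),
    movie_budget       = c(5, 120, 30),
    audience_score     = c(62, 85, 48)
  )
}

# Confirm the three variables described above are present
expected_vars <- c("box_office_revenue", "movie_budget", "audience_score")
stopifnot(all(expected_vars %in% names(movies)))

# Quick look at the dimensions and variable types
dim(movies)
str(movies)
```

Checking the variable names up front catches a wrong file or a mistyped column before any results are computed.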

Data summary

This dataset contains information on the revenue generated by movies at box offices globally, with additional information on the production budget for each movie and the rating of the movie as judged by audience members.

About the Assignment

In this assignment, we will investigate how movie revenue is related to the production cost for the movie. A high production budget (the input cost) and the quality of the output (each movie’s audience score) may both factor into the revenue generated by the movie. You will investigate and quantify each of these factors, and the final question prompts you to use the information you generate to inform business investment decisions for a hypothetical movie production studio.

Questions

1. (3 marks) Report summary statistics (number of observations, mean, standard deviation, median, and the interquartile range, i.e., 25th percentile and 75th percentile) for box_office_revenue, movie_budget, and audience_score.

Interpret the means in words to characterise the average observation in the sample. Comment briefly on the median and interquartile range and what this might imply for the symmetry of the distribution of each variable.

2. (3 marks) Compute the 95% confidence intervals for the respective means of box_office_revenue, movie_budget, and audience_score.

3. (2 marks) Plot the density of box_office_revenue. Comment on whether the distribution is best described by a skewness value that is positive, negative, or zero. Provide a short (1-2 sentence) explanation of the economic factors of the movie industry that help to explain the skewness you observe.

4. (3 marks) Create a new variable, high_budget, which equals one if movie_budget is equal to or greater than the median value of movie_budget and zero otherwise. Plot 2 separate densities within the same graph for box_office_revenue when high_budget = 1 and for box_office_revenue when high_budget = 0. Interpret the differences in the conditional densities and provide a potential explanation for their differences in means.

5. (5 marks) Conduct the following test for difference in means:

-  H0: mean(box_office_revenue if high_budget = 1) = mean(box_office_revenue if high_budget = 0)

-  H1: mean(box_office_revenue if high_budget = 1) ≠ mean(box_office_revenue if high_budget = 0)

Report the difference in means, the 95% confidence interval for the difference in means, the p-value for the test, and whether the test implies a statistically significant result at the 5% significance level. Provide a brief interpretation of your findings by computing the per cent change in the conditional mean of box_office_revenue when going from high_budget = 0 to high_budget = 1.

6. (2 marks) Construct a scatter plot with box_office_revenue on the vertical axis and movie_budget on the horizontal axis. Use an appropriate single linear regression and abline() in R to visualise the relationship in the scatter plot using predicted values from the single linear regression. Does the pattern in the plot align with your findings from Questions 4 and 5?

7. (8 marks) Run the following (separate) single linear regressions:

•  Regression 1: dependent variable: box_office_revenue, independent variable: movie_budget

•  Regression 2: dependent variable: box_office_revenue, independent variable: audience_score

Report coefficient estimates and standard errors assuming homoskedasticity for each regression in a single table. In addition, for Regression 1, please interpret the magnitude of the predicted change in box_office_revenue corresponding to a one-standard-deviation increase in the independent variable. For Regression 2, please interpret the magnitude of the predicted change in box_office_revenue corresponding to a 20-unit increase in the independent variable (i.e., an increase in the audience score of 20 points). Test whether the coefficients in Regressions 1 and 2 are different from zero at a 5% level of significance and report the p-value for each test.

8. (5 marks) Now run a multiple linear regression:

•  Regression 3: dependent variable: box_office_revenue, independent variables: movie_budget, audience_score

Report the regression results using summary() or stargazer(), again assuming homoskedasticity. Provide two additional relevant scatter plots that help explain the change in the direction and magnitude of the coefficient estimate on movie_budget in Regression 3 compared to Regression 1 from Question 7. In each scatter plot, use an appropriate single linear regression and abline() to visualise the relationship using predicted values from the single linear regression.

9. (7 marks) Suppose that you oversee a movie production studio. Your studio is currently producing two films: one high-budget movie and one low-budget movie. You have an extra $1 million in your production budget to allocate to one of these films. Re-run Regression 3 from Question 8 separately on two subsamples:

•  Low-budget movies (high_budget == 0).

•  High-budget movies (high_budget == 1).

Report the regression results using summary() or stargazer(), again assuming homoskedasticity. Assuming that each of the two films (high-budget and low-budget) will have the same audience score upon release, explain how you would use the results of Regression 3 to decide which movie should have the extra $1 million invested into its budget.

10. (2 marks) R-code: we will review and mark your R code as follows:

•  2/2 if the R code is correct and organised and commented like the solution code for the assignment.

• 1/2 if the R code is correct but hard to follow or not well commented.

•  0/2 if the R code is incorrect and/or a complete mess or not submitted.
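The questions above rely on a handful of standard R tools. The following is a minimal sketch of those mechanics on simulated data. It does not use movie_data.csv and is not a model answer: the data frame `sim`, the helper `ci_mean`, and every number in it are assumptions made up for illustration.

```r
# Sketch on SIMULATED data; all numbers below are made up for illustration.
set.seed(1)
n <- 200
sim <- data.frame(movie_budget = runif(n, 1, 200))            # $ millions
sim$audience_score <- pmin(100, pmax(0, rnorm(n, 60, 15)))    # 0-100 scale
sim$box_office_revenue <- 20 + 1.5 * sim$movie_budget +
  0.5 * sim$audience_score + rnorm(n, sd = 40)                # $ millions

# 95% confidence interval for a mean: mean +/- t critical value * standard error
ci_mean <- function(x, level = 0.95) {
  se   <- sd(x) / sqrt(length(x))
  crit <- qt(1 - (1 - level) / 2, df = length(x) - 1)
  c(lower = mean(x) - crit * se, upper = mean(x) + crit * se)
}
round(ci_mean(sim$box_office_revenue), 3)

# Indicator equal to 1 at or above the median budget, 0 otherwise
sim$high_budget <- as.numeric(sim$movie_budget >= median(sim$movie_budget))

# Two-sided test for a difference in means
# (t.test() defaults to the Welch version, which allows unequal variances)
tt <- t.test(box_office_revenue ~ high_budget, data = sim)
tt$conf.int   # 95% CI for mean(group 0) - mean(group 1)
tt$p.value

# Per cent change in the conditional mean going from high_budget = 0 to 1
m <- tapply(sim$box_office_revenue, sim$high_budget, mean)
pct_change <- 100 * (m["1"] - m["0"]) / m["0"]

# Single linear regression; summary() reports homoskedasticity-only standard errors
reg1 <- lm(box_office_revenue ~ movie_budget, data = sim)
summary(reg1)

# Predicted change for a one-standard-deviation increase in the regressor
effect_1sd <- coef(reg1)["movie_budget"] * sd(sim$movie_budget)

# Scatter plot with the fitted regression line
plot(sim$movie_budget, sim$box_office_revenue,
     xlab = "Movie budget ($ millions)",
     ylab = "Box office revenue ($ millions)")
abline(reg1, col = "red")
```

Note that the confidence interval returned by t.test() with a formula is for the first group minus the second (here high_budget = 0 minus high_budget = 1), so its sign is the reverse of a "high minus low" difference.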

