ACF5320 Assignment 2



ASSESSMENT TASK: Assignment 2

WEIGHTING: 30%

COMPLETION: Individual

GENERATIVE AI: Generative AI tools can be used in this assessment task

In this assessment, you can use generative artificial intelligence (AI) to generate the specified content in relation to the assessment task. This material must be acknowledged and recorded in your declaration of AI use.

DUE DATE: 11:55pm, Wednesday, 9 April 2025

OVERVIEW

In this assignment, you are tasked with conducting regression analysis on multiple datasets provided in Excel format. The assignment is structured around three key cases, each requiring you to apply regression techniques to predict outcomes based on various independent variables. This exercise aims to assess your proficiency in predictive modelling, data analysis, and the interpretation of results within a business analytics context.

•    In the Decision case, using the "Decision.xlsx" dataset, you will analyse the impact of experience on decision-making quality among auditors, examining how it correlates with intelligence, thinking styles, and personality traits.

•    The Haircut case requires you to explore the "Haircut.xlsx" dataset to determine which factors significantly influence a company's revenue, using regression analysis to identify these key predictors.

•    The Prescription Cost Analysis involves the "Prescription.xlsx" dataset, where you will model and predict drug costs based on a set of independent variables, enhancing your model's accuracy through iterative refinement.

Your submission should demonstrate a thorough understanding of regression analysis as applied to predictive analytics. This includes not only the technical execution of statistical tests but also the ability to interpret and communicate the significance of your findings in a clear, concise manner.

Through this assignment, you will showcase your capability to leverage Excel for predictive modelling and to derive actionable insights from complex datasets.

OBJECTIVES

•     Understand and apply regression analysis techniques.

•     Analyse relationships between dependent and independent variables.

•     Interpret and evaluate regression model outputs.

•     Develop predictive models based on the analysis.

•     Communicate analytical findings effectively.

SUBMISSION REQUIREMENTS

Type your responses in an MS Word document and submit it to Moodle. Cut and paste any relevant output from Excel into your Word document.

You do not need to clean the data. Do not delete any data.

Case 1: Decision (10 marks)

Using the “Decision.xlsx” dataset, analyse differences between experienced and inexperienced participants.

(1.1)    Do the experienced versus the inexperienced auditors differ in the quality of their decisions (i.e., the Decision variable)? Cut and paste relevant statistics from Excel and explain the statistics. (4 marks)

(1.2)    Do the experienced versus the inexperienced differ in terms of any intelligence, thinking style, or personality trait variables? Identify the ones that are different and provide the relevant statistics. Cut and paste relevant statistics from Excel and explain the statistics (only for those that are different). (4 marks)

(1.3)    Without using the language of statistics, what do you conclude about experienced versus inexperienced auditors? (2 marks)
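As a rough illustration of the kind of group comparison asked for in (1.1) and (1.2), the sketch below computes a two-sample t statistic that does not assume equal variances (Welch's test). It is only one possible approach, not a required method. The scores are made up and are not taken from Decision.xlsx, and the function name `welch_t` is hypothetical. A p-value needs the t distribution, which in Excel is available through the T.TEST function or the Data Analysis ToolPak t-test tools.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test."""
    n_a, n_b = len(group_a), len(group_b)
    # Sample variance (n - 1 denominator) divided by group size.
    se_a = variance(group_a) / n_a
    se_b = variance(group_b) / n_b
    t_stat = (mean(group_a) - mean(group_b)) / sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t_stat, df


# Made-up scores for illustration only, NOT taken from Decision.xlsx.
experienced = [7.0, 8.0, 6.0, 9.0, 8.0]      # Exp dummy = 1
inexperienced = [5.0, 6.0, 4.0, 7.0, 5.0]    # Exp dummy = 0

t_stat, df = welch_t(experienced, inexperienced)
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```

A large absolute t value relative to the degrees of freedom suggests the two group means differ by more than chance alone would explain; the same comparison can be repeated for each variable in the data description.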

Decision data description

Participants consist of auditors and students. Auditors are considered experienced and students are inexperienced.

Variable: Definition

ID: Participant identification number.

Decision: Higher values indicate better performance on a task requiring professional judgment.

WPT: Number of questions correctly answered on the Wonderlic Personnel Test, an IQ test. Higher scores indicate higher IQs.

FFM_agree: Response to the measures of the agreeableness factor in the Five Factor Model.

FFM_cons: Response to the measures of the conscientiousness factor in the Five Factor Model.

FFM_ES: Response to the measures of the emotional stability factor in the Five Factor Model.

FFM_extra: Response to the measures of the extraversion factor in the Five Factor Model.

FFM_open: Response to the measures of the openness factor in the Five Factor Model.

Exp dummy: 0 = inexperienced, 1 = experienced

Case 2: Haircut (5 marks)

Use the “Haircut.xlsx” dataset to run regression models that explain the factors that significantly influence revenue at this company.

(2.1)    Report and interpret your best model’s technical details. Cut and paste the relevant statistics from Excel and explain the statistics. (2 marks)

(2.2)    Do you believe that your model is effective for explaining changes in revenue? Explain and justify your response. (2 marks)

(2.3)    Explain in plain language the meaning of your findings. (1 mark)
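For readers who want to see what a regression with more than one predictor computes, the following minimal sketch fits ordinary least squares by solving the normal equations. It is an illustration only: the rows are made up rather than taken from Haircut.xlsx, Country is assumed to be coded 0/1, the function name `fit_ols` is hypothetical, and the revenue values are constructed to be exactly linear so the known coefficients can be recovered. Real data will contain noise, and Excel's Regression tool in the Data Analysis ToolPak also reports standard errors and p-values, which this sketch does not.

```python
def fit_ols(rows, y):
    """Least-squares coefficients [b0, b1, ...] from the normal equations (X'X) b = X'y."""
    # The leading 1.0 in each row is the intercept column.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * yi for x, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    # This fails with a division by zero if the predictors are perfectly collinear.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Made-up rows of (Chair_time, Country), NOT taken from Haircut.xlsx.
rows = [(20, 0), (30, 0), (25, 1), (40, 1), (35, 0), (15, 1)]
# Revenue is built as exactly 10 + 0.5 * Chair_time + 5 * Country.
revenue = [10 + 0.5 * chair + 5 * country for chair, country in rows]

b0, b_chair, b_country = fit_ols(rows, revenue)
print(f"intercept = {b0:.2f}, Chair_time = {b_chair:.2f}, Country = {b_country:.2f}")
```

In this toy setup the Chair_time coefficient is the change in revenue per extra minute in the chair with country held fixed, and the Country coefficient is the revenue difference between the two countries at the same chair time.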

Haircut data description

You have been provided an Excel file that contains 4 data items.  Each row represents the data for one haircut at a business that operates in two countries. The business does not take appointments. Customers walk in and wait for a haircut.

Variable: Definition

Wait_time: the number of minutes the customer waited for the haircut

Chair_time: the number of minutes needed to complete the haircut

Revenue: revenue generated from the haircut

Labour_cost: cost of labour for the haircut

Country: dummy variable for country 1 and country 3

Case 3: Prescription Cost Analysis (15 marks)

Assume that you are working for a government agency that is trying to determine the main causes of different drug costs for different patients. You have data (“Prescription.xlsx”) from six months of drug prescriptions. You need to model and predict drug costs. The appendix shows descriptions of the data.

(4.1) Assume that we are using this model: (3 marks)

GrossDrugCost = B0 + B1 * RiskScore + ε

i. Interpret the coefficient and the p-value for the RiskScore variable. Provide a practical explanation of the RiskScore variable for senior management. (1 mark)

ii. Explain what R-squared means in a statistical way and provide a practical explanation of the information to senior management. (1 mark)

iii. A coworker wants to know what the predicted gross drug costs would be for a new member. The new member is a 73-year-old man whom the government classifies as frail, and he has a risk score of 510. Using the model above, what would you predict the gross drug costs will be? (1 mark)
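The mechanics behind a one-predictor model such as GrossDrugCost = B0 + B1 * RiskScore + ε can be sketched as follows. The numbers are made up and are not taken from Prescription.xlsx, and the function name `fit_simple` is hypothetical. The sketch shows how the slope, intercept and R-squared are computed and how a prediction is read off the fitted line; it does not compute the p-value, which Excel's regression output provides.

```python
from statistics import mean


def fit_simple(x, y):
    """Return (intercept, slope, R-squared) for the least-squares line y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx               # slope: change in y per one-unit change in x
    b0 = y_bar - b1 * x_bar      # intercept: fitted y when x = 0
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    # R-squared: share of the variation in y that the line accounts for.
    return b0, b1, 1 - ss_res / ss_tot


# Made-up values for illustration only, NOT taken from Prescription.xlsx.
risk_score = [100, 200, 300, 400, 500]
gross_drug_cost = [120, 190, 310, 420, 460]

b0, b1, r_squared = fit_simple(risk_score, gross_drug_cost)
predicted = b0 + b1 * 510        # prediction for a member with RiskScore = 510
print(f"B0 = {b0:.2f}, B1 = {b1:.4f}, R-squared = {r_squared:.4f}")
print(f"Predicted cost at RiskScore 510: {predicted:.2f}")
```

Because this model contains only RiskScore, details such as age, gender or frailty do not enter the prediction.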

(4.2) Assume we are using this model: (8 marks)

GrossDrugCost = B0 + B1 * RiskScore + B2 * Age + B3 * Gender + ε

iv. Provide a statistical interpretation of the coefficient and p-value for the gender variable. Provide a practical explanation of the information to senior management. (1 mark)

v. Provide a statistical interpretation of the coefficient and p-value for the age variable. Provide a practical explanation of the information for senior management. (1 mark)

vi. Provide a statistical interpretation of this model’s intercept. Provide a practical explanation of the information to senior management. (1 mark)

vii. Compare the adjusted R-squared values between Models 1 and 2. Are they the same or different? Why? What could you conclude about the differences (if any) in the adjusted R-squared values? (2 marks)

viii. Senior management wants to know the expected gross drug costs of the average customer. That is, for the median value of the RiskScore, age and gender, what would you expect the average gross drug costs to be? (2 marks)

ix. A coworker wants to know what the predicted gross drug costs would be for a new member. The new member is a 73-year-old whom the government classifies as frail and who has a risk score of 510. Using the model above, what would you predict the gross drug costs will be if they were a man and if they were a woman? (1 mark)
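Two ideas from the questions above can be sketched briefly: how adjusted R-squared penalises extra predictors, and how a dummy variable shifts a prediction. The coefficients and the R-squared value below are hypothetical and are not estimated from Prescription.xlsx; the function names `adjusted_r2` and `predict_cost` are likewise illustrative. Gender follows the coding in the data description (1 = female, 0 = male).

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k predictors (intercept not counted)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def predict_cost(coefs, risk_score, age, gender):
    """Evaluate B0 + B1 * RiskScore + B2 * Age + B3 * Gender."""
    b0, b1, b2, b3 = coefs
    return b0 + b1 * risk_score + b2 * age + b3 * gender


# Hypothetical coefficients (B0, B1, B2, B3), NOT estimated from Prescription.xlsx.
coefs = (50.0, 0.9, -1.0, 20.0)

cost_man = predict_cost(coefs, risk_score=510, age=73, gender=0)
cost_woman = predict_cost(coefs, risk_score=510, age=73, gender=1)
print(f"man: {cost_man:.2f}, woman: {cost_woman:.2f}")

# With the same raw R-squared and sample size, more predictors give a lower adjusted value.
one_predictor = adjusted_r2(0.5, n=101, k=1)
three_predictors = adjusted_r2(0.5, n=101, k=3)
print(f"adjusted R-squared: k=1 -> {one_predictor:.4f}, k=3 -> {three_predictors:.4f}")
```

The two predictions differ by exactly B3, which is what the Gender coefficient represents when RiskScore and Age are held fixed. Adjusted R-squared rises from Model 1 to Model 2 only if the added variables improve fit by more than the penalty for including them.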

(4.3) Create a better model (4 marks)

x. Develop a better regression model to predict gross drug costs. (2 marks)

xi. What did you learn from this model that previous models did not tell you? (2 marks)

Variable: Definition

RecordID: Primary key from the database; a unique number for each row

MemberID: A unique ID for each different member

Month: The month to which the data pertains, listed in numeric format as 1 for January, 2 for February, etc.

GrossDrugCost: The total amount of drug costs incurred by a member during the corresponding month

NLISDummy: A dummy variable that takes the value of 1 if the member is listed as non-low income by the government and 0 otherwise

LISCHOSERDummy: A dummy variable that takes the value of 1 if the member chose a specific plan and 0 if the member automatically was assigned a plan, i.e., members automatically are assigned (thus, LISCHOSERDummy

RiskScore: A score assigned by the government based on previous government data indicating how sick someone is; higher scores indicate members are sicker

SpecialtyDummy: A dummy variable that takes the value of 1 if the member utilizes specialty drugs and 0 otherwise

AdjudicationDays: The number of non-holiday workdays in a month

Age:

Gender: A dummy variable that takes the value of 1 if the member is female and 0 if the member is male

FrailtyDummy: A dummy variable that takes the value of 1 if the government indicates the member is frail and 0 if the government indicates the member is not frail

HospiceDummy: A dummy variable that takes the value of 1 if the member is receiving hospice care and 0 if they are not

InstitutionDummy: A dummy variable that takes the value of 1 if the member is receiving institutionalized long-term care (e.g., hospital, nursing facility) and 0 if they are not

ESRDDummy: A dummy variable that takes the value of 1 if the member is receiving care for end-stage renal disease (i.e., end-stage kidney disease) and 0 if they are not



