Unit: EFIM20010 Applied Quantitative Research Methods
Assessment’s contribution to unit: 50%
Release Date: 21st February 2024
Submission Date: 20th March 2024 at 11:00 am
Feedback Released: 15th April 2024
Empirical Project
1. Data
Download the relevant Stata file from Blackboard and save it to your hard disk. There is no need for any additional data.
2. Robust and secure working methods
Throughout your work on the project, you should work on the assumption that something will go wrong (computer failure, etc.) and that it will probably go wrong at the worst possible moment (e.g. shortly before the submission deadline). It is your responsibility to back up work in multiple secure locations (e.g. on a laptop, server and USB stick) so that you can recover any work quickly should that be necessary.
3. Project structure
Your project must contain the five components described below.
You must structure your Project to contain a Non-Technical Summary (Component 1). You have complete discretion to structure the remainder of the Project in any way that you think will best demonstrate your understanding of the material. In particular, there is no obligation to structure your project exactly in line with the remaining components listed below: for example, it may make sense to combine components 3 and 4, since you are explicitly asked to compare and contrast their results.
We have provided an example project to give you a better idea of the type of thing we are looking for. Please read through it in detail.
• Component 1
Write a short Non-Technical Summary (NTS) which must explain the question being analysed and your conclusion. This should be about half a page. The NTS should not contain unnecessary technical language. Numbers should be expressed to a suitable degree of precision: either round to three decimal places (for example, “0.472658” should be rounded to “0.473”) or use words (“about half”). The NTS must be freestanding in the sense that it can be read and understood without any reference to the rest of the Project. It should not be on a separate page from the rest of the Project.
• Component 2
Summarise your data in a table of descriptive statistics. Do not copy-and-paste Stata output tables directly: create your own tables and round numbers to three decimal places. Illustrate relevant aspects of the data with one or more graphs. Explain the question being analysed and discuss the data carefully. The table and figures should be included in the text, not put at the end of the document.
• Component 3
Estimate the simple regression requested in the project description. You should report your regression results in a table whose layout is consistent with standard econometric table layout and which has a note explaining the contents (again, do not simply copy-and-paste Stata output, and use three decimal places). Discuss the results of the basic regression carefully (this includes an interpretation of the results, a discussion of statistical significance and tests of relevant hypotheses). Your discussion should refer back to the substantive question under consideration.
• Component 4
Estimate one or more additional regressions, at least one of which should contain a control variable. Report your results (to save space and facilitate comparison, these can be in the same table as the basic regression from component 3). Compare and contrast these regression results with the basic regression. Discuss and evaluate your results fully.
• Component 5
A conclusion discussing your answer to the initial research question and, if relevant, any subsequent issues that have arisen in your analyses.
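The three-decimal rounding and the coefficient-with-standard-error layout asked for in Components 1 to 3 can be sketched as follows. This is an illustration only, written in Python rather than Stata; the helper name `format_cell` and the numbers passed to it are made up for the example and are not results from the project data.

```python
def format_cell(coefficient: float, std_error: float, decimals: int = 3) -> str:
    """Return 'coefficient (standard error)' with both values rounded.

    Hypothetical helper: the name and the layout are assumptions made
    for illustration, not a required format.
    """
    return f"{coefficient:.{decimals}f} ({std_error:.{decimals}f})"


# Made-up numbers, NOT estimates from the project data.
print(format_cell(0.472658, 0.0512349))
```

The same idea applies to every number reported in a table: round once, at the reporting stage, rather than rounding intermediate calculations.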
Please choose one of the following two projects:
Project 1: Wage returns to Physical Activity
TOPIC: How might a worker’s physical activity be related to their labour market outcomes, specifically productivity and wages?
BACKGROUND
Empirical evidence for the direct effects of physical activity on labour market outcomes is limited. However, there is growing support for the positive effects of physical activity on health on the one hand, and for the effects of good health on labour market outcomes on the other. It is widely believed that time constraints and limited economic resources (e.g., lower socioeconomic status and income) are negatively associated with physical activity in high-income countries (see, e.g., Brown and Roberts, 2011, Meltzer and Jena, 2010). Similarly, there is fairly robust evidence that physical activity has a positive effect on health and that health is systematically related to labour market outcomes (see, e.g., Currie, 2009, Gomez-Pinilla, 2008, Lakdawalla and Philipson, 2007, Nys, 2006, Rashad, 2007).
This project examines whether and how a worker’s physical activity patterns are related to their labour market outcomes, specifically productivity and wages. The most obvious potential causal mechanism behind the relation between physical activity and wages is that a higher level of physical activity improves health and thereby people's labour market attachment (due to, e.g., reduced absenteeism from work) and productivity at work (Lechner, 2009), leading to higher wages. There may also be other mechanisms at work. Aguilera and Bernabe (2005) argue, for example, that participation in physical activities is often a social activity (e.g. team sports) which enhances people's social skills, which are then rewarded in the labour market.
SIMPLE REGRESSION
Using a cross-section of workers, the simple regression is of the form:
(A.1) $\log(\text{salary}_i) = a + \beta\,\text{exercise}_i + \varepsilon_i$
where $\text{salary}_i$ is worker i’s annual salary; $a$ is a constant or intercept term; $\text{exercise}_i$ is a variable describing worker i’s typical physical exercise habits (number of minutes devoted to physical exercise, per week), with $\beta$ its corresponding parameter; and $\varepsilon_i$ is the error term. Since $\text{salary}_i$ is a continuous variable, this regression is estimated using ordinary least squares (OLS).
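As a rough illustration of what OLS does in a simple regression of this form, the sketch below computes the intercept, the slope, the slope's standard error and its t-statistic by hand. It is written in Python rather than Stata purely for illustration; the function name `simple_ols` is hypothetical, the standard error is the conventional (homoskedastic) one, which is an assumption of this sketch, and the data are made up rather than taken from the project file.

```python
import math


def simple_ols(x, y):
    """Fit y = a + b*x by OLS and return (a, b, se_b, t_b).

    se_b is the conventional (homoskedastic) standard error of the slope,
    an assumption of this sketch; t_b is the t-statistic for H0: b = 0.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                      # slope estimate
    a = mean_y - b * mean_x            # intercept estimate
    ssr = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se_b = math.sqrt(ssr / (n - 2) / sxx)
    return a, b, se_b, b / se_b


# Made-up illustration data, NOT the project data.
exercise = [0, 30, 60, 90, 120, 150]                    # minutes per week
salary = [21000, 23500, 22800, 26000, 27500, 29000]     # annual salary
log_salary = [math.log(s) for s in salary]

a_hat, b_hat, se_b, t_b = simple_ols(exercise, log_salary)
print(f"intercept = {a_hat:.3f}")
print(f"slope     = {b_hat:.4f} (se {se_b:.4f}, t = {t_b:.2f})")
print(f"approx. % change in salary per extra minute: {100 * b_hat:.2f}")
```

Because the dependent variable is in logs, the last line multiplies the slope by 100 to express it as an approximate percentage change in salary per additional minute of exercise per week.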
GENERAL REGRESSION
There are different ways to generalise the simple regression specified above. First, it is possible to use more information on workers’ characteristics, such as age, gender, education and job experience. For example:
(A.2) $\log(\text{salary}_i) = a + \beta\,\text{exercise}_i + \gamma X_i + \varepsilon_i'$
where $X$ is a vector of $X_1, X_2, X_3$, etc., which are additional control variables, with $\gamma$ its corresponding vector of parameters.
It would also be possible to explore interactions between variables, for example to estimate whether the relationship between salary and exercise time is the same for workers employed in jobs that require physical activity as for other workers:
(A.3) $\log(\text{salary}_i) = a + \beta\,\text{exercise}_i + \delta\,\text{exercise}_i \times \text{construc}_i + \gamma X_i + \varepsilon_i''$
where $\text{exercise}_i \times \text{construc}_i$ is an interaction term between the variable exercise and a dummy for construction workers, construc; $X$ is a vector of $X_1, X_2, X_3$, etc., which are additional control variables, with $\gamma$ its corresponding vector of parameters.
Since these are different regressions, they have different residuals and hence we have distinguished between $\varepsilon_i$, $\varepsilon_i'$, etc. You should estimate at least one general regression with at least one additional control variable. You can choose to include more than one additional control variable in your general regression model, but you do not need to use all variables that are available in the dataset. There may also be missing values on some variables for some observations. You should clearly describe the data that you use for your regression analysis in your Component 2. (Note that your Component 2 should not describe variables that you do not use in your regression analysis.)
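To see how the interaction term in (A.3) would be read, it may help to write out the implied slope on exercise for each group. The lines below are a sketch that takes (A.3) as given and holds the controls fixed; β and δ are the parameters defined above.

```latex
\frac{\partial \log(\text{salary}_i)}{\partial\,\text{exercise}_i}
  = \beta + \delta\,\text{construc}_i
  = \begin{cases}
      \beta          & \text{if } \text{construc}_i = 0 \\
      \beta + \delta & \text{if } \text{construc}_i = 1
    \end{cases}
```

Under this reading, δ = 0 corresponds to the hypothesis that the relationship between exercise and log salary is the same for construction workers and for other workers.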
DATA
The data on Blackboard contains the following variables for 513 randomly selected individuals.
Variable | Description
age | age in years
case | identifier
clerical | =1 if clerical worker
construc | =1 if construction worker
educ | years of schooling
gdhlth | =1 if in good or excellent health
inlf | =1 if in labor force
male | =1 if male
marr | =1 if married
selfe | =1 if self employed
spsepay | spousal wage income
spwrk | =1 if spouse works
totwrk | minutes worked per week
exper | years of experience (calculated as age - educ - 6)
yngkid | =1 if an individual has a young child (less than 3 years old)
hrwage | hourly wage
agesq | age squared
exercise | minutes of exercise per week
Project 2: CEO pay and firm performance
TOPIC: What is the effect of a firm’s return on equity on CEO pay?
BACKGROUND
Every company needs a Chief Executive Officer (CEO) to run the firm’s day-to-day activities. CEOs are appointed by the board of directors and hold the highest-level executive position in a company. They are responsible for creating and carrying out high-level strategy, for corporate decision making, and for managing the operations and resources of the firm, as well as for acting as the intermediary between the board of directors and corporate management. Naturally, CEOs have a higher salary than all other workers in the firm, but how are CEOs paid?
This project examines whether and how a firm’s return on equity is related to the CEO’s annual pay. It is widely believed that when the performance of a firm improves, CEO pay increases too, mainly because CEO pay is tied directly to firm profitability. The basic idea is to encourage the CEO to act in the shareholders’ interest. On this view, firms that provide incentive compensation will remain strong and survive, while firms that fail to compensate managers in this way will not compete successfully with firms whose managers act in accordance with the shareholders’ interest. In support of this view, Sigler (2011) found a positive and significant relationship between CEO pay and company performance measured by return on equity. By contrast, Ozkan (2007) reported that firm performance has a positive but insignificant impact on CEO pay.
SIMPLE REGRESSION
Using a cross-section of firms, the simple regression is of the form:
(B.1) $\text{pay}_i = a + \beta\,\text{ROE}_i + \varepsilon_i$
where $\text{pay}_i$ is CEO salary (annual salary in thousands of dollars); $a$ is a constant or intercept term; $\text{ROE}_i$ is the firm’s return on equity ($\text{ROE} = \frac{\text{Net Income}}{\text{Equity}} \times 100$, measured in percent); and $\varepsilon_i$ is the error term. Since $\text{pay}_i$ is a continuous variable, this regression is estimated using ordinary least squares (OLS).
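Because (B.1) is linear in levels, the units of β follow directly from the units of the two variables. As a sketch of the reading, taking (B.1) as given and holding the error term fixed:

```latex
\Delta \text{pay}_i = \beta\,\Delta \text{ROE}_i
\quad\Longrightarrow\quad
\Delta \text{ROE}_i = 1 \text{ percentage point}
\;\text{ corresponds to }\;
\Delta \text{pay}_i = \beta \text{ thousand dollars} = 1000\,\beta \text{ dollars}
```

The hypothesis that return on equity is unrelated to CEO pay would then be H0: β = 0, typically tested against a two-sided alternative.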
GENERAL REGRESSION
There are different ways to generalise the simple regression specified above. First, it is possible to use more information on firm characteristics, such as the firm’s annual sales, the firm’s return on sales ($\text{ROS} = \frac{\text{Net Income}}{\text{Sales}} \times 100$, measured in percent), and industry dummy variables. For example:
(B.2) $\text{pay}_i = a + \beta\,\text{ROE}_i + \gamma X_i + \varepsilon_i'$
where $X$ is a vector of $X_1, X_2, X_3$, etc., which are control variables, with $\gamma$ its corresponding vector of parameters.
It would also be possible to adopt a non-linear regression form:
(B.3) $\log(\text{pay}_i) = a_1 + a_2 \log(\text{sales}_i) + \beta\,\text{ROE}_i + \gamma\,\text{ROS}_i + \varepsilon_i''$
Since these are different regressions, they have different residuals and hence we have distinguished between $\varepsilon_i$, $\varepsilon_i'$, etc. You should estimate at least one regression with at least one additional control variable.
You can choose to include more than one additional control variable in your general regression model, but you do not need to use all variables that are available in the dataset. There may also be missing values on some variables for some observations. You should clearly describe the data that you use for your regression analysis in your Component 2. (Note that your Component 2 should not describe variables that you do not use in your regression analysis.)
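If a specification in logs such as (B.3) is estimated, the coefficients are read differently from those of the level regression (B.1). As a sketch, taking (B.3) as given and holding the other regressors fixed:

```latex
\begin{aligned}
a_2 &= \frac{\partial \log(\text{pay}_i)}{\partial \log(\text{sales}_i)}
  && \text{(elasticity: a 1\% rise in sales goes with roughly an } a_2\% \text{ change in pay)} \\
100\,\beta &\approx \%\Delta\,\text{pay}_i
  && \text{for a one-percentage-point rise in } \text{ROE}_i
\end{aligned}
```

The approximation in the second line is closest when β is small; the coefficient on ROS is read in the same way as β.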
DATA
The data on Blackboard contains the following variables for 209 randomly selected firms.
Variable | Description
salary | 1990 CEO salary, thousands $
sales | 1990 firm sales, millions $
roe | return on equity, 1988-1990 average
ros | return on firm's sales, 1988-1990 average
indus | =1 if industrial firm
finance | =1 if financial firm
consprod | =1 if consumer product firm
utility | =1 if transportation or utility firm
Marking Criteria
The projects will be graded according to the University grade descriptors. In particular, in this assignment, we will be looking for the following:
• General
o Does the project fit within the required word count?
o Does the project fit within the rubric?
o Does the project use the discretion allowed within the rubric to provide a good analysis of the topic?
• Data and Estimation
o How well are the data described and discussed?
o Is the simple regression correctly estimated and how well is it explained and interpreted?
o How well are the more general regressions conceived, explained, compared and interpreted?
• Testing and question
o How well are hypotheses constructed, explained and discussed?
o How clearly is the testing procedure described?
o How well are the tests interpreted?
o How well are the estimation results and tests synthesised with the substantive topic being considered?
• Presentation
o How well is the project presented (spelling, punctuation, grammar, layout, clarity, referencing, quality and relevance of figures, tables, etc.)?
o How well does the Non-Technical Summary explain the key issue or issues?
o Is the project clearly and concisely written?