Final Assessment Information
Context
In this assessment, you will make use of a large sample of corporate financial statement data, stock market data, and textual data to study the implications of corporate financial reporting for future stock returns.
Data
The data of potential use are stored in the SQLite database “analytics_final.db”. The database consists of three tables. The data dictionary of the database is provided in Appendix A. Note that the data may contain nontrivial missing values, duplicates, and errors, as well as outliers. You should carefully examine these data quality issues before proceeding to complete the tasks required below.
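As a starting point for the data quality checks mentioned above, the following is a minimal sketch of reading tables from a SQLite database with pandas and counting duplicated firm-years and missing values. The helper names `load_tables` and `quality_summary` are hypothetical, and the in-memory toy table only stands in for the real file so that the sketch runs on its own; with the actual data one would connect to `analytics_final.db` instead.

```python
import sqlite3

import pandas as pd


def load_tables(conn, tables=("finstat", "mktdat", "textdat")):
    """Read each named table into a DataFrame (hypothetical helper)."""
    return {name: pd.read_sql_query(f"SELECT * FROM {name}", conn) for name in tables}


def quality_summary(df, keys=("gvkey", "fyear")):
    """Count rows, duplicated firm-years, and missing values per column."""
    return {
        "rows": len(df),
        "duplicate_keys": int(df.duplicated(subset=list(keys)).sum()),
        "missing": df.isna().sum().to_dict(),
    }


# Toy in-memory stand-in so the sketch runs without the real file.
# With the real data: conn = sqlite3.connect("analytics_final.db")
conn = sqlite3.connect(":memory:")
toy = pd.DataFrame(
    {
        "gvkey": [1001, 1001, 1002, 1002],
        "fyear": [2010, 2010, 2010, 2011],  # first two rows share a firm-year
        "crash": [0, 0, 1, None],  # one missing value
    }
)
toy.to_sql("mktdat", conn, index=False)

tables = load_tables(conn, tables=("mktdat",))
summary = quality_summary(tables["mktdat"])
print(summary)
```

Outliers and implausible values are not covered here; inspecting `df.describe()` for each table is one simple way to begin looking for them.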
Required
1. Publicly listed firms occasionally experience stock price crashes, which refer to sudden dramatic declines in stock prices. In the largest attainable sample, conduct descriptive analyses to demonstrate how the frequency of stock price crashes varies over time and across different industries. [20%]
2. One proposed explanation for stock price crashes is that companies' disclosures, such as annual reports, are overly complex. The excessive length and complexity of these reports can make them difficult to read, preventing investors from effectively identifying bad news hidden within the disclosures. Based on this, it is predicted that the annual reports of companies that experience stock price crashes in a given year are less readable than those of companies that do not experience a crash. Please use a t-test to test this prediction. [25%]
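For orientation only, a two-sample t-test of this kind can be sketched as below. The sketch runs on synthetic data, and the use of log file size as a readability proxy (larger meaning less readable) is an assumption made purely for illustration; the brief leaves the choice of measure open. It uses Welch's unequal-variance version with a one-sided alternative, which is one reasonable option among several.

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(0)
n = 500
crash = rng.integers(0, 2, size=n)

# Synthetic data: crash firm-years are given slightly larger reports.
# "log_file_size" is a hypothetical readability proxy, not a provided column.
log_size = rng.normal(loc=12.0 + 0.3 * crash, scale=0.5)
df = pd.DataFrame({"crash": crash, "log_file_size": log_size})

crashed = df.loc[df["crash"] == 1, "log_file_size"]
no_crash = df.loc[df["crash"] == 0, "log_file_size"]

# One-sided Welch t-test: is the mean for crash firm-years greater?
t_stat, p_value = stats.ttest_ind(
    crashed, no_crash, equal_var=False, alternative="greater"
)
print(f"t = {t_stat:.2f}, one-sided p = {p_value:.4f}")
```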
3. Develop two logistic regression models to predict whether a firm will experience at least one stock price crash during the next fiscal year.
- The first model should include the following predictors: market capitalization (mktcap), market-to-book ratio (mtb), return on equity (roe), leverage (lev), earnings volatility (earnvol), cash flow volatility (cfvol), sales volatility (salevol), the abnormal accruals measure (accm), stock beta (beta), and stock return for the current fiscal year (mkt_adj_return).
- The second model should include the same predictors as the first model, but also incorporate measures of the readability and sentiment of the current year's annual reports. You may use the provided variables to calculate or measure these aspects.
- Use data from all observations before 2014 (inclusive) as the training sample, and data from 2014 onward as the test sample. After building the models, compare their performance using classification evaluation metrics, and discuss your findings. [30%]
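The comparison of a baseline model with an extended model on a time-based split can be sketched as follows. Everything here is illustrative: the data are synthetic, only two baseline predictors are included, `neg_tone` and `crash_next` are hypothetical column names, and the placement of 2014 in the training sample (`CUTOFF`) is an assumption of the sketch rather than a reading of the brief. ROC AUC is shown as one of several possible classification metrics.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n = 4000
df = pd.DataFrame(
    {
        "fyear": rng.integers(2005, 2020, size=n),  # years 2005..2019
        "mtb": rng.normal(size=n),
        "lev": rng.normal(size=n),
        "neg_tone": rng.normal(size=n),  # hypothetical text measure
    }
)
# Synthetic target driven by one baseline predictor and the text measure.
logit = (-1.5 + 0.5 * df["mtb"] + 1.0 * df["neg_tone"]).to_numpy()
df["crash_next"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

CUTOFF = 2014  # assumption: 2014 is treated as part of the training sample
train = df[df["fyear"] <= CUTOFF]
test = df[df["fyear"] > CUTOFF]


def fit_and_score(features):
    """Fit on the training years and return out-of-sample ROC AUC."""
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(train[features], train["crash_next"])
    prob = model.predict_proba(test[features])[:, 1]
    return roc_auc_score(test["crash_next"], prob)


auc_base = fit_and_score(["mtb", "lev"])
auc_text = fit_and_score(["mtb", "lev", "neg_tone"])
print(f"baseline AUC = {auc_base:.3f}, with text measure = {auc_text:.3f}")
```

Because crashes are likely to be a minority class, metrics such as precision, recall, and the confusion matrix at a chosen threshold may be worth reporting alongside AUC.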
4. Some traders believe that investors do not adequately respond to changes in negative sentiment within corporate annual reports. If this hypothesis holds, we would expect that firms with a year-on-year increase in negativity will underperform relative to other firms in terms of stock price over the next year. Please design and implement a test to evaluate this prediction. [25%]
Appendix A: Data Dictionary of Database “analytics_final.db”
Table: finstat
Column | Type | Definition
gvkey | Integer | Identifier of the firm
datadate | Date | Date of the fiscal year end
fyear | Integer | Fiscal year of the observation
sale | Float | Total sales revenue over the fiscal year
at | Float | Total assets at the fiscal year end
ceq | Float | Total equity at the fiscal year end
mktcap | Float | Stock market capitalisation (in millions of dollars) at the fiscal year end
mtb | Float | Stock market capitalisation divided by book value of equity at the fiscal year end
lev | Float | Total debt divided by book value of equity at the fiscal year end
roa | Float | Operating profit of the fiscal year divided by total assets at the fiscal year end
roe | Float | Net profit of the fiscal year divided by book value of equity at the fiscal year end
earnvol | Float | Standard deviation of net profits (scaled by end-of-year total assets) over the five years prior to the current fiscal year end
cfvol | Float | Standard deviation of operating cash flows (scaled by end-of-year total assets) over the five years prior to the current fiscal year end
salevol | Float | Standard deviation of sales (scaled by end-of-year total assets) over the five years prior to the current fiscal year end
accm | Float | Abnormal accruals measure of the fiscal year divided by total assets at the fiscal year end
ffi12_desc | Text | Description of the firm’s industry
Table: mktdat
Column | Type | Definition
gvkey | Integer | Identifier of the firm
fyear | Integer | Fiscal year of the observation
crash | Integer | An indicator variable that equals 1 if the company’s stock experiences at least one stock price crash during the current fiscal year, and 0 otherwise
stock_illiquidity | Float | Illiquidity of the firm’s stock
bid_ask_spread | Float | The spread between the best ask price and the best bid price of the firm’s stock
beta | Float | CAPM beta of the company’s shares
return_volatility | Float | Standard deviation of daily stock returns over the fiscal year
mkt_adj_return | Float | Total stock return of the company over the fiscal year, minus the stock return of the overall market index
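Since crash and mkt_adj_return are defined for the current fiscal year, any analysis of next-year outcomes needs those values aligned with the prior year's row for the same firm. One way to do this, sketched below on toy data, is to relabel each row's fyear as fyear - 1 and merge it back on gvkey and fyear; the column name `crash_next` is hypothetical.

```python
import pandas as pd

mkt = pd.DataFrame(
    {
        "gvkey": [1, 1, 1, 2, 2],
        "fyear": [2010, 2011, 2013, 2010, 2011],  # firm 1 has no 2012 row
        "crash": [0, 1, 0, 1, 0],
    }
)

# Relabel each observation with the previous fiscal year so that it
# lines up with that year's row after the merge.
nxt = mkt[["gvkey", "fyear", "crash"]].copy()
nxt["fyear"] = nxt["fyear"] - 1
nxt = nxt.rename(columns={"crash": "crash_next"})

# Left merge keeps every firm-year; rows with no following year get NaN.
panel = mkt.merge(nxt, on=["gvkey", "fyear"], how="left")
print(panel)
```

Merging on the shifted year, rather than shifting rows within each firm, avoids treating 2013 as the "next year" of 2011 when a firm has a gap in its records.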
Table: textdat
Column | Type | Definition
gvkey | Integer | Identifier of the firm
fyear | Integer | Fiscal year of the observation
num_words | Float | Total number of words in the annual report of the current fiscal year
num_neg_words | Float | Total number of negative words in the annual report of the current fiscal year
num_pos_words | Float | Total number of positive words in the annual report of the current fiscal year
file_size | Float | The file size of the annual report of the current fiscal year
Submission Requirement
1. You must submit an assignment report in PDF format via Turnitin. You are also required to submit a Jupyter Notebook file (.ipynb file) containing all your Python code through the separate submission link. No other forms of submission will be accepted.
2. The deadline for both submissions is 11:59am, 20 November 2024. Late submissions of either deliverable will attract a 10% penalty per day after the deadline. You are allowed to update your submissions before the deadline.
3. The assignment report should include responses to all the required tasks. Screenshots of key outputs of the analyses (e.g., tables, graphs) should be embedded in the report followed by clear explanations. In addition to responses to the required tasks, the report should also present details of the data cleaning procedure performed and justify the choices. Your code in the Jupyter Notebook file must be able to replicate all the findings referenced in your report.
4. The report should not exceed 8 pages in length, inclusive of everything. Use Times New Roman font, 1.5 line spacing, and 11pt font size. No cover page is required.