COURSEWORK BRIEF:
Method of Submission: Electronic via Blackboard Turnitin ONLY (Please ensure that your name does not appear on any part of your work)
Days Late | Mark
1 | (final agreed mark) * 0.9
2 | (final agreed mark) * 0.8
3 | (final agreed mark) * 0.7
4 | (final agreed mark) * 0.6
5 | (final agreed mark) * 0.5
More than 5 | 0
A. Knowledge and Understanding
A1. Understand the potential of CRISP-DM and data analytics, particularly in the retail lending sector.
A2. Demonstrate a critical understanding of different types of data analytics methods and the problems they can solve.
A3. Interpret the output of statistical techniques used for the main data analytics applications.
B. Subject Specific Intellectual and Research Skills
B1. Identify the statistical models appropriate for analysing the various decisions that confront a data analyst in different industries.
B2. Work with software to develop data analytics solutions, such as predictive scorecards, clustering models, and different types of regressions.
B3. Assess the relevance of statistical package outputs to the decisions being addressed.
C. Transferable and Generic Skills
C1. Critically analyse practical difficulties that arise when implementing retail credit risk models; understand the cross-fertilisation potential to other business contexts (e.g., fraud detection, marketing, CRM, etc.).
C2. Demonstrate an ability to use world-class software and to interpret its output in the relevant techniques.
C3. Manage time and tasks effectively in the context of individual study.
Question 1 (70 marks)
The dataset ‘Credit data.xlsx’ contains data on 10,000 borrowers and whether they subsequently experienced serious delinquency (see variable ‘SeriousDlqin2yrs’). Assume the lender now wishes to use this data to build a credit scoring model that predicts serious delinquency based on the other variables. The dataset contains the following variables:
Variable Name | Description
SeriousDlqin2yrs | Person experienced 90 days past due delinquency or worse
RevolvingUtilizationOfUnsecuredLines | Total balance on credit cards and personal lines of credit except real estate and no installment debt like car loans divided by the sum of credit limits
age | Age of borrower in years
NumberOfTime30-59DaysPastDueNotWorse | Number of times borrower has been 30-59 days past due but no worse in the last 2 years
DebtRatio | Monthly debt payments, alimony, and living costs divided by monthly gross income
MonthlyIncome | Monthly income
NumberOfOpenCreditLinesAndLoans | Number of open loans (installment like car loan or mortgage) and lines of credit (e.g. credit cards)
NumberOfTimes90DaysLate | Number of times borrower has been 90 days or more past due
NumberRealEstateLoansOrLines | Number of mortgage and real estate loans including home equity lines of credit (integer)
NumberOfTime60-89DaysPastDueNotWorse | Number of times borrower has been 60-89 days past due but no worse in the last 2 years
NumberOfDependents | Number of dependents in family excluding themselves (spouse, children etc.)
- exploratory data analysis
- missing value handling (if any)
- outlier detection and treatment (if any)
- categorisation of the continuous variables (if deemed useful)
- Weights of Evidence coding (note that some additional coarse classification might be needed).
- Splitting the data set into a training and test set.
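As an illustration of the categorisation and Weights of Evidence steps listed above, here is a minimal Python sketch. It is one possible approach, not a required method. The sign convention (natural log of the share of goods over the share of bads), the smoothing constant, the age cut-points, the toy data, and the names `age_band` and `weights_of_evidence` are all illustrative assumptions rather than part of this brief.

```python
import math
from collections import Counter


def age_band(age, cuts=(30, 45, 60)):
    """Coarse-classify an age into a band; the cut-points are hypothetical."""
    for upper in cuts:
        if age < upper:
            return f"<{upper}"
    return f">={cuts[-1]}"


def weights_of_evidence(bins, targets, smoothing=0.5):
    """Return {bin: WoE}, with WoE = ln(share of goods / share of bads).

    targets use 1 = bad (serious delinquency) and 0 = good.
    The smoothing constant (an assumption) avoids log(0) in sparse bins.
    """
    goods = Counter(b for b, y in zip(bins, targets) if y == 0)
    bads = Counter(b for b, y in zip(bins, targets) if y == 1)
    labels = set(goods) | set(bads)
    total_good = sum(goods.values()) + smoothing * len(labels)
    total_bad = sum(bads.values()) + smoothing * len(labels)
    woe = {}
    for label in labels:
        good_share = (goods[label] + smoothing) / total_good
        bad_share = (bads[label] + smoothing) / total_bad
        woe[label] = math.log(good_share / bad_share)
    return woe


# Toy data (made up for illustration, not taken from 'Credit data.xlsx').
ages = [22, 25, 28, 50, 52, 65]
flags = [1, 1, 0, 0, 0, 1]
bands = [age_band(a) for a in ages]
woe = weights_of_evidence(bands, flags)
for band in sorted(woe):
    print(band, round(woe[band], 3))
```

Under this convention, a positive value marks a band with relatively more good borrowers than bad ones. In practice the WoE values would normally be estimated on the training set only and then applied to the test set, so that the test set stays unseen.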
- The most important variables
- The impact of the variables on the target
- The performance of the model. Use various performance metrics and discuss their relationship if any.
- Result of scorecard.
- Compare this scorecard with the results of a Random Forest. Discuss your results.
- Why do banks typically use Logistic Regression as their base classifier? What do banks win and lose by doing this?
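On the relationship between performance metrics mentioned in the list above, the following Python sketch computes three metrics commonly used for scorecards: AUC, the Gini coefficient, and the Kolmogorov-Smirnov (KS) statistic. It is only an illustration on made-up scores. The function names and the toy data are assumptions, and a real analysis would more likely rely on an established library.

```python
def split_scores(scores, labels):
    """Separate predicted scores into bads (label 1) and goods (label 0)."""
    bad = [s for s, y in zip(scores, labels) if y == 1]
    good = [s for s, y in zip(scores, labels) if y == 0]
    return bad, good


def auc(scores, labels):
    """Probability that a random bad scores higher than a random good (ties count 0.5)."""
    bad, good = split_scores(scores, labels)
    wins = sum((b > g) + 0.5 * (b == g) for b in bad for g in good)
    return wins / (len(bad) * len(good))


def gini(scores, labels):
    """Gini coefficient, a linear rescaling of AUC: Gini = 2 * AUC - 1."""
    return 2 * auc(scores, labels) - 1


def ks_statistic(scores, labels):
    """Largest gap between the cumulative score distributions of goods and bads."""
    bad, good = split_scores(scores, labels)
    gaps = []
    for t in sorted(set(scores)):
        bad_cdf = sum(b <= t for b in bad) / len(bad)
        good_cdf = sum(g <= t for g in good) / len(good)
        gaps.append(abs(bad_cdf - good_cdf))
    return max(gaps)


# Toy predicted probabilities of delinquency (made up for illustration).
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]

auc_value = auc(scores, labels)
gini_value = gini(scores, labels)
ks_value = ks_statistic(scores, labels)
print(auc_value, gini_value, ks_value)
```

Because Gini is a linear function of AUC, the two always rank models identically. KS instead measures separation at a single best threshold, so it can disagree with them.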
Please carefully report the various steps of your methodology and discuss your results in a rigorous way!
NOTE: It is unlikely that different students will come up with the exact same parameter estimates. Special consideration will be given to submissions whose estimates are identical.
Question 2 (30 marks)
- Management Science
- Operations Research
- INFORMS Journal on Computing
- INFORMS Journal on Applied Analytics
- Journal of Machine Learning Research
- European Journal of Operational Research
- Production and Operations Management
- Manufacturing & Service Operations Management
- ICDM (The IEEE International Conference on Data Mining)
- NeurIPS (Conference on Neural Information Processing Systems)
- KDD (ACM SIGKDD Conference on Knowledge Discovery and Data Mining)
2.1 Once you have found an appropriate paper, report the following in separate subsections (15 marks):
- Title, authors, and complete citation (e.g., journal name, volume/issue, year, …)
- The data mining problem considered
- The data mining techniques used
- The results reported
- A critical discussion of the model and results (assumptions made, shortcomings, limitations, …)
Make sure you demonstrate that you understand what the article is all about and are able to provide a critical discussion.
Do not copy and paste from the article. Using Turnitin, this will be easily detected!
NOTE: The reviewed methodology should be different from methods applied in Question 1.
Nature of Assessment: This is a SUMMATIVE ASSESSMENT. See ‘Weighting’ section above for the percentage that this assignment counts towards your final module mark.
submission.
You should always include the word count (from Microsoft Word, not Turnitin) at the end of your coursework submission, before your list of references.
References: You should use the Harvard style to reference your assignment. The library provides guidance on how to reference in the Harvard style and this is available from: http://library.soton.ac.uk/sash/referencing
Submission Deadline: Please note that the submission deadline for Southampton Business School is 16.00 for ALL assessments.
It is important that you allow enough time prior to the submission deadline to ensure your submission is processed on time as all late submissions are subject to a late penalty. We would recommend you allow 30 minutes to upload your work and check the submission has been processed and is correct. Please make sure you submit to the correct assignment link.
Email submission receipts are not currently supported with Turnitin Feedback Studio LTI integrations; however, following a submission, students are presented with a banner within their assignment dashboard that provides a link to download a submission receipt. You can also access your assignment dashboard at any time to download a copy of the submission receipt using the receipt icon. It is vital that you make a note of your
The last submission prior to the deadline will be treated as the final submission and will be the copy that is assessed by the marker.
Important: If you have any problems during the submission process you should contact ServiceLine immediately by email at [email protected] or by phone on +44 (0)23 8059 5656.
Special Considerations: If you believe that illness or other circumstances have adversely affected your academic performance, information regarding the regulations governing Special Considerations can be accessed via the Governance and Policies landing pages: Regulations Governing Special Considerations (including Deadline Extension Requests) for all Taught Programmes and Taught Assessed Components of Research Degrees 2023-24 | University of Southampton
In 2023/24, the most common reasons for a breach of the regulations governing Academic Responsibility Conduct on your programme were:
Breach: Plagiarism – using the work, words, or ideas of another without proper acknowledgement. This includes citing work that you haven’t read.
How to avoid:
- Always cite your sources.
- Only cite what you have read and used.
- “Direct quotes must be in quotation marks” with a page number if applicable.
- If you read about the work of another in a source, say ‘cited in’ and cite where you read it (see here for more info).
Breach: Collusion – collaborating with others in an unauthorized way to produce academic work meant to be done independently.
How to avoid:
- Unless permitted in a group assignment, don’t work with/alongside others.
- Don’t share your work with others.
- Ensure you are clear on where the line is. If in doubt, don’t do it.
Breach: External authorship – obtaining or attempting to obtain unauthorized input from another person or service for academic work, e.g. GenAI.
How to avoid:
- Ensure you are clear on if you are permitted to use GenAI.
- Ensure your work is always your own.
- Never send your work to others or upload it to a website.
- Keep records of your work including notes, drafts, and reading.
Penalties for the above include mark reduction, resubmitting for a capped mark, or a ‘0’ for the module.
If you are in any doubt, please ask.
Further learning and advice can be found in the Academic Conduct & Responsibility Toolkit, and the Library Website.
Student Support: Study skills and language support for Southampton Business School students is available at: http://www.sbsaob.soton.ac.uk/study-skills-and-language-support/.