BSAN2205 MACHINE LEARNING FOR BUSINESS


Project Report and Presentation – Briefing Notes

Background

The course BSAN2205 Machine Learning for Business has three assessment items: a project plan, a project report and presentation, and a School-based take-home assessment (i.e., a “take-home” exam). These notes outline my expectations for the project report/presentation. Keep in mind that the project report is intended as an extension of the project plan. The plan proposed a possible direction for your project – sketching a way to address a binary classification problem. The project report restates key elements of the plan and context, documents the methods and results of your project, and highlights the implications for the client (“the bank”). Note that the project report should be prepared and submitted in PowerPoint format (whereas the project plan was in the form of a Word document). Further, you are asked to make a recording of a presentation (presenting your PowerPoint doc) and to submit your R script.

The report should be comprehensive – sketching out all aspects of the project from beginning to end. However, the overall aim of the report should be to provide “proof of concept” for your approach and thinking; that is, initial evidence to support your claim that something like the approach you are taking is capable of generating “business value” for the client. Draw on your proposal where it is appropriate to do so – you can use the text you wrote for the proposal (“self-plagiarism” in the project is OK, because the project is an extension of the plan). Make some revisions if you think you can improve upon the proposal. For our purposes, the proposal is not a binding document. You are very welcome to revise your proposal if you think it needs revision. Perhaps you should signal to the reader of your project report how your project differs from the proposal – if it does differ.

Importantly, you will need to find the right balance between the effort you put into the technical aspects of your project and the effort you put into communicating the value of your project – with the overall goal of having impact on the client business. This is perhaps the key decision you will have to make in the coming weeks. One way to reconcile the possible tension between the demands of your project and the reporting requirements is to style your project as a pilot project. Let me elaborate. I do not expect your report to provide a definitive answer to the business problem you are addressing (deepening customer engagement with a bank), but I do expect you to generate an indicative solution to the problem based on a thorough analysis of the available data. My expectation is that the report will provide indicative answers, show what is possible, and provide you with the experience to say with greater confidence what type of project would more definitively address the problem. What is the next analytics project the client should consider, what data is needed, and how would access to this data generate further insight into the problem at hand? Further, what other problems might the bank address (using a binary classifier)?

Key Sections

The project report will likely have the following sections.

1. Background

2. Business problem

3. Proposed solution/method of analysis

4. Data

5. Analysis plan

6. Results and interpretation

7. Strategy/recommendations

8. Conclusion/research directions

You might be able to think of a different structure – this is a guide only, but one that should be useful. Per the project plan, the background section introduces the project, placing emphasis on the broad context and highlighting the specific domain of interest – in this case, the value of models for binary classification problems (with a specific focus on customer engagement). The general expectation is some key business process and/or management decision will be improved through the use of analytics reported in your project report (e.g., the bank’s marketing efforts).

The next section of the report should restate the business problem you are attempting to address. What is the problem you are trying to solve? A starting point might be to specify an outcome the target business/client is trying to achieve (for example, “customer engagement”). Addressing the following questions may help. (1) What is the specific outcome the business is trying to influence or understand, or both? (2) How is it manifest or measured? (3) What level of that outcome variable has the business achieved to date (or what is the average level of customer engagement – as measured by the likelihood of opening a new account), and what is the desired level of the outcome variable? (4) What are the likely factors or other variables that are potentially related to the outcome of interest, and how are they measured?

The section on the proposed solution or “line of attack” should focus on the method(s) of analysis – the particular model form(s) suitable for the problem you have outlined/specified (logistic regression and decision trees). Describe the methods in some detail. Highlight the key steps and/or issues in their implementation. Per the briefing notes for the project proposal, the project work should place emphasis on logistic regression and decision tree models. The section should clearly identify and describe the key output and feature variables, including writing the model in formal fashion (perhaps supported by a path diagram). Describing the data may be a key part of your discussion of the proposed solution. In summary, this section should be specific about the proposed method of analysis and the data to be used.
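As one illustration of what “writing the model in formal fashion” might look like for the logistic regression, consider the sketch below. The notation is an assumption made for this example only: y_i is the binary output for customer i (1 if the customer responds to the campaign, 0 otherwise), x_{i1}, ..., x_{ik} are the k feature variables you choose, and the betas are the coefficients to be estimated.

```latex
\[
p_i = \Pr(y_i = 1 \mid x_{i1}, \dots, x_{ik}),
\qquad
\log\!\left(\frac{p_i}{1 - p_i}\right)
  = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_k x_{ik}
\]
\[
\text{equivalently,}\qquad
p_i = \frac{1}{1 + \exp\!\bigl(-(\beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik})\bigr)}
\]
```

Under this form, exp(beta_j) can be read as the multiplicative change in the odds of a response for a one-unit increase in feature j, holding the other features fixed. A decision tree has no comparable single equation, so a diagram or a list of splitting rules is probably the more natural formal description for that model form.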

The analysis plan is intended as a step-by-step guide to implementing the proposed approach. For example, a logistic regression analysis might involve several key steps, including a consideration of the appropriateness of the data for the planned application, the specification of the outcome and input variables, and specification of criteria that might be used to assess the fit of the model to the data (e.g., measures of classification accuracy).

This section may now be easier to (re)write than it was for the proposal – having now had the experience of analysing the data and developing the interpretations. Perhaps this section leads off with approaches to reporting and visualising the descriptive statistics that will be reported, before outlining your approach to predictive modelling. You may wish to put emphasis on bivariate rather than univariate statistics prior to presenting your modelling effort.

The results and interpretation may be in several parts. First, I have asked you to apply logistic regression and decision tree models. Thus, you should specify and estimate logistic regression and decision tree models and report the results of these analyses. You might choose to report on the logistic regression results first and subsequently report the decision tree model. For each model form (logistic regression and decision tree), you should provide information about overall model fit to the data (training and test sets) and information on classification accuracy. Further, you should provide an interpretation of the results – the coefficients in the case of the logistic regression model and the model diagram in the case of the decision tree. Which variables are most important, and how do you come to that conclusion?
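The estimation and reporting steps described above could be sketched in R along the following lines. This is a minimal illustration under stated assumptions, not a prescribed workflow: the data frame `bank`, its columns (`age`, `balance`, `contacts`, `response`), the 70/30 split and the 0.5 cut-off are all hypothetical, and the data are simulated so that the script runs on its own. Your own variable names and choices will differ.

```r
# Hypothetical, simulated stand-in for the bank marketing data.
library(rpart)  # recommended package shipped with standard R installations

set.seed(2205)
n <- 2000
bank <- data.frame(
  age      = round(runif(n, 18, 80)),
  balance  = round(rnorm(n, mean = 1500, sd = 1000)),
  contacts = rpois(n, lambda = 2)
)
# Simulated outcome: 1 = customer responded, 0 = did not (assumed coding)
linpred <- -2 + 0.0006 * bank$balance + 0.3 * bank$contacts
bank$response <- rbinom(n, size = 1, prob = plogis(linpred))

# Step 1: hold out a test set (70/30 is an arbitrary choice)
train_id <- sample(seq_len(n), size = 0.7 * n)
train <- bank[train_id, ]
test  <- bank[-train_id, ]

# Step 2: estimate both model forms on the training set
logit_fit <- glm(response ~ age + balance + contacts,
                 data = train, family = binomial)
tree_fit  <- rpart(factor(response) ~ age + balance + contacts,
                   data = train, method = "class")

# Step 3: classification accuracy at a 0.5 cut-off, training vs test
accuracy <- function(actual, prob, cutoff = 0.5) {
  mean((prob > cutoff) == (actual == 1))
}
prob_logit <- function(d) predict(logit_fit, newdata = d, type = "response")
prob_tree  <- function(d) predict(tree_fit, newdata = d, type = "prob")[, "1"]

acc <- rbind(
  logit = c(train = accuracy(train$response, prob_logit(train)),
            test  = accuracy(test$response,  prob_logit(test))),
  tree  = c(train = accuracy(train$response, prob_tree(train)),
            test  = accuracy(test$response,  prob_tree(test)))
)
print(round(acc, 3))

# Step 4: aids to interpretation
print(summary(logit_fit)$coefficients)   # estimates, standard errors, p-values
print(round(exp(coef(logit_fit)), 4))    # coefficients expressed as odds ratios
print(tree_fit$variable.importance)      # NULL if the tree makes no splits
```

A large gap between training and test accuracy would suggest overfitting, which is one reason to report both. For the tree diagram itself, `plot(tree_fit)` followed by `text(tree_fit)` is one base option, although it raises an error if the fitted tree has no splits.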

Second, your results and interpretations section might include a comparison of the model forms – logistic regression and decision trees. This may require you to think about what metrics you can use to directly compare the two model forms. The model forms are comparable because they have the same output variable (customers’ responses to the bank marketing campaign). Further, you might start to develop ideas about why one model form might be preferred to the other. Can you specify the conditions under which one model might be expected to outperform the other, and therefore provide some guidance to the bank in terms of which model form to use in the future – or for any one application?
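One possible way to put the two model forms on a common footing is to score the same test set with both and compute the same metrics from the predicted probabilities. The sketch below is an assumption-laden example rather than a required method: the data are simulated, the variable names are hypothetical, and the choice of metrics (accuracy, sensitivity, specificity and a rank-based AUC) is only one of several reasonable options.

```r
# Hypothetical, simulated data; replace with your own training and test sets.
library(rpart)

set.seed(2205)
n <- 2000
bank <- data.frame(balance = rnorm(n, 1500, 1000), contacts = rpois(n, 2))
bank$response <- rbinom(n, 1, plogis(-2 + 0.0006 * bank$balance +
                                       0.3 * bank$contacts))
train_id <- sample(seq_len(n), 0.7 * n)
train <- bank[train_id, ]
test  <- bank[-train_id, ]

logit_fit <- glm(response ~ balance + contacts, data = train, family = binomial)
tree_fit  <- rpart(factor(response) ~ balance + contacts,
                   data = train, method = "class")

# Predicted probability of a response from each model, same test set
p_logit <- predict(logit_fit, newdata = test, type = "response")
p_tree  <- predict(tree_fit,  newdata = test, type = "prob")[, "1"]

# Area under the ROC curve via the rank (Mann-Whitney) formula;
# rank() averages ties, which is what this formula requires
auc <- function(actual, prob) {
  r  <- rank(prob)
  n1 <- sum(actual == 1)
  n0 <- sum(actual == 0)
  (sum(r[actual == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

metrics <- function(actual, prob, cutoff = 0.5) {
  pred <- as.integer(prob > cutoff)
  c(accuracy    = mean(pred == actual),
    sensitivity = sum(pred == 1 & actual == 1) / sum(actual == 1),
    specificity = sum(pred == 0 & actual == 0) / sum(actual == 0),
    auc         = auc(actual, prob))
}

comparison <- rbind(logit = metrics(test$response, p_logit),
                    tree  = metrics(test$response, p_tree))
print(round(comparison, 3))
```

Accuracy, sensitivity and specificity depend on the cut-off, whereas AUC summarises ranking quality across all cut-offs, so the two models may be ordered differently depending on which metric matters most to the bank.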

Third, you may consider the use of ensemble methods. This is not a strict requirement, but you might for example consider the estimation of a bagged tree or random forest (as an extension of your decision tree model). Tuning a tree and estimating a bagged tree are probably things that can be done relatively quickly. By contrast, estimating a random forest may take much more computing time (e.g., 1-3 days of computing time on a standard Dell laptop compared to 10-30 minutes for tuning a tree and/or estimating a bagged tree). You might consider estimating a random forest if you first have success with a bagged tree. But please think about the costs/benefits of this further analysis – a random forest – if you do decide to include it in your report.
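To make the idea of a bagged tree concrete, the sketch below bags classification trees by hand: it fits one tree per bootstrap sample of the training data and averages the predicted probabilities. It is an illustration under assumptions, not the approach used in the course materials. The data are simulated, the variable names are hypothetical, and the number of trees (50) and the tree settings are arbitrary. Dedicated packages offer ready-made bagging functions, but a manual loop shows what the method actually does.

```r
# Hypothetical, simulated data; replace with your own training and test sets.
library(rpart)

set.seed(2205)
n <- 2000
bank <- data.frame(balance = rnorm(n, 1500, 1000), contacts = rpois(n, 2))
bank$response <- factor(rbinom(n, 1, plogis(-2 + 0.0006 * bank$balance +
                                              0.3 * bank$contacts)))
train_id <- sample(seq_len(n), 0.7 * n)
train <- bank[train_id, ]
test  <- bank[-train_id, ]

n_trees <- 50  # arbitrary; more trees cost more computing time

# system.time() records how long the bagging takes on your machine
elapsed <- system.time({
  bag_probs <- sapply(seq_len(n_trees), function(b) {
    # Bootstrap sample: draw rows with replacement from the training set
    boot <- train[sample(nrow(train), replace = TRUE), ]
    # Grow a deep tree (cp = 0 switches off cost-complexity pruning)
    fit <- rpart(response ~ balance + contacts, data = boot, method = "class",
                 control = rpart.control(cp = 0, minsplit = 20))
    predict(fit, newdata = test, type = "prob")[, "1"]
  })
})

# Average the per-tree probabilities, then classify at a 0.5 cut-off
p_bag <- rowMeans(bag_probs)
bag_accuracy <- mean((p_bag > 0.5) == (test$response == "1"))

# Single default tree on the same split, for reference
single <- rpart(response ~ balance + contacts, data = train, method = "class")
p_single <- predict(single, newdata = test, type = "prob")[, "1"]
single_accuracy <- mean((p_single > 0.5) == (test$response == "1"))

print(elapsed)
print(c(bagged = bag_accuracy, single = single_accuracy))
```

Timing a small run like this before scaling up is one practical way to weigh the costs and benefits of a larger ensemble such as a random forest.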

Perhaps the key section of the report is the strategy and recommendations section. In effect, you present the implications of your project in this section. Think about the value of your results from the perspective of the client (the bank). What insights have you generated that might be important for the client to know? Can you think of four or five key points?

Given your results, are there things the client should do more of, or things the client should do less of or stop doing? Where should the client start in actioning your results? What is the single most important thing the client should do? What should the client do first, second, and so on? As you start to address these questions, you might find that your insights become the precursor for a set of recommendations. Also, think about some of the potential costs associated with your recommendations. What might be more or less feasible for the client?

Finally, I would like you to include comment on further research in your project reports.

Think about what should be done next – what is the next project, what does it look like?

One way to think about this question is to think about what your pilot project could have achieved if you had more time, more data, and more resources. What would you have done differently if you had had more time and resources? Another way to address the question of what further research should be done is to focus on one or more very specific findings from your project. These might be unusual or unexpected findings, or findings that may have great significance for the client but were only addressed in part in your project. A good project finishes by outlining the next project a business analyst might address. Think broadly – are there other applications of binary classifiers the bank should consider?

Submission Guidelines

The report and presentation is worth 50 percent of your score in the course. I expect your PowerPoint doc to be in the range of 20-30 slides – you may include further slides in the form of an appendix. You can start your PowerPoint doc now – extract the key information from your proposal you wish to bring forward into the report, or rewrite or write these early sections including the business problem, proposed solution, method of analysis, data, analysis plan. Think about how best to present your results. Recall that visual and statistical information might be reported – supported by short written descriptions of the key results.

Further, you are asked to record yourself presenting your report and to submit the presentation file. You might use Zoom to make the recording (in effect, a voice-over of your PowerPoint report), or you can embed the audio within the PowerPoint file itself. The presentation should be approximately ten minutes in length and highlight the key aspects of your report, including the motivation and approach, and the key implications for the client. Finally, I am asking for your R script (the .R file or files). Only submit your R script file(s), not your Rdata file(s).

You can submit your work via Blackboard, and you can do so in one of two ways. I am specifically asking you to submit your report and presentation, and your R script.

One approach to submitting is to submit these three files to their respective Blackboard links.

• The .pptx file (the PowerPoint file)

• The .mp4 file (the Zoom recording of your presentation)

• The .R file (the R script)

A second approach to submitting is to submit your PowerPoint file and your R script – use this approach if you decide to embed your voice-over audio in your PowerPoint file.

Please try using the links; of course, you can always email me if you have trouble with them.

Assessment Criteria

In scoring the reports, I will be looking for three things. First, I want to see that you have carefully attended to each aspect of the project. This includes seeing the link between the business problem and data; thinking about the model specification(s); using appropriate model forms to analyse the data and using these model forms appropriately; including reporting and interpretations; and seeing the link between the results of your analyses and the business strategy. Second, I will be looking at the logical flow of your reports – the links between the sections are important (in addition to the content of each section, per the first point above). Third, I would like you to see the broader implications of your project for the client. This might include a recognition that the bank confronts other business challenges, which the methods of binary classification can usefully address. I will separately provide you with a specific set of marking criteria.

