MET AD 699 Data Mining for Business Analytics

Instructor: Greg Page, MBA, Ed.M.
Classroom: 1010-101
Meeting Time: Thursdays, 12:30 p.m.-3:15 p.m.
Teaching Assistant: Zhenyan Yin ([email protected])
Recitation Times: TBD
Office Hours: Office hours are casual, unstructured, and unrecorded. No appointment is needed -- you may drop in at any time during these sessions.

Mondays (Virtual Only): 2:30 p.m. - 4:00 p.m. (Zoom Room 982-294-4491)

Fridays: 3:00 p.m. - 4:30 p.m. (1010-404, in-person)

Other Times/Dates: By Appointment

Office hours will start after our 2nd class session.

Course Description

Enterprises, organizations, and individuals create, collect, and use massive amounts of structured and unstructured data in order to convert information into knowledge, to improve the quality and efficiency of their decision-making, and to better position themselves in a highly competitive marketplace. Data mining is the process of finding, extracting, visualizing, and reporting useful information and insights from both small and large datasets with the help of sophisticated data analysis methods. It is part of business analytics, which refers to the process of leveraging different forms of analytical techniques to achieve desired business outcomes by requiring business relevancy, actionable insight, performance management, and value management. Students in this course will study the fundamental principles and techniques of data mining. They will learn how to apply advanced models and software applications for data mining. Finally, students will learn how to examine the overall business process of an organization or a project with the goal of understanding (i) the business context in which hidden internal and external value is to be identified and captured, and (ii) exactly what the selected data mining method does. [4 credits]
Prerequisites: AD571, edX-based pre-analytics laboratory ADR100 (in some cases, AD699 may be taken concurrently with AD571)

Course Content, Learning Objectives and Outcomes

This course enables students to develop experience in the following areas:
  1. Theoretical and practical understanding of core data mining concepts, techniques, and business applications;
  2. Systematic approach to framing and solving business analytics problems with the help of data mining methods and techniques;
  3. Ability to identify the right data mining tools and techniques for various business analytics problems;
  4. Hands-on experience in using the most popular business analytics and data mining tools and preparation for applying for job positions where familiarity with those tools is required.

Course Norms

Throughout our journey together in AD699, I will ask you to adhere to the following set of course norms:
  • Assume the best: Some things in this course will be simple; others will be difficult. Rest assured that nothing is intentionally designed to trick you. I’m not perfect and I can make mistakes in haste. If I post a file that says “Assignment #2 prompt” and it’s actually my grocery shopping list, then please assume that I have made a careless error, and not that it’s part of some elaborate scheme to trick you.
  • Monday Blackboard Announcements: Every Monday, I’ll make an announcement in Blackboard that will include a few bullet points that mention the topics that we’ll explore that week, along with upcoming due dates.
  • We’ll Always Have a Break in the Middle: Our class period is long. It will never consist solely of me lecturing from the front of the room. Also, we will always take a break of approximately 10 minutes, typically about an hour after we start.
  • We’ll Always Start on Time. Slides will always be posted prior to start time: Class will always begin at exactly the official start time. Slides for each class will be posted prior to the start, so that you can follow along on your laptop if you wish to.
  • <24-hour turnaround on e-mails, <7 days on homeworks: I will never go more than 24 hours without checking my BU e-mail. I will respond to all student e-mails in less than 24 hours. I will sometimes respond much faster than this, but please allow up to 24 hours. All submitted homeworks will be graded with comments within 7 days of the submission deadline (note: that’s not the timestamp of when you submitted your work, but the due date/time for the assignment). I will sometimes grade homeworks in batches of 2 or 3, so if your friend’s homework was graded yesterday but yours still isn’t, there is no need to panic.
  • Effort matters: I realize that for many students, AD699 requires some steps outside of the “comfort zone.” Students who maintain a positive attitude and who put forth a strong effort tend to do very well in this course, regardless of what their knowledge level was on Day 1.
  • Two things that might not be your fault, but shouldn’t be showstoppers -- attendance & the book: Life happens. Between job interviews, illnesses, family events, etc. you might miss a class. If you do, you can review the material by checking the slides on Blackboard. You can seek me out for extra help with assignments. “I missed that class” is not a valid excuse to simply not complete an assignment. The same applies to the book. Owning a copy of the textbook is a course requirement. How you fulfill that requirement is up to you. However, “I don’t have the book” is a weak excuse for not being able to complete an assignment, especially because the homework assignments are not directly based on the book material.
  • No AD699 solution will rely on domain expertise: We will use many different types of datasets in AD699, including material related to sports, finance, entertainment, and other topics. The datasets are used to illustrate important concepts. You will not need to possess arcane knowledge about sports statistics, finance, real estate, or any other topic to complete an assignment in this course.
  • If you see a msitake? Let me know. Remember rule #1. Every iteration of AD699 includes material that has never been included previously.
  • If it ain’t raining? We ain’t training! Things can always go wrong, and sometimes when you least expect them to. My computer could decide on a forced Windows update 2 minutes prior to class starting. The projector bulb in the room could burn out just before class. Some of the power outlets in the room might not work. If/when any of these things occur? We won’t miss a beat -- we can always adjust, adapt, and overcome.

Course Materials

REQUIRED TEXT

Galit Shmueli et al.: Data Mining for Business Analytics: Concepts, Techniques, and Applications in R, Wiley 2018. Hardback ISBN: 978-1-118-87936-8; e-Book ISBN: 978-1-118-87933-7. Whether you buy a physical copy or a digital copy is completely up to you.
Wickham, Hadley and Garrett Grolemund: R for Data Science, O’Reilly January 2017. Free download -- http://r4ds.had.co.nz/
We will use the R for Data Science text for some in-class coding exercises. If you prefer to learn from a physical book, I recommend that you buy a copy. However, the free online version is identical to the paperback version.

AD699 Video Library

Several videos related to data mining, machine learning, and the R language are available on the class Blackboard page. Most of the videos can be considered optional -- they are there as a learning aid that you can feel free to use at your discretion. Prior to quizzes, I will specify which videos are considered “inside the scope” of the quiz.

SOFTWARE

R, version 4.1.0 (or any other version)
RStudio

VIRTUAL LABORATORIES

For directions to get free remote access to our BU MET Virtual Labs, please visit
http://www.bu.edu/metit/pc-labs/virtual-labs/

Grading Structure

Grading Structure and Distribution

Your performance in the course will be graded in the following areas:
Attendance, Participation, and Professionalism: 10%
Quizzes (3 total): 50%
Individual Assignments (5 total): 20%
Group Project (Written Submission): 15%
Final Presentation: 5%
Additional details for each grading component are provided below:
Assignments: Assignments will be graded based on a combination of accuracy of the analysis and quality of the report, with most of the weight being placed on the student’s ability to properly interpret the results. More specific information for the format and the contents of the assignments is available on the course Blackboard page.
Quizzes: Each of the three quizzes will consist of 15 questions. The quizzes will be completed in class, during a 60-minute block of time. Quizzes will be open-note and open book.
Attendance, Participation, and Professionalism: This is a 10-point grading component. More will be said about this in class. There is very little dispersion in this category. No student will receive less than an 8 in this category without first being notified by the instructor.
Group Project (Written Submission): The team project will enable students to apply many of the data science tools and techniques covered in the course. Students will work in teams on this project, which will involve a real-world dataset. More information about this project will be made available on Blackboard.
Final Presentation: Each team will deliver a 15-minute presentation during our last class session. More specific information for the format and the content related to this presentation can be found in the Project folder on the course Blackboard site. This folder will be made available several weeks after the start of the semester.
The overall grading distribution for the course will lead to a class average of approximately 3.4. No student who regularly attends class, completes all assignments, takes all quizzes, and participates in the group project will earn any grade that could jeopardize his/her standing at BU MET. More will be said about this grading policy during class.
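
As a rough illustration of how the component weights listed in this section combine into one overall score, here is a minimal Python sketch. It assumes each component is first expressed as a percentage from 0 to 100, which this syllabus does not specify. The function name and the example scores are hypothetical, and the sketch says nothing about how a numeric total maps to a final letter grade.

```python
# Component weights copied from the grading distribution in this syllabus.
WEIGHTS = {
    "attendance_participation_professionalism": 0.10,
    "quizzes": 0.50,
    "individual_assignments": 0.20,
    "group_project_written": 0.15,
    "final_presentation": 0.05,
}


def weighted_course_score(component_scores):
    """Combine per-component percentages (0-100) into one weighted score.

    Assumption: every component has already been converted to a percentage.
    """
    missing = set(WEIGHTS) - set(component_scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * component_scores[name] for name in WEIGHTS)


# Hypothetical scores for one student, for illustration only.
example_scores = {
    "attendance_participation_professionalism": 90,
    "quizzes": 84,
    "individual_assignments": 92,
    "group_project_written": 88,
    "final_presentation": 95,
}

overall = weighted_course_score(example_scores)
print(round(overall, 2))
```

Because the quizzes carry half of the total weight, a change of a few points in the quiz average moves the overall score more than the same change in any other component.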

Submission Format

Assignments may be submitted in any format that clearly displays the process that the student used, the answers found, and the interpretation statements for the questions that ask for explanation. Students may submit assignments using R Markdown, but this is not a requirement. More will be said about assignment format during class.

Timely Presentation of Materials Due

All work requested by the instructor (quizzes, assignments, contributions to team work, etc.) has a due date. Students often view a due date as the date on which to turn in an assignment; we view it as the last date on which the work may be turned in. It is therefore a good idea to set a personal completion target ahead of each due date to avoid difficulties. With this caution, please note that we are not inclined to accept late work; if late work is accepted, it will be only after considerable weighing of the rationale, and with a penalty.

Academic Integrity

Students are expected to adhere to the highest standards of honesty and integrity for this course. University policy on academic integrity will be followed to the fullest. Students are encouraged to review the university policy on academic integrity including a detailed listing of activities warranting sanction. Anyone who fails to adhere to these requirements and/or otherwise engages in unethical behavior (including cheating on exams, false representation of self or one’s work efforts, use of unauthorized aids, etc.) will be referred to university administration for further action. In particular, the university's policy and consequences regarding plagiarism are clearly described in the official Boston University documents, and will be enforced without any compromises.

Request for Accommodations

If you have a disability and will be requesting accommodations for this course, please inform the instructor early in the semester. Advance notice and appropriate documentation are required for accommodations.

Satisfaction of Department-Wide Goals

# | Goals | Category | Compliance
1 | Critical and innovative thinking | Substantial | With the help of the assignments and individual exercises, students are expected to learn and choose the appropriate data mining model for problem solving and decision making.
2 | International perspective | Some | The examples discussed in some data mining approaches and modules are applicable to both national and international organizations.
3 | Communication skills | Substantial | Students are expected to participate in weekly group discussions, which support the development of communication skills.
4 | Decision making | Substantial | Quantitative decision making is emphasized throughout the course.
5 | Technical tools & techniques | Substantial | The course introduces a variety of tools and techniques including MS Excel based Frontline Analytic Solver Platform and R One.
6 | Research skills & scholarship | Substantial | The course asks students to complete several assignments. In each assignment, students are asked to construct data mining models and apply decision support tools.
7 | Professional ethics & standards | Substantial | The importance of professional ethics and standards is emphasized throughout the weekly discussions.
8 | Creative & effective leaders | Substantial | Understanding data mining and other business analytics models and using them for decision-making is critical for becoming creative and effective leaders.

Course Outline

Class Date | Lectures & Topics | Readings (from Shmueli text)
02SEP | Topic 1: Course Intro; Identify Opportunities & Collect Data; Data Exploration in R | Ch. 1, 2
09SEP | Topic 2: Data Visualization, Part I | Ch. 3
16SEP | Topic 3: Data Visualization, Part II | Ch. 3
23SEP | Topic 4: Simple Linear Regression/Model Evaluation | Ch. 5
30SEP | Topic 5: Multiple Linear Regression | Ch. 6
07OCT | Topic 6: k-nearest neighbors, measuring distance between records | Ch. 7
14OCT | Topic 7: Naive Bayes | Ch. 8
21OCT | Topic 8: Classification Trees | Ch. 9
28OCT | Topic 9: Association Rules | Ch. 14
04NOV | Topic 10: Clustering | Ch. 15
11NOV | Topic 11: Text Mining | Ch. 20
18NOV | Topic 12: Social Network Analysis | Ch. 19
25NOV | Thanksgiving (No Class) |
02DEC | Topic 13: Model Deployment & Next Steps |
09DEC | Topic 14: Semester Presentations & Lessons Learned. Team Project Write-Ups Submitted by 11:59 p.m. on Monday, 08DEC | N/A

Individual Assignment Due Dates:

Assignment #1: Due by 11:59 p.m., Monday, 20SEP
Assignment #2: Due by 11:59 p.m., Monday, 06OCT
Assignment #3: Due by 11:59 p.m., Monday, 20OCT
Assignment #4: Due by 11:59 p.m., Monday, 08NOV
Assignment #5: Due by 11:59 p.m., Monday, 22NOV

Quiz Dates:

Quiz #1: Thursday, 07OCT
Quiz #2: Thursday, 04NOV
Quiz #3: Thursday, 02DEC

All quizzes are open-note/open-book.
