CS 484: Data Mining

Department of Computer Science

CS 484: Data Mining (Sections 001 and 002)

Spring 2024

Course Description

Concepts and techniques in data mining and multidisciplinary applications. Topics include data cleaning and transformation; classification and predictive modeling; clustering; association analysis; performance analysis and scalability; data mining in advanced database systems, including text, audio, and images; and emerging themes and future challenges. Students will gain hands-on experience and learn how to implement and apply various data mining algorithms.

Class Time and Location

Section 001: Tuesday/Thursday 1:30-2:45pm

Exploratory Hall L003

Section 002: Tuesday/Thursday 3:00-4:15pm

Horizon Hall 2016

Instructor

Dr. Jessica Lin

Email: jessica [AT] gmu [DOT] edu

Office Hours: Tuesday/Thursday 11am-12pm

Teaching Assistant

Madhukar Vongala

Prerequisites

 Formally: Grade of C or better in CS 310 (Data Structures) and STAT 344 (Probability and Statistics) or equivalent.

 More specifically: Programming experience in Python, or a willingness to learn it. Experience in Java or C++ will work as well, but the assignments will use Python. Students should be familiar with basic probability and statistics concepts and with linear algebra. Expect a lot of programming in the assignments.

Grading

Programming Assignments: 45%

Quizzes: 20%

Final Exam: 30%

Class participation/Activities: 5%

Extra credit: 1% added to the final grade for homework competition winners

Assignments

There will be 4 competition-style programming assignments in Python. Competition winners will get 1% extra credit added to the final grade. You are allowed a grace period of 3 days past the deadline, with a 10% penalty for each day. You will receive no credit if the homework is not submitted by the end of the grace period. Note that internet trouble is not a valid excuse for submitting late, so plan to submit a few hours early to avoid last-minute technical difficulties.
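The late policy above can be sketched in Python as follows. This is only an illustration, not the official grading script: the function name is hypothetical, and it assumes that any part of a day counts as a full late day and that the 10% penalty is taken off the earned score. The syllabus does not spell out either detail, so check with the instructor if it matters.

```python
import math

GRACE_DAYS = 3          # grace period stated in the policy above
PENALTY_PER_DAY = 0.10  # penalty per late day stated in the policy above


def late_adjusted_score(raw_score: float, hours_late: float) -> float:
    """Return the assignment score after applying the late penalty.

    Assumptions (not stated in the syllabus): any part of a day counts
    as a full late day, and the penalty is a fraction of the earned score.
    """
    if hours_late <= 0:
        return raw_score  # submitted on time: no penalty
    days_late = math.ceil(hours_late / 24)
    if days_late > GRACE_DAYS:
        return 0.0  # past the grace period: no credit
    return raw_score * (1 - PENALTY_PER_DAY * days_late)
```

Under these assumptions, a submission that is 25 hours late counts as 2 days late and keeps 80% of its earned score.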

Exams

There will be quizzes throughout the semester covering lectures and readings, and one final exam. The quizzes are meant to help you keep up with the lecture material, so they are typically shorter and easier than the final exam. The final exam is comprehensive. All exams are closed-book, and they must be taken at the scheduled time unless prior arrangements have been made with the instructor. Missed exams cannot be made up. The lowest quiz grade will be dropped.
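To see how the grading weights combine with the drop-the-lowest-quiz rule, here is a rough Python sketch. It is an estimate only: the function name is hypothetical, and it assumes every score is on a 0-100 scale, items within a category count equally, and each competition win adds one percentage point to the final grade.

```python
# Category weights from the Grading section.
WEIGHTS = {
    "assignments": 0.45,
    "quizzes": 0.20,
    "final_exam": 0.30,
    "participation": 0.05,
}


def estimate_course_grade(assignments, quizzes, final_exam,
                          participation, competition_wins=0):
    """Estimate the final course grade on a 0-100 scale.

    Assumptions (not stated in the syllabus): all scores are 0-100,
    items within a category are weighted equally, and each competition
    win adds one percentage point.
    """
    # Drop the lowest quiz, as long as at least one quiz remains.
    kept = sorted(quizzes)[1:] if len(quizzes) > 1 else list(quizzes)
    assignment_avg = sum(assignments) / len(assignments)
    quiz_avg = sum(kept) / len(kept)
    grade = (WEIGHTS["assignments"] * assignment_avg
             + WEIGHTS["quizzes"] * quiz_avg
             + WEIGHTS["final_exam"] * final_exam
             + WEIGHTS["participation"] * participation)
    return grade + 1.0 * competition_wins
```

For example, calling `estimate_course_grade([80, 90, 100, 90], [50, 80, 100], 80, 100)` drops the quiz score of 50 before averaging the remaining two.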

Class Participation

You will be able to earn class participation credit through in-class activities.

Textbooks

Required: Introduction to Data Mining by Pang-Ning Tan, Michael Steinbach, and Vipin Kumar (click on the link for the companion website)

Topics

· Ch.1: Introduction

· Ch.2: Data

· Ch.3: Classification

· Ch.4: Classification: Alternative Techniques

· Ch.5: Association Analysis: Basic Concepts and Algorithms

· Ch.6: Association Analysis: Advanced Concepts

· Ch.7: Cluster Analysis: Basic Concepts and Algorithms

· Ch.8: Cluster Analysis: Additional Issues and Algorithms

· Ch.9: Anomaly Detection

· Recommendation Systems

Honor Code Statement

The GMU Honor Code is in effect at all times. In addition, the CS Department has further honor code policies regarding programming projects, which are detailed here. Some examples can be found here. Any deviation from the GMU or CS Department Honor Code is considered an Honor Code violation. All assignments for this class are individual unless otherwise specified. ChatGPT and other generative AI models may NOT be used to assist with the assignments in this course.

Learning Disability Accommodation

If you have a documented learning disability or other condition that may affect academic performance, make sure this documentation is on file with the Office of Disability Services, and then discuss accommodations with the professor.

