STA4634/5635 Applied Machine Learning


Course Information 

Class Meeting Place: HCB 313

Class Meeting Time: MW 3:35-4:50pm 

Instructor: Dr. Adrian Barbu 

E-mail: [email protected] 

Office: 106C OSB 

Phone: 850-290-5202 

Office Hours: Tuesday 3:00-5:00pm or by appointment 

Teaching Assistant: Lizhe Sun 

E-mail: [email protected] 

Office: 204 OSB 

Office Hours: Monday 10:00am-12:00pm 

Textbooks (optional): 

1. The Elements of Statistical Learning by T. Hastie, R. Tibshirani, and J. H. Friedman (publisher: Springer) http://www.stanford.edu/~hastie/ElemStatLearn/printings/ESLII_print12.pdf 

2. Pattern Recognition and Machine Learning by Christopher M. Bishop (publisher: Springer) 

3. Machine Learning by Tom M. Mitchell (publisher: McGraw-Hill) 

All textbooks are optional since the course will not follow any particular book. 

Course Objectives: At the end of the course, the student will: 

– be able to understand many machine learning methods with their advantages and disadvantages 
– be able to implement the methods or know where to obtain them from 
– be able to use existing library software 
– have a working knowledge of most of the methods 
– be able to determine the most appropriate learning method for a specific application 

Course topics: This course is an overview of statistical methods for supervised, unsupervised and weakly supervised learning. The following topics will be covered: 

• Decision Trees, Random Forests 

• Naive Bayes Classifiers 

• Linear and Logistic Regression 

• Generative and Discriminative Learning 

• Learning with regularized loss functions 

• Neural Networks 

• Large Margin Classifiers: Support Vector Machines, Kernel Methods 

• Boosting: AdaBoost, LogitBoost, RealBoost, GentleBoost 

• Feature Selection with Annealing 

• Efficient Inference: Marginal Space Learning 

• Learning Issues: Overfitting, Bias-variance tradeoff 

• Learning Theory: PAC learning, VC Dimension 

• Graphical Models, Hidden Markov Models, Conditional Random Fields, Belief Propagation 

• Semi-supervised Learning 

• Unsupervised Dimensionality Reduction: PCA, Factor Analysis, ICA 

• Supervised dimensionality reduction: Feature Selection, Fisher LDA, Hidden layers in NN 

• Nonlinear Dimensionality Reduction: Kernel PCA, Multi-dimensional scaling (MDS), Isometric mapping (ISOMAP), Local linear embedding (LLE) 

• Maximum Entropy models: FRAME 

• Using Incomplete Data: MLE and EM 

• Unsupervised learning: K-means, EM, Spectral clustering, Self Organizing Maps 

• Reinforcement Learning 

• Metric Learning 

For each method, examples from different fields such as Natural Language Processing, Bioinformatics, Computer Vision, and Medical Imaging will be presented. Some of the most important methods will be accompanied by small projects for a better understanding of their advantages and limitations. 

Projects (capped at maximum 90 points total):



Project                             Needs Programming   Points   Due
 1  Decision Trees                  Yes                 10       09/05
 2  Random Forest                   Yes                 10       09/12
 3  Logistic Regression             Yes                 10       09/19
 4  TISP                            Yes                 10       09/26
 5  Weka                            No                  15       10/10
 6  FSA regression and binary clf   Yes                 10       10/17
 7  FSA multi-class                 Yes                 15       10/24
 8  Boosting                        Yes                 10       10/31
 9  Neural Nets/CNN                 Yes                 10       11/07
10  HMM                             Yes                 12       11/21
11  Clustering                      Yes                 10       11/28
12  PCA                             Yes                 10       12/05

Grading: There will be 12 homework projects shown above worth at most 90 points, and random quizzes that are worth another 10 points for a total of 100 points. 

• For most projects students can choose which datasets to show results on, obtaining the specified points for that project. The 11 datasets that can be used in some of the projects are worth points depending on their difficulty, as shown below. 

• The projects are worth at most 90 points. Students can choose which projects to work on to reach 90 points. If they obtain more than 90 points for the projects, only 90 points will be counted towards the final grade. 
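As an illustration only, the capping rule above can be written as a short Python sketch. The function name, the constants, and the example student are hypothetical; the syllabus states only that projects count for at most 90 points and quizzes for another 10.

```python
PROJECT_CAP = 90  # projects count for at most 90 points
QUIZ_MAX = 10     # quizzes count for at most 10 points


def final_score(project_points, quiz_points):
    """Return the course total out of 100 under the caps stated above."""
    capped_projects = min(sum(project_points), PROJECT_CAP)
    capped_quizzes = min(quiz_points, QUIZ_MAX)
    return capped_projects + capped_quizzes


# Hypothetical student: points earned on nine projects, plus 8 quiz points.
earned = [10, 10, 10, 10, 15, 10, 15, 10, 10]  # sums to 100, counted as 90
print(final_score(earned, 8))  # 98
```

A student who earns more than 90 project points gains nothing extra from the surplus, so the cap mainly gives freedom in choosing which projects to attempt.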

Information on the datasets and their training and testing sets 

Dataset       Type              Observations   Features   Train                       Test           Points(d)
Arcene        Binary clf        100+100        10000      train                       valid          2
Dexter        Binary clf        300+300        20000      train                       valid          2
Dorothea      Binary clf        800+350        100000     train                       valid          2
Gisette       Binary clf        6000+1000      5000       train                       valid          2
Hill-valley   Binary clf        606+606        100        X,Y                         Xtest,Ytest    1
Madelon       Binary clf        2000           500        train                       valid          2
Miniboone     Binary clf        130k           50         4-fold cross-val                           3
Covtype       Multi-class clf   580k           54         first 11,340 + next 3,780   last 565,892   1
Poker         Multi-class clf   25k+1mil       10         X,Y                         Xtest,Ytest    1
Satimage      Multi-class clf   4435+2000      36         X,Y                         Xtest,Ytest    1
Bike rental   Regression        11k+6.5k       10         train                       test+online    2
Online News   Regression        40k            58         4-fold cross-val                           3
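
For the datasets evaluated with 4-fold cross-validation (Miniboone and Online News), the split could look like the following Python sketch. This is a minimal illustration under stated assumptions: the syllabus does not say how the folds are formed, so the shuffling, the fixed seed, and the function name k_fold_indices are all hypothetical choices.

```python
import random


def k_fold_indices(n_obs, k=4, seed=0):
    """Yield (train_idx, test_idx) index lists for k-fold cross-validation."""
    indices = list(range(n_obs))
    random.Random(seed).shuffle(indices)       # assumption: shuffle before splitting
    folds = [indices[i::k] for i in range(k)]  # k folds of nearly equal size
    for i in range(k):
        test_idx = folds[i]
        # Training set is every observation that is not in the held-out fold.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# Each observation is held out exactly once across the four folds.
for train_idx, test_idx in k_fold_indices(10, k=4):
    print(len(train_idx), len(test_idx))
```

The reported result would then typically be the average test error over the four folds.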

The following scheme will be used to convert the percentage points to letter grades: 

Percentage   Letter grade
[93, 100]    A
[90, 93)     A-
[87, 90)     B+
[83, 87)     B
[80, 83)     B-
[77, 80)     C+
[73, 77)     C
[70, 73)     C-
[67, 70)     D+
[63, 67)     D
[60, 63)     D-
[0, 60)      F
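
The half-open intervals in the scale above can be read as a simple lookup, sketched here in Python. The function name letter_grade and the error handling are hypothetical; the cutoffs are copied from the scale, and whether scores are rounded before conversion is not stated in the syllabus, so no rounding is applied here.

```python
# (lower bound, letter) pairs taken from the scale above, highest first.
GRADE_SCALE = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
    (0, "F"),
]


def letter_grade(percent):
    """Map a percentage in [0, 100] to a letter grade using GRADE_SCALE."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    for lower, letter in GRADE_SCALE:
        if percent >= lower:  # first lower bound reached gives the interval
            return letter


print(letter_grade(92.9))  # A-
print(letter_grade(93))    # A
```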




Prerequisites: STA 3032 and knowledge of Matlab, R, Python, C++ or other programming language or consent of instructor.

Course Materials 

• CMU Machine Learning Class: http://www.cs.cmu.edu/~epxing/Class/10701/ 

• Trevor Hastie’s ML books: http://www.stanford.edu/~hastie/pub.htm 

• Tom Mitchell’s ML book website: http://www.cs.cmu.edu/~tom/mlbook.html 

• Nilsson’s ML book: http://ai.stanford.edu/~nilsson/mlbook.html 

• Blackboard class website: go to http://campus.fsu.edu/ and log in using your ACNS username and password. Homework, datasets, grades, course notes and other course material will be posted there. 

Course Policy 

• Classroom policies: The classroom environment is an important factor for effective learning. To avoid distracting other students, please follow these classroom policies. The first of these is university policy. 

- Remember that no food or drinks are allowed in the classroom. 

- Turn off all audible alarms (cell phones, pagers, calculators, watches, etc.). 

- Do not use cell phones in the class. 

- Come to class on time. Opening and closing the classroom door in the middle of a class distracts the students and the teacher. 

- Do not talk to other students without permission while the professor is teaching. More than one conversation creates noise and makes it difficult for the students to pay attention to the lecture. 

• Homework: There will be 12 homework projects, due one to two weeks from the date they are announced. The homework must be neatly written, preferably typed, and must be submitted online. Computer output should be kept to a minimum. You are encouraged to submit the project code by email. The code with the best results for each homework will be posted on Blackboard so that it is available to all students attending the class. Students are allowed to work on the projects in teams of two (for graduate students) or three (for undergraduates) and should submit a single homework for each team. 

• Code: It is acceptable to use code downloaded from the internet for the homework as long as a reference to the code website, package or the appropriate paper is added to the bibliography of the homework. 

• Collecting returned homework: It is the student’s responsibility to check grades on the Blackboard class page. If you notice any mistake in recording grades on the Blackboard page, please inform the instructor about it as soon as possible. 

• Homework re-grade: You have one week to request a re-grade of a homework from the date on which the graded homework is returned to the students of the class. For that, see the instructor along with the relevant homework. 

• Contacting the instructor outside the class: You are strongly encouraged to come to the instructor during his office hours. If your schedule conflicts with the office hours, you can make an appointment. You may ask the instructor brief questions by e-mail, but you may be asked to come to office hours if the instructor thinks that the questions are better answered in person. When you send e-mails remember the following: 

- Always e-mail from your FSU account. E-mails from non-FSU accounts may not reach the instructor due to filters. 

- Always write your full name at the end of each e-mail message you send. 

- Always write the course number at the beginning of the subject line. 

• University Attendance Policy: Excused absences include documented illness, deaths in the family and other documented crises, call to active military duty or jury duty, religious holy days, and official University activities. These absences will be accommodated in a way that does not arbitrarily penalize students who have a valid excuse. Consideration will also be given to students whose dependent children experience serious illness. 

• Academic honor policy: The Florida State University Academic Honor Policy outlines the University’s expectations for the integrity of students’ academic work, the procedures for resolving alleged violations of those expectations, and the rights and responsibilities of students and faculty members throughout the process. Students are responsible for reading the Academic Honor Policy and for living up to their pledge to “. . . be honest and truthful and . . . [to] strive for personal and institutional integrity at Florida State University.” (Florida State University Academic Honor Policy, found at http://dof.fsu.edu/honorpolicy.htm.) 

• Americans with Disabilities Act: 

Students with disabilities needing academic accommodation should: 

1) register with and provide documentation to the Student Disability Resource Center; and 

2) bring a letter to the instructor indicating the need for accommodation and what type. 

This should be done during the first week of class. 

This syllabus and other class materials are available in alternative format upon request. 

For more information about services available to FSU students with disabilities, contact: 

Student Disability Resource Center 

874 Traditions Way 

108 Student Services Building 

Florida State University 

Tallahassee, FL 32306-4167 

(850) 644-9566 (voice) 

(850) 644-8504 (TDD) 

[email protected] 

http://www.disabilitycenter.fsu.edu/ 

• Syllabus Change Policy 

Except for changes that substantially affect implementation of the evaluation (grading) statement, this syllabus is a guide for the course and is subject to change with advance notice.
