Spring 2019: CS 6220 -- Data Mining Techniques, CRN 33181
cross-listed with
Spring 2019: DS 5230 -- Unsupervised Machine Learning and Data Mining, CRN 34643
Lecture time: Tuesdays & Fridays
|
Place: International Village, Room 019
|
Instructor: Tina Eliassi-Rad
|
Office hours: Tuesdays 3:30 – 5:00 PM in International Village, Room 016
|
TA: Deeksha Doddahonnaiah
|
Office hours: Thursdays 3:30 – 5:00 PM in West Village F, Room 118
Also available by appointment. Email doddahonnaiah.d [at] husky [dot] neu [dot] edu; begin the subject line with [sp19 dm].
|
TA: Hui “Sophie” Wang
|
Office hours: Wednesdays 10:00 – 11:30 AM in Hastings Hall at the YMCA, Room 105
Also available by appointment. Email wang.hui1 [at] husky [dot] neu [dot] edu; begin the subject line with [sp19 dm].
|
This 4-credit graduate-level course covers data mining and unsupervised learning. Its prerequisites are:
This course does not have a designated textbook. The readings are assigned in the syllabus (see below). Here are some textbooks (all optional) related to the course.
Lec # |
Date |
Topic |
Readings &
Notes |
1 |
T 1/8 |
Introduction and Overview |
o Chapter 1 of http://eliassi.org/mmds-book-v2L.pdf
o http://www.cs.cmu.edu/~tom/pubs/MachineLearning.pdf
|
2 |
F 1/11 |
Review of Linear Algebra |
o Linear Algebra Tutorial (C.T. Abdallah, Penn)
o Linear Algebra Review and Reference (Zico Kolter and Chuong Do, Stanford)
o Probability Review (David Blei, Princeton)
o Probability Theory Review (Arian Maleki and Tom Do, Stanford)
|
3 |
T 1/15 |
Density Estimation |
o http://ned.ipac.caltech.edu/level5/March02/Silverman/Silver_contents.html
o http://eliassi.org/Sheather_StatSci_2004.pdf
o Optional: Sections 6.6-6.9 of http://statweb.stanford.edu/~tibs/ElemStatLearn/printings/ESLII_print10.pdf
|
4 |
F 1/18 |
Frequent Itemsets & Association Rules |
o Chapter 6 of http://eliassi.org/mmds-book-v2L.pdf
o Optional: Sections 6.1-6.6 of http://www-users.cs.umn.edu/~kumar/dmbook/ch6.pdf
|
5 |
T 1/22 |
Frequent Itemsets & Association Rules |
o Chapter 6 of http://eliassi.org/mmds-book-v2L.pdf
o Optional: Sections 6.1-6.6 of http://www-users.cs.umn.edu/~kumar/dmbook/ch6.pdf
|
Homework #1
o out on Tuesday January 22
o due on Friday February 1 at 11:59 PM Eastern
o graded by Friday February 15 |
|||
6 |
F 1/25 |
Finding Similar Items |
o Chapter 3 of http://eliassi.org/mmds-book-v2L.pdf |
7 |
T 1/29 |
Finding Similar Items |
o Chapter 3 of http://eliassi.org/mmds-book-v2L.pdf |
8 |
F 2/1 |
Mining Data Streams |
o Chapter 4 of http://eliassi.org/mmds-book-v2L.pdf |
9 |
T 2/5 |
Mining Data Streams |
o Chapter 4 of http://eliassi.org/mmds-book-v2L.pdf |
10 |
F 2/8 |
Mining Data Streams |
o Chapter 4 of http://eliassi.org/mmds-book-v2L.pdf |
Homework #2
o out on Friday February 8
o due on Monday February 18 at 11:59 PM Eastern
o graded by Friday March 1 |
|||
11 12 |
T 2/12 F 2/15 |
Dimensionality Reduction (PCA, SVD, CUR, |
o Chapter 11 of http://eliassi.org/mmds-book-v2L.pdf
o Section 14.5 of http://statweb.stanford.edu/~tibs/ElemStatLearn/printings/ESLII_print10.pdf
|
13 14 |
T 2/19 F 2/22 |
Clustering: |
o Chapter 9 of http://robotics.stanford.edu/~nilsson/MLBOOK.pdf
o Sections 7.1-7.3 of http://eliassi.org/mmds-book-v2L.pdf
o Chapter 8 of https://www.cs.cornell.edu/jeh/book2016June9.pdf
o Section 14.3 of http://statweb.stanford.edu/~tibs/ElemStatLearn
o http://cs229.stanford.edu/notes/cs229-notes7b.pdf
o http://cs229.stanford.edu/notes/cs229-notes8.pdf
o Optional: https://www.cs.rutgers.edu/~mlittman/courses/lightai03/jain99data.pdf
o Optional: http://web.itu.edu.tr/sgunduz/courses/verimaden/paper/validity_survey.pdf
o Optional: http://www.dbs.ifi.lmu.de/Publikationen/Papers/KDD-96.final.frame.pdf
|
Homework #3
o out on Tuesday February 19
o due on Friday March 1 at 11:59 PM Eastern
o graded by Friday March 15 |
|||
15 |
T 2/26 |
EM, K-medoids, Hierarchical Clustering, Evaluation Metrics and Practical Issues |
o Same readings as for lectures 13 and 14 (T 2/19 & F 2/22) |
16 |
F 3/1 |
Midterm Exam |
Graded by Tuesday March 19 |
-- |
T 3/5 F 3/8 |
Spring Break |
|
17 |
T 3/12 |
Project Proposal Pitches (in-class) |
Graded by Tuesday March 19 |
18 19 |
F 3/15 T 3/19 |
Spectral Clustering |
o http://ai.stanford.edu/~ang/papers/nips01-spectral.pdf
o http://www.cs.columbia.edu/~jebara/4772/papers/Luxburg07_tutorial.pdf |
Homework #4
o out on Tuesday March 19
o due on Friday March 29 at 11:59 PM Eastern
o graded by Friday April 12 |
|||
20 |
F 3/22 |
Recommendation Systems |
o Chapter 9 of http://eliassi.org/mmds-book-v2L.pdf |
21 |
T 3/26 |
Recommendation Systems |
o Chapter 9 of http://eliassi.org/mmds-book-v2L.pdf |
22 |
F 3/29 |
Matrix Factorization |
o Section 14.6 of http://statweb.stanford.edu/~tibs/ElemStatLearn/
o http://papers.nips.cc/paper/1861-algorithms-for-non-negative-matrix-factorization.pdf
o Optional: http://www.sandia.gov/~tgkolda/pubs/pubfiles/TensorReview.pdf
|
Homework #5
o out on Friday March 29
o due on Tuesday April 9 at 11:59 PM Eastern
o graded by Tuesday April 16 |
|||
23 |
T 4/2 |
Social Bots (Guest lecturer: |
o https://arxiv.org/abs/1407.5225 |
24 |
F 4/5 |
Link Analysis |
o Chapter 5 of http://eliassi.org/mmds-book-v2L.pdf
o Optional: http://bit.ly/2iYxo82
|
25 |
T 4/9 |
Review for Final |
|
26 |
F 4/12 |
Final Exam |
Graded by Friday April 26 |
27 |
T 4/16 |
Project Presentations: Group 1 |
|
28 |
F 4/19 |
Project Presentations: Group 2 |
Last class day |
Project reports
o due on Tuesday April 23 at 11:59 PM
o graded by Sunday April 28 |
|||
Final grades are due to the Registrar's Office on Monday April 29 at 9:00 AM Eastern. |
A | 93-100 |
A- | 90-92 |
B+ | 87-89 |
B | 83-86 |
B- | 80-82 |
C+ | 77-79 |
C | 73-76 |
C- | 70-72 |
F | < 70 |