Time: Mondays & Thursdays 12:00-1:20 PM | Place: LIV RC-3

Instructor: Tina Eliassi-Rad | Office hours: Mondays 2:00-3:00 PM in CBIM 08

TA: Chetan Tonde | TA office hours: Mondays 3:00-5:00 PM in CBIM 22

The TA is also available by appointment. Email cjtonde [at] cs [dot] rutgers [dot] edu to set up an appointment.

Course number: 16:198:535 | Credits: 3

**Prerequisites:** Calculus and linear algebra. An introductory course on statistics and probability. Algorithms and programming (MATLAB).

- Trevor Hastie, Robert Tibshirani, Jerome Friedman. Elements of Statistical Learning. ISBN 0387952845. (free online)
- Anand Rajaraman, Jurij Leskovec, and Jeffrey Ullman. Mining of Massive Datasets. v2.1, Cambridge University Press. 2014. (free online)
- Christopher Bishop. Pattern Recognition and Machine Learning. ISBN 0387310738.
- Tom Mitchell. Machine Learning. ISBN 0070428077.
- Kevin P. Murphy. Machine Learning: A Probabilistic Perspective. ISBN 0262018020.
- Peter Flach. Machine Learning: The Art and Science of Algorithms that Make Sense of Data. ISBN 1107422221.
- David J. Hand, Heikki Mannila, Padhraic Smyth. Principles of Data Mining. ISBN 026208290X.
- Jiawei Han, Micheline Kamber, Jian Pei. Data Mining: Concepts and Techniques, Third Edition. ISBN 0123814790.
- Pang-Ning Tan, Michael Steinbach, Vipin Kumar. Introduction to Data Mining. ISBN 0321321367.

- MathWorks MATLAB Tutorials
- Ben Taskar's MATLAB Tutorial
- Probability Review (David Blei, Princeton)
- Probability Theory Review (Arian Maleki and Tom Do, Stanford)
- Linear Algebra Tutorial (C.T. Abdallah, Penn)
- Linear Algebra Review and Reference (Zico Kolter and Chuong Do, Stanford)
- Statistical Data Mining Tutorials (Andrew Moore, Google/CMU)
- Theoretical CS Cheat Sheet (Princeton)

- Homework assignments (2×15%)
- Midterm exam (30%)
- Class project (40%)
  - Proposal report (10%): 2 pages plus a 5-minute pitch. It should answer the following questions:
    - What is the problem?
    - Why is it interesting and important?
    - Why is it hard? Why have previous approaches failed?
    - What are the key components of your approach?
    - What data sets and metrics will be used to validate the approach?
  - In-class presentation (12%)
  - Final report (18%): 6 pages max
    - For guidance on writing the final report, see slide 70 of Eamonn Keogh's KDD'09 Tutorial on How to do good research, get it published in SIGKDD and get it cited!
    - Follow ACM formatting guidelines.

- We will use the class Sakai site for announcements, assignments, and your contributions.
- When emailing me or the TA about the course, begin the subject line with [f15 cs535].
- Programming exercises will be in MATLAB. Rutgers holds a site license for MATLAB. You can download MATLAB to your computer from the university's software portal. MATLAB is also installed on the CS machines. Just type "matlab" at the prompt.
- Homeworks must be done individually. Late homeworks are accepted up to **2 days** after the deadline. A penalty of 20% will be charged for each late day.
- The class project can be done either individually or in groups of two.
- For your class project, you can use any programming language you like.
- Any regrading request must be submitted in writing within one week of the material being returned. The request must state the grading error precisely and concisely.
- Refresh your knowledge of the university's policy on academic integrity and plagiarism. There is zero tolerance for cheating!
- Letter grades will be assigned based on the Rutgers Graduate Grade Scale, which is as follows:

A in [90, 100] | B+ in [85, 89.99] | B in [80, 84.99] | C+ in [75, 79.99] | C in [70, 74.99] | F in [0, 69.99]
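To see how the weights, the late policy, and the grade scale combine, here is a rough sketch in Python. It is only an illustration, not the official grade calculation. It assumes every component is scored on a 0-100 scale, that the 20% late penalty is taken off the earned homework score for each late day, and that a total falling between two listed ranges (for example 89.995) belongs to the lower grade. The function and variable names are made up for this example.

```python
# Weights in percent, taken from the grading breakdown above (they sum to 100).
WEIGHTS = {
    "homework_1": 15,
    "homework_2": 15,
    "midterm": 30,
    "proposal": 10,
    "presentation": 12,
    "final_report": 18,
}

# Lower bound of each letter grade on the scale above, highest first.
GRADE_FLOORS = [(90, "A"), (85, "B+"), (80, "B"), (75, "C+"), (70, "C"), (0, "F")]


def late_homework_score(raw_score, days_late):
    """Apply the late policy to one homework score.

    ASSUMPTION: the 20% per-day penalty is deducted from the earned score;
    the syllabus does not say whether it applies to earned or maximum points.
    """
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    if days_late > 2:
        # Homework is not accepted more than 2 days after the deadline.
        return 0.0
    return raw_score * (100 - 20 * days_late) / 100


def course_score(scores):
    """Weighted total on a 0-100 scale; `scores` maps component name to score."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(total):
    """Map a 0-100 total to a letter using the lower bounds of each range."""
    for floor, letter in GRADE_FLOORS:
        if total >= floor:
            return letter
    return "F"


if __name__ == "__main__":
    scores = {
        "homework_1": 90,
        "homework_2": late_homework_score(90, days_late=1),  # one day late
        "midterm": 80,
        "proposal": 85,
        "presentation": 90,
        "final_report": 88,
    }
    total = course_score(scores)
    print(f"total = {total:.2f}, grade = {letter_grade(total)}")
```

Storing the weights as whole percentages keeps the arithmetic exact for integer scores. If the penalty is instead meant to come off the maximum points, only `late_homework_score` needs to change.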