Time: Mondays 12:00 - 3:00 PM | Place: CBIM 22

Instructor: Tina Eliassi-Rad | Office hours: Mondays 3:00 - 4:00 PM in CBIM 08

Course number: 16:198:598 | Credits: 3

**Prerequisites:** A previous course on machine learning or data mining. A strong knowledge of algorithms and programming (Java, C, and scripting/dynamic languages).

- Scaling up Machine Learning: Parallel and Distributed Approaches. Edited by Ron Bekkerman, Mikhail Bilenko, and John Langford. Cambridge University Press, December 30, 2011

- (textbook) Kevin Murphy, Machine Learning: A Probabilistic Perspective. ISBN 0262018020, MIT Press, 2012.
- (textbook) Christopher Bishop, Pattern Recognition and Machine Learning. ISBN 0387310738, Springer 2006.
- (textbook) Tom Mitchell, Machine Learning. ISBN 0070428077, McGraw-Hill, 1997.
- (textbook, free on-line) Trevor Hastie, Robert Tibshirani and Jerome Friedman, Elements of Statistical Learning. ISBN 0387952845, Springer, 2009 (2nd edition).
- (textbook, free on-line) David MacKay, Information Theory, Inference, and Learning Algorithms. ISBN 0521642981, Cambridge University Press, 2003.
- (textbook, free on-line) Roberto Battiti and Mauro Brunato. The LION Way: Machine Learning plus Intelligent Optimization. Lionsolver, Inc. 2013.
- Probability Review (David Blei, Princeton)
- Probability Theory Review (Arian Maleki and Tom Do, Stanford)
- Linear Algebra Tutorial (C.T. Abdallah, Penn)
- Linear Algebra Review and Reference (Zico Kolter and Chuong Do, Stanford)
- Statistical Data Mining Tutorials (Andrew Moore, Google/CMU)
- Theoretical CS Cheat Sheet (Princeton)

- We will use the class Sakai site for announcements, assignments, and your contributions.
- When emailing me about the course, begin the subject line with [f14 cs598].
- For your Hadoop-based jobs, you can use the DCS Hadoop cluster. For big non-Hadoop jobs, you can use aurora.cs. If you don't have accounts on these machines, let me know.
- Course projects must be done individually.
- Any regrading request must be submitted in writing within one week of the material being returned. The request must state the grading error precisely and concisely.
- Refresh your knowledge of the university's policies on academic integrity and plagiarism. There is zero tolerance for cheating!

- Machine Learning with Large Datasets by William Cohen (taught Spring 2012, Spring 2013 and Spring 2014)
- Big Data: Large Scale Machine Learning by John Langford and Yann LeCun (taught Spring 2013)
- Machine Learning for Big Data / Statistics for Big Data by Carlos Guestrin and Emily Fox (taught Winter 2013)