What can 20,000 models teach us? Lessons learned from a large-scale empirical comparison of learning algorithms.

Dr. Alexandru Niculescu-Mizil, NEC Labs America

Wednesday, April 11, 2012 at 1:30 PM in CBIM 22 (Multipurpose Room)

Faculty Host: Tina Eliassi-Rad

Abstract: In this talk, I will share some of the insights we gained from a large-scale comparison of binary classification algorithms. In this study, we evaluated some of the most popular learning algorithms using a variety of metrics that emphasize different performance aspects: accuracy at a set threshold, ability to rank positive cases higher than negative ones, or ability to predict well-calibrated probabilities. Besides addressing the obvious questions that arise with such a comparison (is there a "best" learning algorithm? are SVMs better than neural networks? do newly developed algorithms like boosting and SVMs really provide an improvement in practice?), I will discuss a few unexpected and, I would say, more exciting findings from the study. I will look in depth at the ability of learning algorithms to produce well-calibrated probabilistic predictions. An analysis of the predictions of the various models shows that some learning algorithms introduce a surprisingly consistent distortion that makes their predictions poorly calibrated. I will show how this distortion can be repaired by re-calibrating the models after training, and compare two re-calibration techniques. I will also address the question of what performance metric should be used for model selection. Conventional wisdom says that one should use the metric one is interested in optimizing, but the results indicate that this is not true for small sample sizes. In fact, for small sample sizes, there seems to be a single "uber-metric" that one should use in model selection regardless of what the "correct" metric is. Finally, time permitting, I will show how performance can be improved by using Ensemble Selection to combine predictions from models trained by the different learning algorithms.

Joint work with Rich Caruana, Art Munson, David Skalak, Tom Fawcett, Geoff Crew, Alex Ksikes, and Cristi Bucila.

Bio: Alexandru Niculescu-Mizil is a researcher at NEC Laboratories America. Before joining NEC, he was a Herman Goldstine Postdoctoral Fellow at the IBM T.J. Watson Research Center. He received his Ph.D. from Cornell University in 2008 under the supervision of Rich Caruana, a Master of Science degree in Computer Science from Cornell University, and a Magna Cum Laude Bachelor's degree in Mathematics and Computer Science from the University of Bucharest. His research interests are in machine learning and data mining, particularly in inductive transfer, graphical model structure learning, probability estimation, empirical evaluations, ensemble methods, and on-line learning. He received an ICML Distinguished Student Paper Award in 2005 for his work on probability estimation, and a COLT Best Student Paper Award in 2008 for his work on on-line learning. In 2009, he led the IBM Research team that won the KDD Cup competition.

Suggested readings:

  • Rich Caruana, Art Munson, and Alexandru Niculescu-Mizil. Getting the Most Out of Ensemble Selection. IEEE International Conference on Data Mining (ICDM), 2006.
  • Rich Caruana and Alexandru Niculescu-Mizil. An Empirical Comparison of Supervised Learning Algorithms. International Conference on Machine Learning (ICML), 2006.