David M. Blei

PROFESSOR OF COMPUTER SCIENCE AND STATISTICS

1255 Amsterdam Avenue
Room 1005 SSW
Mail Code 4690

Tel (212) 854-5450

David Blei's main research interest lies in the fields of machine learning and Bayesian statistics. Prof. Blei and his group develop novel models and methods for exploring, understanding, and making predictions from the massive data sets that pervade many fields.  Their work is widely used in science, scholarship, and industry to solve interdisciplinary, real-world problems. In particular, they focus on a variety of applications, including language, recommendation systems, neuroscience, and the computational social sciences.  Prof. Blei and his group have set new paths in the fields of machine learning and artificial intelligence. 

Research Interests

Topic Models, Probabilistic Modeling, Approximate Bayesian Inference.

By bringing together ideas in computer science, statistics, and optimization, Blei and collaborators developed, more than a decade ago, a method to discover the abstract “topics” that pervade a collection of documents. Today, their algorithm, latent Dirichlet allocation (LDA), is a standard method for topic discovery and is used in many downstream tasks. Since then, Blei and his group have significantly expanded the scope of topic modeling. One recent example is collaborative topic models, which connect textual content to user behavior (such as clicks) and can be used to interpret patterns of readership, recommend documents, characterize readers, and organize collections according to both content and consumption. (This algorithm is used by the New York Times to form recommendations for its readers.) In addition to working on topic models, Blei and his group have created generic algorithms for scaling a wide class of statistical models to massive data sets. Their work on variational inference has changed the scale at which we can apply sophisticated methods for data science and machine learning.

Blei earned his Bachelor's degree in Computer Science and Mathematics from Brown University (1997) and his PhD in Computer Science from the University of California, Berkeley (2004). Before joining Columbia, he was on the Computer Science faculty at Princeton University (2006-2014), first as an Assistant Professor and then as an Associate Professor.

RESEARCH EXPERIENCE

  • Postdoctoral Fellow, Department of Machine Learning, Carnegie Mellon University, 2004–2006  Advisor: John Lafferty
 

PROFESSIONAL EXPERIENCE

  • Professor, Departments of Statistics and Computer Science, Columbia University, 2014
  • Associate Professor, Department of Computer Science, Princeton University, 2011–2014 
  • Assistant Professor, Department of Computer Science, Princeton University, 2006–2011 

PROFESSIONAL AFFILIATIONS

  • Association for Computing Machinery
  • Institute of Mathematical Statistics
  • American Statistical Association 
  • Bernoulli Society

HONORS & AWARDS

  • Guggenheim Fellowship, 2017 
  • Fellow of the Institute of Mathematical Statistics, 2017
  • ICML Test of Time Award (for “Dynamic Topic Models”), 2016
  • Presidential Award for Outstanding Teaching, Honorable Mention, 2016 
  • Fellow of the Association for Computing Machinery, 2015
  • SIGIR Test of Time Award Honorable Mention (for “Modeling Annotated Data”), 2015 
  • ACM Prize in Computing, 2013 
  • Blavatnik Award for Young Scientists: Faculty Winner, 2013
  • Presidential Early Career Award for Scientists and Engineers (PECASE), 2011 
  • Office of Naval Research Young Investigator Award, 2011 
  • Alfred P. Sloan Fellowship, 2010

SELECTED PUBLICATIONS

  • D. Blei, A. Kucukelbir, and J. McAuliffe.   Variational inference: A review for statisticians.  Journal of the American Statistical Association, to appear.   [arXiv]
  • P. Gopalan, W. Hao, D. Blei, and J. Storey.   Scaling probabilistic models of genetic variation to millions of humans.   Nature Genetics, 48:1587-1590.   [nature] [bioRxiv]
  • R. Ranganath, L. Tang, L. Charlin, and D. Blei.   Deep exponential families.   Artificial Intelligence and Statistics, 2015.   [PDF]
  • D. Blei.   Build, compute, critique, repeat: Data analysis with latent variable models.   Annual Review of Statistics and Its Application 1:203-232, 2014.   [PDF]
  • P. Gopalan and D. Blei.   Efficient discovery of overlapping communities in massive networks.   Proceedings of the National Academy of Sciences, 110 (36) 14534-14539, 2013.   [PNAS]
  • D. Blei.   Probabilistic topic models.   Communications of the ACM, 55(4):77–84, 2012.   [PDF]
  • M. Hoffman, D. Blei, J. Paisley, and C. Wang.   Stochastic variational inference.   Journal of Machine Learning Research, 14:1303-1347, 2013.   [PDF]
  • D. Blei, A. Ng, and M. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993–1022, January 2003. [A shorter version appeared in NIPS 2002]. [PDF] [Code]