CSE Seminar: Alexander Gray
- September 30, 2011 2:00 pm - 3:30 pm
- Klaus 2447
By: Alexander Gray, Associate Professor
Computational Science and Engineering, College of Computing, Georgia Tech
Date: Friday, September 23, 2011
Time: 2:00pm - 3:30pm
Location: Klaus 2447
For more information, please contact Dr. Alex Gray at agray [at] cc [dot] gatech [dot] edu
Title: Techniques for Massive-Data Machine Learning
Starting with motivating examples from data analysis problems in astronomy, we'll consider the task of making state-of-the-art machine learning methods scale to massive datasets. The methods include n-point correlation functions, kernel density estimation, minimum spanning trees, bipartite matching, nonparametric Bayes classifiers, support vector machines, Nadaraya-Watson regression, kernel conditional density estimation, Gaussian process regression, nearest-neighbors, principal component analysis, hierarchical clustering, and manifold learning. These methods often scale quadratically or cubically with the number of data points; we'll approach the problem via seven different types of computational techniques: indexing, functional transforms, sampling, problem reductions, locality, parallelism, and active learning.
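As a rough illustration of the first technique named above (indexing), the following sketch contrasts a brute-force nearest-neighbor scan with a lookup in a simple k-d tree. It is not taken from the talk; the function names (`sq_dist`, `brute_force_nn`, `build_kdtree`, `kdtree_nn`) and the choice of a k-d tree are assumptions made purely for illustration, and the code uses only the Python standard library.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_nn(points, query):
    """Linear scan: O(n) per query, so O(n^2) to answer a query for every point."""
    return min(points, key=lambda p: sq_dist(p, query))


def build_kdtree(points, depth=0):
    """Build a k-d tree (illustrative index) by splitting on the median, cycling axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def kdtree_nn(node, query, best=None):
    """Nearest-neighbor search that prunes subtrees that cannot hold a closer point."""
    if node is None:
        return best
    point, axis = node["point"], node["axis"]
    if best is None or sq_dist(point, query) < sq_dist(best, query):
        best = point
    diff = query[axis] - point[axis]
    if diff < 0:
        near, far = node["left"], node["right"]
    else:
        near, far = node["right"], node["left"]
    best = kdtree_nn(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best so far.
    if diff ** 2 < sq_dist(best, query):
        best = kdtree_nn(far, query, best)
    return best


if __name__ == "__main__":
    rng = random.Random(0)
    data = [(rng.random(), rng.random()) for _ in range(1000)]
    tree = build_kdtree(data)
    q = (0.5, 0.5)
    # Both searches should agree on the distance to the nearest neighbor.
    print(sq_dist(brute_force_nn(data, q), q) == sq_dist(kdtree_nn(tree, q), q))
```

On low-dimensional data, the tree search typically inspects far fewer points than the linear scan, which is the general idea behind using an index to avoid quadratic cost; the benefit shrinks as dimensionality grows.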
Alexander Gray received bachelor's degrees in Applied Mathematics and Computer Science from the University of California, Berkeley and a PhD in Computer Science from Carnegie Mellon University, and is currently an Associate Professor in the College of Computing at Georgia Tech. His group of approximately 20 researchers, the FASTlab, aims to comprehensively scale up all of the major practical methods of machine learning to massive datasets, as well as to develop new statistical methodology and theory, guided by challenge problems in cosmology, medicine, and other application areas. He began working with massive scientific datasets in 1993 (long before the current fashionable talk of “big data”) at NASA's Jet Propulsion Laboratory in its Machine Learning Systems Group. High-profile applications of his large-scale ML algorithms have been described in staff-written articles in Science and Nature, including contributions to work selected by Science as the Top Scientific Breakthrough of 2003. He has won or been nominated for a number of best paper awards in statistics and data mining, and received the National Science Foundation CAREER Award in 2009. He gives invited tutorial lectures on massive-scale data analysis at the top data analysis research conferences, government agencies, and corporations, and is a member of the prestigious National Academy of Sciences Committee on the Analysis of Massive Data.