ALT 2015

Invited Speakers

The invited speakers are Inderjit Dhillon ("Bilinear Prediction using Low Rank Models"), Sham M. Kakade ("Finding Hidden Structure in Data with Tensor Decompositions"), Cynthia Rudin and Kiri Wagstaff.

Inderjit Dhillon: Bilinear Prediction using Low Rank Models.
Abstract: Linear prediction methods, such as linear regression and classification, form the bread-and-butter of modern machine learning. The classical scenario is the presence of data with multiple features and a single target variable. However, there are many recent scenarios in which there are multiple target variables. Examples include predicting bid words for a web page (where each bid word acts as a target variable) and predicting diseases linked to a gene. In many of these scenarios, the target variables might themselves be associated with features. For such settings, we propose the use of bilinear prediction with low-rank models. The low-rank models serve a dual purpose: (i) they enable tractable computation even in the face of millions of data points as well as target variables, and (ii) they exploit correlations among the target variables, even when there are many missing observations. We illustrate our methodology on two modern machine learning problems, multi-label learning and inductive matrix completion, and show results on two applications: predicting Wikipedia labels and predicting gene-disease relationships.
This is joint work with Prateek Jain, Nagarajan Natarajan, Hsiang-Fu Yu and Kai Zhong.
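As a rough illustration of the idea in the abstract (not the speaker's implementation), a bilinear model scores a data point with features x against a target with features y as x^T W y. If W is constrained to the low-rank form W = U V^T, the score can be computed as (U^T x) . (V^T y) without ever forming W. The function names and the toy matrices below are hypothetical and chosen only for this sketch.

```python
# Minimal sketch of bilinear scoring with a low-rank weight matrix.
# All names and values are illustrative assumptions, not from the talk.

def project(M, x):
    """Return M^T x, where M is a list of rows with len(M) == len(x)."""
    return [sum(M[i][r] * x[i] for i in range(len(M))) for r in range(len(M[0]))]

def bilinear_score(x, y, U, V):
    """Compute x^T (U V^T) y via two rank-k projections."""
    px = project(U, x)  # data features mapped to the k-dimensional space
    py = project(V, y)  # target features mapped to the k-dimensional space
    return sum(a * b for a, b in zip(px, py))

def full_score(x, y, U, V):
    """Same score computed the expensive way, by forming W = U V^T."""
    k = len(U[0])
    W = [[sum(U[i][r] * V[j][r] for r in range(k)) for j in range(len(V))]
         for i in range(len(U))]
    return sum(x[i] * W[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))

# Toy example: 3 data features, 2 target features, rank 2.
U = [[1, 0], [0, 1], [1, 1]]
V = [[2, 0], [0, 3]]
x = [1, 2, 3]
y = [1, 1]
score = bilinear_score(x, y, U, V)
```

With d_x data features, d_y target features and rank k, the factored form costs on the order of k(d_x + d_y) operations per score instead of d_x d_y, which is the computational benefit the abstract refers to.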
Biography: Inderjit Dhillon received his PhD from the University of California, Berkeley and works at the Department of Computer Science of the University of Texas at Austin. He is an IEEE Fellow and SIAM Fellow. His main research interests are in machine learning, data mining and bioinformatics. His emphasis is on developing novel algorithms that respect the underlying problem structure and are scalable to massive data sets. Some of his current research topics are high-dimensional data analysis, divide-and-conquer methods for big data analytics, social network analysis and predicting gene-disease associations.

Sham M. Kakade: Finding Hidden Structure in Data with Tensor Decompositions.
Abstract: In many applications, we face the challenge of modeling the interactions between multiple observations. A popular and successful approach in machine learning and AI is to hypothesize the existence of certain latent (or hidden) causes which help to explain the correlations in the observed data. The (unsupervised) learning problem is to accurately estimate a model with only samples of the observed variables. For example, in document modeling, we may wish to characterize the correlational structure of the "bag of words" in documents, or in community detection, we wish to discover the communities of individuals in social networks. Here, a standard model is to posit that documents are about a few topics (the hidden variables) and that each active topic determines the occurrence of words in the document. The learning problem is, using only the observed words in the documents (and not the hidden topics), to estimate the topic probability vectors (i.e., discover the strength by which words tend to appear under different topics). In practice, a broad class of latent variable models is most often fit with either local search heuristics (such as the EM algorithm) or sampling based approaches.
This talk will discuss a general and (computationally and statistically) efficient parameter estimation method for a wide class of latent variable models, including Gaussian mixture models (for clustering), hidden Markov models (for time series), and latent Dirichlet allocation (for topic modeling and community detection). The method exploits a certain tensor structure in the low-order observable moments of these models. Specifically, parameter estimation is reduced to the problem of extracting a certain decomposition of a tensor derived from the (typically second- and third-order) moments; this particular decomposition can be viewed as a natural generalization of the (widely used) principal component analysis method.
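To make the analogy with principal component analysis concrete, the sketch below runs a tensor power iteration on a small symmetric third-order tensor of the form T = sum_i w_i v_i (x) v_i (x) v_i with orthonormal v_i. This is one standard way to extract such a decomposition; it is an assumption that it matches the method presented in the talk, and the toy weights, vectors and function names are hypothetical.

```python
import math

def build_tensor(weights, vectors):
    """Form T[i][j][k] = sum_m w_m * v_m[i] * v_m[j] * v_m[k]."""
    n = len(vectors[0])
    T = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for w, v in zip(weights, vectors):
        for i in range(n):
            for j in range(n):
                for k in range(n):
                    T[i][j][k] += w * v[i] * v[j] * v[k]
    return T

def contract(T, u):
    """Return the vector T(I, u, u), the tensor analogue of a matrix-vector product."""
    n = len(u)
    return [sum(T[i][j][k] * u[j] * u[k] for j in range(n) for k in range(n))
            for i in range(n)]

def tensor_power_iteration(T, u0, iters=50):
    """Repeat u <- T(I, u, u) / ||T(I, u, u)||; return (eigenvalue, eigenvector)."""
    u = list(u0)
    for _ in range(iters):
        w = contract(T, u)
        norm = math.sqrt(sum(c * c for c in w))
        u = [c / norm for c in w]
    lam = sum(a * b for a, b in zip(contract(T, u), u))  # T(u, u, u)
    return lam, u

# Toy tensor with two orthonormal components (illustrative values only).
components = [[0.6, 0.8], [-0.8, 0.6]]
T = build_tensor([3.0, 1.0], components)
lam, u = tensor_power_iteration(T, [1.0, 0.0])
```

In this toy case the iteration started at (1, 0) converges to the component with weight 3. In a latent variable model the tensor would instead be estimated from observed moments, and the recovered pairs (w_i, v_i) would be mapped back to model parameters.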
Biography: Sham Kakade is currently a principal research scientist at Microsoft Research New England. This fall, Sham Kakade will arrive at the University of Washington to take up a joint appointment with Computer Science & Engineering and Statistics as the Washington Research Foundation Data Science Chair. He completed his Ph.D. at the Gatsby Computational Neuroscience Unit at University College London, advised by Peter Dayan, and earned his B.S. in physics at Caltech. Before joining Microsoft Research, Sham Kakade was an associate professor at the Department of Statistics, Wharton, University of Pennsylvania (2010-2012) and an assistant professor at the Toyota Technological Institute at Chicago (2005-2009). Before this, he completed a postdoc in the Computer and Information Science Department at the University of Pennsylvania under the supervision of Michael Kearns.
He works in the area broadly construed as data science, focusing on designing (and implementing) both statistically and computationally efficient algorithms for machine learning, statistics, and artificial intelligence. His intent is to see these tools advance the state of the art on core scientific and technological problems. Sham has made contributions in various areas including statistics, optimization, probability theory, machine learning, algorithmic game theory and economics, and computational neuroscience. Notably, along with various collaborators, a line of his work has focused on developing computationally efficient estimation methods (based on spectral methods) for settings with hidden (or latent) structure; such problems involve estimating topics in documents, clusters of points, or communities in social networks. Sham Kakade is also actively working on applied problems in both computer vision and natural language processing. Part of these latter efforts has involved empirical studies of deep learning methods. Recently, Sham received the 2014 INFORMS Revenue Management and Pricing Best Paper Award, which is given for the best contribution to the science of pricing and revenue management, for a paper published in English in the last five years.

Cynthia Rudin is an associate professor of statistics at the Massachusetts Institute of Technology associated with the Computer Science and Artificial Intelligence Laboratory and the Sloan School of Management, and directs the Prediction Analysis Lab. Her interests are in machine learning, data mining, applied statistics, and knowledge discovery (Big Data). Her application areas are in energy grid reliability, healthcare, and computational criminology. Previously, Prof. Rudin was an associate research scientist at the Center for Computational Learning Systems at Columbia University, and prior to that, an NSF postdoctoral research fellow at NYU. She holds an undergraduate degree from the University at Buffalo where she received the College of Arts and Sciences Outstanding Senior Award in Sciences and Mathematics, and three separate outstanding senior awards from the departments of physics, music, and mathematics. She received a PhD in applied and computational mathematics from Princeton University. She is the recipient of the 2013 INFORMS Innovative Applications in Analytics Award, an NSF CAREER award, and was named as one of the "Top 40 Under 40" by Poets and Quants in 2015. Her work has been featured in Businessweek, The Wall Street Journal, the New York Times, the Boston Globe, the Times of London, Fox News (Fox & Friends), the Toronto Star, WIRED Science, U.S. News and World Report, Slashdot, CIO magazine, Boston Public Radio, and on the cover of IEEE Computer. She is presently the chair-elect for the INFORMS Data Mining Section, and currently serves on committees for DARPA, the National Academy of Sciences, the US Department of Justice, and the American Statistical Association.

Kiri Wagstaff is a researcher in the Machine Learning and Instrument Autonomy Group, investigating ways that machine learning can be used to increase the autonomy of space missions, and a tactical planner and uplink lead for the Mars Exploration Rover Opportunity, both at the Jet Propulsion Laboratory in Pasadena, California. She received her Ph.D. in Computer Science from Cornell University in 2002 and her M.S. in Geological Sciences from the University of Southern California in 2008. Her background is in Computer Science, Planetary Science and Geology. She is most interested in problems that lie at the interfaces between these fields, such as automated methods (artificial intelligence, machine learning) to investigate science questions using planetary data (orbital and in situ).

Algorithmic Learning Theory (ALT) 2015

Banff, Canada, 4–6 October 2015