Data-Driven Discovery using Probabilistic Hidden Variable Models
(invited lecture for DS 2006)
Author: Padhraic Smyth
Affiliation: Department of Computer Science,
University of California, Irvine, U.S.A.
Generative probabilistic models have proven to be a very
useful framework for machine learning from scientific data.
Key ideas that underlie the generative approach include
(a) representing complex stochastic phenomena using the structured
language of graphical models,
(b) using latent (hidden) variables
to make inferences about unobserved phenomena, and
(c) using Bayesian ideas for learning and prediction.
This talk will begin
with a brief review of learning from data with hidden variables
and then discuss some exciting recent work in this area that has
direct application to a broad range of scientific problems. A number
of different scientific data sets will be used as examples to
illustrate the application of these ideas in probabilistic
learning, such as time-course microarray expression data,
functional magnetic resonance imaging (fMRI) data of the human
brain, text documents from the biomedical literature, and sets of
©Copyright 2006 Author