Data-Driven Discovery Using Probabilistic Hidden Variable Models

Author: Padhraic Smyth

Source: Algorithmic Learning Theory, 17th International Conference, ALT 2006, Barcelona, October 2006, Proceedings (José L. Balcázar, Phil Long and Frank Stephan, Eds.), Lecture Notes in Artificial Intelligence 4264, p. 28, Springer 2006.

Abstract. Generative probabilistic models have proven to be a very useful framework for machine learning from scientific data. Key ideas that underlie the generative approach include (a) representing complex stochastic phenomena using the structured language of graphical models, (b) using latent (hidden) variables to make inferences about unobserved phenomena, and (c) leveraging Bayesian ideas for learning and prediction. This talk will begin with a brief review of learning from data with hidden variables and then discuss some exciting recent work in this area that has direct application to a broad range of scientific problems. Several scientific data sets will serve as examples of how these ideas apply in probabilistic learning, including time-course microarray expression data, functional magnetic resonance imaging (fMRI) data of the human brain, text documents from the biomedical literature, and sets of cyclone trajectories.

©Copyright 2006, Springer