Author: Kenji Yamanishi
Email: yamanisi@ccm.cl.nec.co.jp
Source: Information and Computation, Vol. 150, No. 1, 1999, pp. 25-56.
Abstract. This paper addresses the issue of designing an effective distributed learning system in which a number of agent learners estimate, in parallel, the parameter specifying the target probability density, and the population learner (the p-learner, for short) combines their outputs to obtain a significantly better estimate. Such a system is important for speeding up learning. As distributed learning systems we propose two types of distributed cooperative Bayesian learning strategies (DCB), in which each agent learner or the p-learner employs a probabilistic version of the Gibbs algorithm. We analyze DCBs by giving upper bounds on their average logarithmic losses for predicting probabilities of unseen data, as functions of the sample size and the population size. We thereby demonstrate the effectiveness of DCBs by showing that, for some probability models, they work approximately (or sometimes exactly) as well as the nondistributed optimal Bayesian strategy, while achieving a significant speed-up of learning over it. We also consider the case where the hypothesis class of probability densities is hierarchically parameterized and information is fed back from the p-learner to the agent learners. For this case we propose another type of DCB based on the Markov chain Monte Carlo method, which we abbreviate as HDCB, and characterize its average prediction loss in terms of the number of feedback iterations as well as the population size and the sample size. We thereby demonstrate that for the class of hierarchical Gaussian distributions HDCB works approximately as well as the nondistributed optimal Bayesian strategy, again achieving a significant speed-up of learning over it.
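To make the first setting concrete, the sketch below simulates a DCB-style pipeline on a toy one-dimensional Gaussian model: each agent learner draws a single parameter value from its posterior over its local shard of the sample (a probabilistic, Gibbs-style step), and the p-learner combines the agents' outputs into a predictive density whose average logarithmic loss on unseen data is then measured. The Gaussian model, the equal-weight mixture combination rule, and all names here are illustrative assumptions, not the paper's exact DCB construction or its loss bounds.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy model (our assumption, not the paper's setting):
# data are N(mu, 1) with unknown mean mu and a N(0, 1) prior on mu.

def agent_learner(shard, prior_var=1.0, noise_var=1.0):
    """Gibbs-style agent learner: rather than reporting its full
    posterior, the agent draws one parameter value at random from
    its posterior over mu given only its local shard."""
    n = len(shard)
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mean = post_var * shard.sum() / noise_var
    return rng.normal(post_mean, np.sqrt(post_var))

def p_learner_predictive(thetas, x, noise_var=1.0):
    """Population learner: combine the agents' sampled parameters by
    mixing their Gaussian predictive densities with equal weights
    (one simple combination rule, chosen here for illustration)."""
    dens = np.exp(-(x - thetas[:, None]) ** 2 / (2 * noise_var))
    dens /= np.sqrt(2 * np.pi * noise_var)
    return dens.mean(axis=0)

# Simulate m agent learners, each holding a shard of n_local points;
# the shards are processed independently, hence in parallel in spirit.
mu_true, m, n_local = 1.5, 8, 25
shards = [rng.normal(mu_true, 1.0, n_local) for _ in range(m)]
thetas = np.array([agent_learner(s) for s in shards])

# Average logarithmic loss of the combined predictor on unseen data.
x_test = rng.normal(mu_true, 1.0, 1000)
log_loss = -np.log(p_learner_predictive(thetas, x_test)).mean()
print(f"combined estimate: {thetas.mean():.3f}, avg log-loss: {log_loss:.3f}")
```

As the population size m grows, the mixture concentrates around the true parameter even though each agent communicates only a single sampled value, which is the kind of trade-off between communication, population size, and prediction loss that the paper's bounds quantify.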
Copyright © 1999 Academic Press.
*An extended abstract of this paper appeared in "Proceedings of the 10th Annual Conference on Computational Learning Theory."