Stochastic Finite Learning of the Pattern Languages

Authors: Peter Rossmanith and Thomas Zeugmann

Source: Machine Learning Vol. 44, No. 1/2, 2001, 67-91. (Special Issue on Automata Induction, Grammar Inference, and Language Acquisition, edited by Vasant Honavar and Colin de la Higuera)

Abstract. The present paper proposes a new learning model - called stochastic finite learning - and shows the whole class of pattern languages to be learnable within this model.

This main result is achieved by providing a new and improved average-case analysis of the Lange-Wiehagen (1991) algorithm learning the class of all pattern languages in the limit from positive data. The complexity measure chosen is the total learning time, i.e., the overall time taken by the algorithm until convergence. The expectation of the total learning time is carefully analyzed and exponentially shrinking tail bounds for it are established for a large class of probability distributions. For every pattern p containing k different variables it is shown that Lange and Wiehagen's algorithm possesses an expected total learning time of O(akE[L]log1/b(k)), where a and b are two easily computable parameters arising naturally from the underlying probability distributions, and E[L] is the expected example string length.

Finally, assuming a bit of domain knowledge concerning the underlying class of probability distributions, it is shown how to convert learning in the limit into stochastic finite learning.

©Copyright 2001 Kluwer Academic Publishers.