Author: John Case
Source: SIAM Journal on Computing, Vol. 28, No. 6, 1999, pp. 1941–1969.
Abstract. Some extensions are considered of Gold's influential model of language learning by machine from positive data. Studied are criteria of successful learning featuring convergence in the limit to vacillation between several alternative correct grammars. The main theorem of this paper is that there are classes of languages that can be learned if convergence in the limit to up to (n + 1) exactly correct grammars is allowed but which cannot be learned if convergence in the limit is to no more than n grammars, where the no more than n grammars can each make finitely many mistakes. This contrasts sharply with results of Barzdin and Podnieks and, later, Case and Smith for learnability from both positive and negative data.
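In TxtFex-style notation, used here only as an illustrative assumption (the body of the paper fixes its own symbols), write TxtFex_b^a for the criterion allowing convergence in the limit to at most b grammars, each with at most a mistakes, where * stands for finitely many. The main theorem can then be read as a proper separation: for each n ≥ 1,
\[
  \mathrm{TxtFex}^{\,0}_{\,n+1} \;\setminus\; \mathrm{TxtFex}^{\,*}_{\,n} \;\neq\; \emptyset ,
\]
that is, some class of languages is learnable with final vacillation among at most n + 1 exactly correct grammars, but not with final vacillation among at most n grammars even when each of those n grammars may make finitely many mistakes.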
A subset principle from a 1980 paper of Angluin is extended to the vacillatory and other criteria of this paper. This principle provides a necessary condition for avoiding overgeneralization in learning from positive data. It is applied to prove another theorem to the effect that one can optimally eliminate half of the mistakes from final programs for vacillatory criteria if one is willing to converge in the limit to infinitely many different programs instead.
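For orientation, Angluin's original 1980 condition is sketched below as background (the extension to the vacillatory and other criteria of this paper is more delicate and is not reproduced here): a class of languages is learnable from positive data only if each of its members L has a finite "tell-tale" subset S, in the sense that
\[
  (\forall L \in \mathcal{L})\,(\exists\ \text{finite } S \subseteq L)\,(\forall L' \in \mathcal{L})\,
  \bigl[\, S \subseteq L' \subseteq L \;\Longrightarrow\; L' = L \,\bigr].
\]
Intuitively, once S has appeared in the positive data, no proper sublanguage of L in the class remains consistent with that data, so a conjecture of L no longer risks overgeneralizing past a smaller candidate.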
Child language learning may be sensitive to the order or timing of data presentation. It is shown, though, that for the vacillatory success criteria of this paper, there is no loss of learning power for machines which are insensitive to order in several ways simultaneously. For example, partly set-driven machines attend only to the set and the length of the sequence of positive data,
not to the actual sequence itself. A machine M is weakly n-ary order independent iff, for each language L on which M converges in the limit to a finite set of grammars on some ordering of the positive data for L, there is a finite set D_L of no more than n grammars such that M converges to a subset of D_L on every ordering of the positive data for L.
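Schematically (a sketch of the quantifier structure suggested by the prose above, not the paper's verbatim definition, with "text" abbreviating an ordering of the positive data):
\[
  (\forall L)\Bigl[\,\bigl(\exists\ \text{text } T \text{ for } L\bigr)\bigl[\,M \text{ converges on } T\,\bigr]
  \;\Longrightarrow\;
  \bigl(\exists\, D_L \text{ with } \operatorname{card}(D_L) \le n\bigr)\,
  \bigl(\forall\ \text{texts } T' \text{ for } L\bigr)\bigl[\,M \text{ converges on } T' \text{ to a subset of } D_L\,\bigr]\Bigr].
\]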
©Copyright 1999, Society for Industrial and Applied Mathematics.