Synthesizing Learners Tolerating Computable Noisy Data

Authors: John Case and Sanjay Jain

Source: Lecture Notes in Artificial Intelligence, Vol. 1501, 1998, pp. 205–219.

Abstract. An index for an r.e. class of languages (by definition) generates a sequence of grammars defining the class. An index for an indexed family of languages (by definition) generates a sequence of decision procedures defining the family.
F. Stephan's model of noisy data is employed, in which, roughly, correct data crops up infinitely often, and incorrect data only finitely often.
In a completely computable universe, all data sequences, even noisy ones, are computable. New to the present paper is the restriction that noisy data sequences be, nonetheless, computable!
Studied, then, is the synthesis, from indices for r.e. classes and for indexed families of languages, of various kinds of noise-tolerant language-learners for the corresponding classes or families indexed, where the noisy input data sequences are restricted to being computable.
Many positive results, as well as some negative results, are presented regarding the existence of such synthesizers.
The main positive result is surprisingly more positive than its analog in the case where the noisy data is not restricted to being computable: grammars for each indexed family can be learned behaviorally correctly from computable, noisy, positive data! The proof of another positive synthesis result yields, as a pleasant corollary, a strict subset-principle or tell-tale style characterization, for the computable noise-tolerant behaviorally correct learnability of grammars from positive and negative data, of the corresponding families indexed.
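As a rough gloss of the positive-data (text) case of Stephan's noise model described above, stated here under assumed notation rather than quoted from the paper: a data sequence T is a noisy text for a language L just in case every element of L appears in T infinitely often, while only finitely many entries of T fall outside L; the restriction new to this paper additionally requires T itself to be a computable sequence.

% Sketch only; T : N -> data items and card denote assumed notation, not the paper's.
% The precise definitions, including the positive-and-negative data (informant) case,
% are given in the paper itself.
\[
  T \text{ is a noisy text for } L
  \;\Longleftrightarrow\;
  (\forall w \in L)\bigl[\operatorname{card}\{n : T(n) = w\} = \infty\bigr]
  \;\wedge\;
  \operatorname{card}\{n : T(n) \notin L\} < \infty ,
\]
with T, viewed as a function from positions to data items, additionally required to be computable in the setting of the present paper.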

©Copyright 1998 Springer