Inductive inference of unbounded unions of pattern languages from positive data

Authors: Takeshi Shinohara and Hiroki Arimura.
Email: shino@ai.kyutech.a.jp

Source: Theoretical Computer Science Vol. 241, No. 1-2, 2000, 191-209.

Abstract. A pattern is a string consisting of constant symbols and variables. The language of a pattern is the set of constant strings obtained by substituting nonempty constant strings for variables in the pattern. In this paper we consider the inductive inference of unbounded unions of pattern languages from positive data. For any fixed k, the class of unions of at most k pattern languages is already shown to be inferable from positive data. The class of unbounded unions is not inferable, because any pattern with at least one variable defines an infinite language and any constant string defines a singleton set consisting of itself, and therefore, the class of unions becomes super-finite, that is, it contains all the finite languages and at least one infinite language. In this paper, we consider several restrictions on patterns to investigate the inferability of unbounded unions of their languages from positive data. A proper pattern contains at least one variable. A regular pattern contains at most one occurrence of every variable. Although the class of unbounded unions of proper pattern languages is not super-finite, it is shown not to be inferable from positive data, even if patterns are restricted to be regular or one-variable. When regular patterns are restricted to be of the form ``xwy'', where x and y are variables and w is a constant string, the class of unbounded unions is shown to be inferable from positive data. When regular patterns do not contain more than l consecutive occurrences of constant symbols, for some fixed l, the class of unbounded unions is shown to be inferable from positive data. Extended pattern languages, where substitutions of the empty string for some variables are allowed, are also considered from the viewpoint of relationship to the inferability of unbounded unions of non-extended pattern languages.

©Copyright 2000 Elsevier Science