The VC-Dimension of Subclasses of Pattern Languages

Authors: Andrew Mitchell, Tobias Scheffer, Arun Sharma, and Frank Stephan.

Source: Lecture Notes in Artificial Intelligence Vol. 1720, 1999, 93 - 105.

Abstract. This paper derives the Vapnik Chervonenkis dimension of several natural subclasses of pattern languages. For classes with unbounded VC-dimension, an attempt is made to quantify the ``rate of growth'' of VC-dimension for these classes. This is achieved by computing, for each n, size of the ``smallest'' witness set of n elements that is shattered by the class. The paper considers both erasing (empty substitutions allowed) and nonerasing (empty substitutions not allowed) pattern languages. For erasing pattern languages, optimal bounds for this size -- within polynomial order -- are derived for the case of 1 variable occurrence and unary alphabet, for the case where the number of variable occurrences is bounded by a constant, and the general case of all pattern languages. The extent to which these results hold for nonerasing pattern languages is also investigated. Some results that shed light on efficient learning of subclasses of pattern languages are also given.

©Copyright 1999 Springer-Verlag