A Probabilistic Identification Result

Author: Eric McCreath.

Source: Lecture Notes in Artificial Intelligence, Vol. 1968, 2000, pp. 131-140.

Abstract. The approach used to assess a learning algorithm should reflect the type of environment in which the algorithm is placed. Often learners are given examples that both contain noise and are governed by a particular distribution. Hence, probabilistic identification in the limit is an appropriate tool for assessing such learners. In this paper we introduce an exact notion of probabilistic identification in the limit based on Laird's thesis. The strategy presented incorporates a variety of learning situations, including noise-free positive examples, noisy independently generated examples, and noise-free positive and negative examples. This yields a useful technique for assessing the effectiveness of a learner when training data is governed by a distribution and is possibly noisy. We also give a preliminary theoretical evaluation of the Q-heuristic. To this end, we show that a learner using the Q-heuristic stochastically learns in the limit any finite class of concepts, even when noise is present in the training examples. This result is encouraging because, with enough data, the learner can be expected to induce a correct hypothesis. The proof of this result is extended to show that a restricted infinite class of concepts can also be stochastically learnt in the limit. The restriction requires the hypothesis space to be g-sparse.

© Copyright 2000 Springer