On the Power of Incremental Learning

Authors: Steffen Lange and Gunter Grieser
Email: lange@dfki.de

Source: Theoretical Computer Science, Vol. 288, Issue 2, 17 September 2002, pp. 277-307.

Abstract. This paper provides a systematic study of incremental learning from noise-free and from noisy data. As usual, we distinguish between learning from positive data and learning from positive and negative data, synonymously called learning from text and learning from informant, respectively. Our study relies on the notion of noisy data introduced by Stephan.

The basic scenario, named iterative learning, is as follows. In every learning stage, an algorithmic learner takes as input one element of an information sequence for the target concept together with its previously generated hypothesis, and outputs a new hypothesis. The sequence of hypotheses has to converge to a hypothesis that correctly describes the target concept.
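To make the protocol concrete, the following Python sketch fixes one possible interface for an iterative learner. The class names, the integer-valued hypotheses, and the toy learner (which identifies initial segments of the natural numbers from positive data) are illustrative assumptions, not the paper's formalism.

```python
from typing import Optional

class IterativeLearner:
    """One stage of iterative learning: the learner sees a single data
    element and may consult nothing but its previous hypothesis."""

    def update(self, hypothesis: Optional[int], element: int) -> int:
        raise NotImplementedError

class MaxLearner(IterativeLearner):
    """Toy target class: the initial segments {0, ..., n}. The current
    hypothesis encodes the largest element observed so far."""

    def update(self, hypothesis: Optional[int], element: int) -> int:
        return element if hypothesis is None else max(hypothesis, element)

# Presenting a text (an information sequence of positive examples);
# the hypothesis sequence converges once the maximum has appeared.
learner, h = MaxLearner(), None
for x in [2, 0, 5, 3, 5, 1]:
    h = learner.update(h, x)
print(h)  # 5, a (toy) description of {0, ..., 5}
```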

We study the following refinements of this basic scenario. Bounded example-memory inference generalizes iterative inference by allowing the learner to additionally store an a priori bounded number of carefully chosen data elements, while feedback learning generalizes it by allowing the learner to additionally ask whether or not a particular data element has already appeared in the input data seen so far.
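Both refinements can be sketched in the same style. In the sketch below, the memory-selection rule (keep the k largest elements), the toy hypotheses, and the oracle passed as a callable are all illustrative assumptions; the only features taken from the paper are the bounded store and the one feedback query per stage.

```python
from typing import Callable, Optional

class BoundedMemoryLearner:
    """Sketch of bounded example-memory inference: besides its previous
    hypothesis, the learner may keep at most k data elements across
    stages. Keeping the k largest is an illustrative selection rule."""

    def __init__(self, k: int):
        self.k = k
        self.memory: list[int] = []  # at most k stored data elements

    def update(self, hypothesis: Optional[int], element: int) -> int:
        self.memory = sorted(set(self.memory) | {element})[-self.k:]
        return max(self.memory)  # toy hypothesis: largest stored element

class FeedbackLearner:
    """Sketch of feedback learning: once per stage the learner may ask
    whether a chosen element already occurred in the input. The toy
    hypothesis (number of distinct elements seen) is an assumption."""

    def update(self, hypothesis: Optional[int], element: int,
               appeared_before: Callable[[int], bool]) -> int:
        h = 0 if hypothesis is None else hypothesis
        # the single feedback query of this stage
        return h if appeared_before(element) else h + 1

# The environment drives the protocol and answers feedback queries.
learner, seen, h = FeedbackLearner(), set(), None
for x in [2, 0, 5, 2, 5, 1]:
    h = learner.update(h, x, lambda d: d in seen)
    seen.add(x)
print(h)  # 4 distinct elements presented so far
```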

For the case of learning from noise-free data, we show that, when both positive and negative data are available, restrictions on the accessibility of the input data do not limit the learning capabilities if and only if the relevant iterative learners are allowed to query the history of the learning process or to store at least one carefully selected data element. This insight contrasts nicely with the fact that, in case only positive data are available, restrictions on the accessibility of the input data seriously affect the learning capabilities of all variants of incremental learners.

For the case of learning from noisy data, we present characterizations of all kinds of incremental learning in terms that are independent of learning theory; the relevant conditions are purely structural. Surprisingly, as far as learning from noisy text and noisy informant is concerned, even iterative learners are exactly as powerful as unconstrained learning devices.


© 2002 Elsevier Science B.V.