Incremental Concept Learning

Incremental Concept Learning for Bounded Data Mining

Authors: John Case, Sanjay Jain, Steffen Lange and Thomas Zeugmann

Source: Information & Computation Vol. 152, No. 1, 1999, 74 - 110.

Abstract. Important refinements of concept learning in the limit from positive data considerably restricting the accessibility of input data are studied. Let c be any concept; every infinite sequence of elements exhausting c is called positive presentation of c. In all learning models considered the learning machine computes a sequence of hypotheses about the target concept from a positive presentation of it. With iterative learning, the learning machine, in making a conjecture, has access to its previous conjecture and the latest data item coming in. In k-bounded example-memory inference (k is a priori fixed) the learner is allowed to access, in making a conjecture, its previous hypothesis, its memory of up to k data items its has already seen, and the next element coming in. In the case of k-feedback identification, the learning machine, in making a conjecture, has access to its previous conjecture, the latest data item coming in, and, on the basis of this information, it can compute k items and query the database of previous data to find out, for each of the k items, whether or not it is in the database (k is again a priori fixed). In all cases, the sequence of conjectures has to converge to a hypothesis correctly describing the target concept.

Our results are manyfold. An infinite hierarchy of more and more powerful feedback learners in dependence on the number k of queries allowed to be asked is established. However, the hierarchy collapses to 1-feedback inference if only indexed families of infinite concepts are considered, and moreover, its learning power is then equal to learning in the limit. But it remains infinite for concept classes of only infinite r.e. concepts. Both k-feedback inference and k-bounded example-memory identification are more powerful than iterative learning but incomparable to one another. Furthermore, there are cases where redundancy in the hypothesis space is shown to be a resource increasing the learning power of iterative learners. Finally, the union of at most k pattern languages is shown to be iteratively inferable.