Efficient Learning of Semi-structured Data from Queries

Authors: Hiroki Arimura, Hiroshi Sakamoto and Setsuo Arikawa.

Source: Lecture Notes in Artificial Intelligence Vol. 2225, 2001, 315 - 331.

Abstract. This paper studies the polynomial-time learnability of the classes of ordered gapped tree patterns (OGT) and ordered gapped forests (OGF) under the into-matching semantics in the query learning model of Angluin. The class OGT is a model of semi-structured database query languages, and a generalization of both the class of ordered/unordered tree pattern languages and the class of non-erasing regular pattern languages. First, we present a polynomial time learning algorithm for µ-OGT, the subclass of OGT without repeated tree variables, using equivalence queries and membership queries. By extending this algorithm, we present polynomial time learning algorithms for the classes µ-OGF of forests without repeated variables and $OGT$ of trees with repeated variables using equivalence queries and subset queries. We also give representation-independent hardness results which indicate that both of equivalence and membership queries are necessary to learn µ-OGT.

©Copyright 2001 Springer