Ordered term tree languages which are polynomial time inductively inferable from positive dataAuthors: Yusuke Suzuki, Takayoshi Shoudai, Tomoyuki Uchida and Tetsuhiro Miyahara
Source:
Theoretical Computer Science Vol. 350, Issue 1,
January 2006, pp. 6390
Abstract. In the fields of data mining and knowledge discovery, many semistructured data such as HTML/XML files are represented by rooted trees t such that all children of each internal vertex of t are ordered and t has edge labels. In order to represent structural features common to such semistructured data, we propose a linear ordered term tree, which is a rooted tree pattern consisting of ordered tree structures and internal structured variables with distinct variable labels. For a set of edge labels Λ let OTT_{Λ} be the set of all linear ordered term trees. For a linear ordered term tree t in OTT_{Λ}, the term tree language of t, denoted by L_{Λ}(t), is the set of all ordered trees obtained from t by substituting arbitrary ordered trees for all variables in t. Given a set of ordered trees S, the minimal language problem for OTT_{Λ} = {L_{Λ}(t)  t ∈ OTT_{Λ}} is to find a linear ordered term tree t in OTT_{Λ} such that L_{Λ}(t) is minimal among all term tree languages which contain all ordered trees in S. We show that the class OTT_{Λ} is polynomial time inductively inferable from positive data, by giving a polynomial time algorithm for solving the minimal language problem for OTT_{Λ}.

©Copyright 2005 Elsevier Science B.V.