Date: Mon Mar 1 10:53:04 2010
Authors: Atsuyoshi Nakamura, Hisashi Tosaka, Mineichi Kudo
Abstract. We propose a novel frequent approximate pattern mining method suited to estimating occurrence regions. Given a string s, our method enumerates the substrings of s that locally optimally match many substrings of s. We present an algorithm for this problem that generates candidate patterns without duplication using the suffix tree of s. The problem can be extended to enumerating approximate frequent subforests of a given ordered labeled tree T. We applied our method to the task of extracting search result records from a web page returned by a search engine, and it performed well on benchmark data sets.
©Copyright 2010 Authors