Date: Thu Feb 17 18:41:45 2005
Authors: H. Hasegawa, M. Kudo and A. Nakamura
Abstract. We propose a new method for extracting text related to a given keyword from Web pages collected by a search engine. By combining structural pattern matching with text classification, the method can automatically extract keyword-related text, such as descriptions of a given restaurant's reputation, from Web pages on sites that are not fixed in advance, which conventional wrappers cannot do. In cross-validation experiments on extracting the reputation of a given Ramen shop from Web pages collected by a search engine, our method achieved 79.3% precision and 56.6% recall when acceptable errors were allowed.
©Copyright 2005 Authors