TCS-TR-A-06-16Date: Sat Jun 17 07:22:56 2006 Title: N-gram Analysis Based on Zero-suppressed BDDs Authors: Ryutaro Kurai, Shin-ichi Minato, and Thomas Zeugmann Contact:
Abstract. In present paper, we propose a new method of n-gram analysis using ZBDDs (Zero-suppressed BDDs). ZBDDs are known as a compact representation of combinatorial item sets. Here, we newly apply the ZBDD-based techniques for efficiently handling sets of sequences. Using the algebraic operations defined over ZBDDs, such as union, intersection, difference, etc., we can execute various processings and/or analyses for large-scale sequence data. We conducted experiments for generating n-gram statistical data for given real document files, and the obtained results show the potentiality of the ZBDD-based method for the sequence database analysis. ©Copyright 2006 Authors |