TCS-TR-A-10-46

Date: Sun Aug 8 19:17:26 2010

Title: Training Parse Trees for Efficient VF Coding

Authors: Takashi Uemura, Satoshi Yoshida, Takuya Kida, Tatsuya Asai, and Seishi Okamoto

Contact:

  • First name: Takuya
  • Last name: Kida
  • Address: Hokkaido University, Kita 14, Nishi 9, Kita-ku, Sapporo 060-0814, Japan
  • Email: kida@ist.hokudai.ac.jp

Abstract. We address the problem of improving variable-length-to-fixed-length codes (VF codes), which have favourable properties for fast decoding and compressed pattern matching but achieve only moderate compression ratios. The compression ratio of a VF code depends on the parse tree that it uses as a dictionary. However, constructing the optimal parse tree is intractable, so heuristic approaches are needed in practice. We propose a method that trains a parse tree by scanning an input text repeatedly, and we show experimentally that it rapidly improves the compression ratio of VF codes to the level of state-of-the-art compression methods.
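
As an illustration of what a VF code does, the following minimal Python sketch parses a text greedily against a small parse tree and emits one fixed-length codeword per phrase. It is not the training method proposed in this report: the dictionary PHRASES, the function names and the greedy longest-match rule are assumptions made for this example only.

```python
# Hypothetical toy dictionary over the alphabet {a, b}; four phrases,
# so every codeword fits in 2 bits. Not taken from the report.
PHRASES = ["a", "b", "ab", "abb"]


def build_parse_tree(phrases):
    """Build a trie in which the node ending phrase k stores codeword k."""
    root = {"children": {}, "code": None}
    for code, phrase in enumerate(phrases):
        node = root
        for ch in phrase:
            node = node["children"].setdefault(
                ch, {"children": {}, "code": None}
            )
        node["code"] = code
    return root


def vf_encode(text, root):
    """Parse text by greedy longest match; return a list of codewords."""
    codes = []
    i = 0
    while i < len(text):
        node = root
        best_code, best_len = None, 0
        j = i
        # Walk down the tree as far as the text allows, remembering
        # the deepest node that carries a codeword.
        while j < len(text) and text[j] in node["children"]:
            node = node["children"][text[j]]
            j += 1
            if node["code"] is not None:
                best_code, best_len = node["code"], j - i
        if best_code is None:
            raise ValueError(f"no phrase matches at position {i}")
        codes.append(best_code)
        i += best_len
    return codes


def vf_decode(codes, phrases):
    """Decoding is a table lookup, since every codeword has the same length."""
    return "".join(phrases[c] for c in codes)
```

With this toy dictionary, vf_encode("abbaab", build_parse_tree(PHRASES)) yields [3, 0, 2], i.e. the phrases "abb", "a", "ab", and vf_decode recovers the original text. Which phrases the tree contains determines how many codewords the text needs, and that is the quantity a better parse tree reduces.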


©Copyright 2010 Authors