TCS-TR-A-05-6

Date: Wed Jul 20 15:32:08 2005

Title: A Polynomial Space Polynomial Delay Algorithm for Enumeration of Maximal Motifs in a Sequence

Authors: Hiroki Arimura and Takeaki Uno

Contact:

  • First name: Hiroki
  • Last name: Arimura
  • Address: Division of Computer Science Graduate School of Information Science and Technology Hokkaido University Kita 14-jo, Nishi 9-chome Sapporo 060-0814, JAPAN
  • Email: arim@ist.hokudai.ac.jp

Abstract. In this paper, we consider the problem of finding all maximal motifs in an input string for the class of repeated motifs with wild cards. A maximal motif is such a representative motifs that is not properly contained in larger motifs with the same location lists. The enumeration problem for maximal motifs with wild cards has been introduced in (Parida et al., SODA'00, CPM'01), and has been studied in (Parida et al., CPM'01), (Pisanti et al.,MFCS'03) and (Pelfrene et al., CPM'03). However, its output-polynomial time computability, in particular, with polynomial space and delay is still open. The main result of this paper is a polynomial space polynomial delay algorithm for the maximal motif enumeration problem for the repeated motifs with wild cards. This algorithm enumerates all maximal motifs in an input string of length n with O(n^3) time per motif with O(n^2) space and O(n^3) delay by depth-first search. The key of the algorithm is a tree-shaped search route over all maximal motifs based on a technique called prefix-preserving closure extension, which enable us to enumerate all maximal motifs without storing all discovered motifs. We also show exponential lower bound and succinctness results on the number of maximal motifs, which indicate the limit of a straightforward approach.


©Copyright 2005 Authors