# TCS-TR-B-08-5

Date: Tue Jul 15 20:38:10 2008

Authors: Tomoya Saito

Contact:

• First name: Takuya
• Last name: Kida
• Address: Division of Computer Science, Hokkaido University, North 14, West 9, Sapporo 060-0814, Japan
• Email: kida@ist.hokudai.ac.jp

Abstract. In this paper, we study a complex time-series pattern matching problem over a multi-dimension continuous data stream, that is, multiple streams come into a matching machine in parallel. For each data stream, a pattern is given as a sequence of predicates, which specify the classes of the elements of the stream. The pattern matching problem over such a multi-dimension data stream, is to find all occurrences where all predicates in the patterns are satisfied. In order to do faster pattern matching, we propose a flexible and extensible scheme which combines two techniques, static compilation of geometrical constraint queries and bit-parallel pattern matching that simulates NFAs for the pattern matching efficiently by a few logical bit operations. Our main contributions are following three algorithms: an algorithm that runs in $O(\frac{1}{w}mn)$ time to solve the time-series pattern matching problem for 1-dimension data streams, an algorithm that runs in $O(\frac{1}{w}mn\log m)$ time to solve the problem for $d$-dimension real number data streams without correlations for each other, and an algorithm that runs in $O(\frac{1}{w}mn\log v)$ time to solve the problem for $2$-dimension real number data streams with correlations, where $m,v,n,w$ are the pattern length, the pattern size, the stream length, and the machine register length. We also showed empirically that our algorithms runs $1.5\sim 5$ times faster than the previous methods.