Finding Total and Partial Orders from Data for Seriation
(invited lecture for ALT & DS 2008)

Author: Heikki Mannila

Affiliations: HIIT, Helsinki University of Technology and Department of Computer Science, University of Helsinki, Helsinki, Finland

Abstract. Ordering and ranking items of different types (observations, web pages, etc.) are important tasks in various applications, such as query processing and scientific data mining. We consider different problems of inferring total or partial orders from data, with special emphasis on applications arising from paleontology.

Seriation can be viewed as the task of ordering rows of a 0-1 matrix so that certain conditions hold. We review different approaches to this task, including spectral ordering and MCMC techniques.

A total order for the items can sometimes be misleading, since there are groups of items that have practically equal ranks. Bucket orders, i.e., total orders with ties can be used to capture the essential order information without overfitting the data: they form a useful concept class between total orders and arbitrary partial orders. We review different methods for finding partial orders and bucket orders; some of the methods have a provable approximation guarantee, and scale well to large datasets.

Joint work with Antti Ukkonen, Aris Gionis, Mikael Fortelius, and Kai Puolamäki.
©Copyright 2008 Author