Real-Valued Multiple-Instance Learning with Queries

Authors: Daniel R. Dooly, Sally A. Goldman and Stephen S. Kwek.

Source: Lecture Notes in Artificial Intelligence Vol. 2225, 2001, 167-180.

Abstract. The multiple-instance model was motivated by the drug activity prediction problem, in which each example is a possible configuration for a molecule and each bag contains all likely configurations for the molecule. While there has been a significant amount of theoretical and empirical research directed towards this problem, most research performed under the multiple-instance model addresses concept learning. However, binding affinity between molecules and receptors is quantitative, and hence a real-valued classification is preferable.

In this paper we initiate a theoretical study of real-valued multiple-instance learning. We prove that the problem of finding a target point consistent with a set of labeled multiple-instance examples (or bags) is NP-complete. We also prove that the problem of learning from real-valued multiple-instance examples is as hard as learning DNF. Another contribution of our work is in defining and studying a multiple-instance membership query (MI-MQ). We give a positive result on exactly learning the target point for a multiple-instance problem in which the learner is provided with an MI-MQ oracle and a single adversarially selected bag.

©Copyright 2001 Springer