Risk-Sensitive Online Learning

Authors: Eyal Even-Dar, Michael Kearns, and Jennifer Wortman

Source: Algorithmic Learning Theory, 17th International Conference, ALT 2006, Barcelona, October 2006, Proceedings (José L. Balcázar, Phil Long, and Frank Stephan, Eds.), Lecture Notes in Artificial Intelligence 4264, pp. 199-213, Springer 2006.

Abstract. We consider the problem of online learning in settings in which we want to compete not simply with the rewards of the best expert or stock, but with the best trade-off between rewards and risk. Motivated by finance applications, we consider two common measures balancing returns and risk: the Sharpe ratio [9] and the mean-variance criterion of Markowitz [8]. We first provide negative results establishing the impossibility of no-regret algorithms under these measures, thus providing a stark contrast with the returns-only setting. We then show that the recent algorithm of Cesa-Bianchi et al. [5] achieves nontrivial performance under a modified bicriteria risk-return measure, and give a modified best expert algorithm that achieves no regret for a “localized” version of the mean-variance criterion. We perform experimental comparisons of traditional online algorithms and the new risk-sensitive algorithms on a recent six-year S&P 500 data set and find that the modified best expert algorithm outperforms the traditional algorithms with respect to Sharpe ratio, MV, and accumulated wealth. To our knowledge this paper initiates the investigation of explicit risk considerations in the standard models of worst-case online learning.
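The two risk-return measures named in the abstract are standard in finance. As a rough illustration only, a minimal sketch of their textbook forms follows; the precise definitions used in the paper (e.g., how returns and variance are measured over the online sequence) are given in the body of the paper and in the cited works of Sharpe [9] and Markowitz [8], and may differ in details such as the risk-free rate convention assumed here.

```python
import statistics

def sharpe_ratio(returns, risk_free_rate=0.0):
    """Textbook Sharpe ratio: mean excess return divided by its standard
    deviation. The risk_free_rate offset is a common convention, assumed
    here for illustration."""
    excess = [r - risk_free_rate for r in returns]
    mean = statistics.fmean(excess)
    std = statistics.pstdev(excess)  # population standard deviation
    return mean / std if std > 0 else float("inf")

def mean_variance(returns):
    """A simple Markowitz-style mean-variance (MV) criterion: mean return
    penalized by the (population) variance of the returns."""
    return statistics.fmean(returns) - statistics.pvariance(returns)
```

For example, a constant return sequence has zero variance (and hence an unbounded Sharpe ratio under this convention), while a volatile sequence with the same mean scores strictly lower under both measures.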

©Copyright 2006, Springer