Convex risk minimization and conditional probability estimation

Matus Telgarsky, Miroslav Dudík, Robert Schapire

Research output: Contribution to journalConference articlepeer-review

Abstract

This paper proves, in very general settings, that convex risk minimization is a procedure to select a unique conditional probability model determined by the classification problem. Unlike most previous work, we give results that are general enough to include cases in which no minimum exists, as occurs typically, for instance, with standard boosting algorithms. Concretely, we first show that any sequence of predictors minimizing convex risk over the source distribution will converge to this unique model when the class of predictors is linear (but potentially of infinite dimension). Secondly, we show the same result holds for empirical risk minimization whenever this class of predictors is finite dimensional, where the essential technical contribution is a norm-free generalization bound.

Original languageAmerican English
JournalJournal of Machine Learning Research
Volume40
Issue number2015
StatePublished - 2015
Externally publishedYes
Event28th Conference on Learning Theory, COLT 2015 - Paris, France
Duration: Jul 2 2015Jul 6 2015

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence
  • Control and Systems Engineering
  • Statistics and Probability

Keywords

  • Classification
  • Conditional Probability Estimation
  • Consistency
  • Convex Duality
  • Maximum Entropy
  • Orlicz Spaces

Cite this