## Learning Minimum Volume Sets

** Clayton D. Scott, Robert D. Nowak**; 7(24):665−704, 2006.

### Abstract

Given a probability measure *P* and a reference measure
*μ*, one is often interested in the minimum *μ*-measure set
with *P*-measure at least *α*. Minimum volume sets of this
type summarize the regions of greatest probability mass of *P*,
and are useful for detecting anomalies and constructing confidence
regions. This paper addresses the problem of estimating minimum
volume sets based on independent samples distributed according to
*P*. Other than these samples, no other information is available
regarding *P*, but the reference measure *μ* is assumed to be
known. We introduce rules for estimating minimum volume sets that
parallel the empirical risk minimization and structural risk
minimization principles in classification. As in classification, we
show that the performances of our estimators are controlled by the
rate of uniform convergence of empirical to true probabilities over
the class from which the estimator is drawn. Thus we obtain finite
sample size performance bounds in terms of VC dimension and related
quantities. We also demonstrate strong universal consistency, an
oracle inequality, and rates of convergence. The proposed estimators
are illustrated with histogram and decision tree set estimation
rules.

© JMLR 2006. (edit, beta) |