The Statistical Performance of Collaborative Inference
Gérard Biau, Kevin Bleakley, Benoît Cadre; 17(62):1−29, 2016.
Abstract
The statistical analysis of massive and complex data sets will require the development of algorithms that depend on distributed computing and collaborative inference. Inspired by this, we propose a collaborative framework that aims to estimate the unknown mean θ of a random variable X. In the model we present, a certain number of calculation units, distributed across a communication network represented by a graph, participate in the estimation of θ by sequentially receiving independent data from X while exchanging messages via a stochastic matrix A defined over the graph. We give precise conditions on the matrix A under which the statistical precision of the individual units is comparable to that of a (gold standard) virtual centralized estimate, even though each unit does not have access to all of the data. We show in particular the fundamental role played by both the non-trivial eigenvalues of A and the Ramanujan class of expander graphs, which provide remarkable performance for moderate algorithmic cost.
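As a rough illustration of the setting described above, the following sketch simulates several units that each receive one new observation per round and mix their current estimates through a stochastic matrix A. The abstract does not state the update rule, so the recursion used here (mix the previous estimates with A, then fold in the new observation with weight 1/(t+1)) is an assumption made for illustration. The ring-shaped network, the Gaussian data, and the function names `ring_matrix` and `simulate` are likewise hypothetical choices; a ring is not a Ramanujan graph.

```python
import random


def ring_matrix(n):
    """Doubly stochastic matrix for a ring of n units (hypothetical choice):
    each unit keeps 1/2 of its own value and takes 1/4 from each neighbour."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] += 0.5
        A[i][(i - 1) % n] += 0.25
        A[i][(i + 1) % n] += 0.25
    return A


def simulate(A, theta, steps, seed=0):
    """Run the assumed recursion est <- (t * A @ est + x) / (t + 1).

    Returns the per-unit estimates and the centralized mean of all
    observations seen by the whole network (the "gold standard").
    """
    rng = random.Random(seed)
    n = len(A)
    est = [0.0] * n
    total, count = 0.0, 0
    for t in range(steps):
        # Each unit receives one fresh observation with mean theta
        # (unit-variance Gaussian noise is an assumption).
        x = [rng.gauss(theta, 1.0) for _ in range(n)]
        # Message exchange: each unit averages estimates using its row of A.
        mixed = [sum(A[i][j] * est[j] for j in range(n)) for i in range(n)]
        # Fold the new observation into the running estimate.
        est = [(t * mixed[i] + x[i]) / (t + 1) for i in range(n)]
        total += sum(x)
        count += n
    return est, total / count


if __name__ == "__main__":
    estimates, centralized = simulate(ring_matrix(8), theta=2.0, steps=2000)
    print("centralized:", centralized)
    print("per-unit   :", estimates)
```

Because the matrix in this sketch is doubly stochastic, the average of the per-unit estimates coincides with the centralized mean at every round; how close each individual unit gets to it depends on how quickly A mixes, which is where the non-trivial eigenvalues of A come in.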
© JMLR 2016.