Stochastic Systems Group  

High-dimensional statistical inference: Practical and information-theoretic limits on sparsity recovery
Martin J. Wainwright
University of California, Berkeley
Classical asymptotics in statistics addresses the limit of large sample size n, with the model dimension p held fixed. In many inference problems arising in large systems, however, a more appropriate analysis allows p = p(n) to grow as a function of n, which leads to high-dimensional problems. One instance of such a high-dimensional inference problem is that of recovering the sparsity pattern of an unknown signal. This sparsity recovery problem arises in various areas of engineering and statistics, including signal denoising, constructive approximation, compressive sensing, and model selection, and has been studied intensively over the past decade.
Concretely, suppose that one takes n noisy observations of an unknown signal in p dimensions with at most s nonzero entries. Of interest is the minimum number of observations n that are required, as a function of the model dimension p and sparsity index s, to correctly estimate the support of the signal. For a broad class of random Gaussian measurement ensembles, we provide asymptotically sharp upper and lower bounds on the performance of a computationally efficient method, based on l_1-constrained quadratic programming (known as the Lasso in the statistics literature). In addition, we provide information-theoretic upper and lower bounds on the performance of any method, regardless of its computational efficiency, and compare them to the practical limits. We discuss connections to other work, and some open problems in this rapidly growing area. Time permitting, we also discuss related work on high-dimensional model selection in discrete Markov random fields, where qualitatively similar results are obtained (the latter joint work with Pradeep Ravikumar and John Lafferty).
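The measurement model described above can be sketched numerically. The following is a minimal illustration, not taken from the talk: it draws a random Gaussian measurement ensemble, forms an s-sparse signal, and solves the Lasso by iterative soft-thresholding (ISTA), a standard first-order method. All problem sizes, the noise level, and the penalty weight `lam` are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, s = 128, 32, 3  # observations, dimension, sparsity (illustrative)

# Random Gaussian measurement ensemble with roughly unit-norm columns,
# and an s-sparse signal with nonzero entries equal to 1.
X = rng.standard_normal((n, p)) / np.sqrt(n)
beta = np.zeros(p)
support = rng.choice(p, size=s, replace=False)
beta[support] = 1.0
y = X @ beta + 0.05 * rng.standard_normal(n)

# ISTA for the Lasso: a gradient step on the least-squares loss,
# followed by soft-thresholding at step * lam.
lam = 0.01                                # l_1 penalty weight (illustrative)
step = 1.0 / np.linalg.norm(X, 2) ** 2    # 1 / Lipschitz constant of the gradient
beta_hat = np.zeros(p)
for _ in range(500):
    grad = X.T @ (X @ beta_hat - y)
    z = beta_hat - step * grad
    beta_hat = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)

# Estimate the support as the s largest entries in magnitude.
support_hat = set(np.argsort(-np.abs(beta_hat))[:s])
```

In this well-conditioned regime (n comfortably larger than s log p), the estimated support typically matches the true one; the regimes of interest in the talk are precisely those where n is near the threshold at which such recovery first becomes possible.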