Stochastic Systems Group
High-dimensional statistical inference: Practical and Information-theoretic limits on sparsity recovery
Martin J. Wainwright
University of California, Berkeley
Classical asymptotics in statistics addresses the limit of large sample size n, with the model dimension p held fixed. In many inference problems arising in large systems, however, a more appropriate analysis allows p = p(n) to grow as a function of n, which leads to high-dimensional problems. One instance of such a high-dimensional inference problem is that of recovering the sparsity pattern of an unknown signal. This sparsity recovery problem arises in various areas of engineering and statistics, including signal denoising, constructive approximation, compressive sensing, and model selection, and has been studied intensively over the past decade.
Concretely, suppose that one takes n noisy observations of an unknown signal in p dimensions with at most s non-zero entries. Of interest is the minimum number of observations n required, as a function of the model dimension p and sparsity index s, to correctly estimate the support of the signal. For a broad class of random Gaussian measurement ensembles, we provide asymptotically sharp upper and lower bounds on the performance of a computationally efficient method based on ℓ1-constrained quadratic programming (known as the Lasso in the statistics literature). In addition, we provide information-theoretic upper and lower bounds on the performance of any method, regardless of its computational efficiency, and compare them to these practical limits. We discuss connections to other work, and some open problems in this rapidly growing area. Time permitting, we also discuss related work on high-dimensional model selection in discrete Markov random fields, where qualitatively similar results are obtained (the latter is joint work with Pradeep Ravikumar and John Lafferty).
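The setup described above can be sketched numerically: draw a random Gaussian measurement matrix, observe a noisy s-sparse signal, solve the Lasso, and read off the estimated support. The following is a minimal numpy-only sketch, not the speaker's actual experiments; it uses iterative soft-thresholding (ISTA) in place of a general quadratic-programming solver, and all dimensions, noise levels, and regularization constants are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative problem sizes: n observations, p dimensions, s non-zeros.
n, p, s = 100, 256, 5

# Random Gaussian measurement ensemble, columns scaled to roughly unit norm.
X = rng.standard_normal((n, p)) / np.sqrt(n)

# An s-sparse signal with entries of magnitude 1 on a random support.
beta_true = np.zeros(p)
support = rng.choice(p, size=s, replace=False)
beta_true[support] = rng.choice([-1.0, 1.0], size=s)

# Noisy observations y = X beta + w.
sigma = 0.01
y = X @ beta_true + sigma * rng.standard_normal(n)

def lasso_ista(X, y, lam, n_iter=2000):
    """Minimize 0.5*||y - Xb||^2 + lam*||b||_1 by iterative
    soft-thresholding -- a simple stand-in for a QP solver."""
    L = np.linalg.norm(X, 2) ** 2  # Lipschitz constant of the gradient
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y)
        z = b - grad / L
        b = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return b

# Regularization on the order of sigma*sqrt(2 log p), the scale of the
# maximal noise correlation for unit-norm columns (constant chosen by hand).
lam = 3.0 * sigma * np.sqrt(2.0 * np.log(p))
beta_hat = lasso_ista(X, y, lam)

# Estimated support: coordinates well above the noise/bias floor.
recovered = sorted(np.flatnonzero(np.abs(beta_hat) > 0.1).tolist())
print(recovered)
```

With these (favorable) parameter choices the recovered support coincides with the true one; shrinking n toward the information-theoretic scale, or raising the noise level, makes recovery fail, which is exactly the threshold behavior the talk quantifies.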