Stochastic Systems Group  

Professor Al Hero
Electrical Engineering and Computer Science,
University of Michigan
In this talk I will describe several applications of geometric probability methods to high-dimensional data analysis problems. Frequently, populations of measurements of objects such as faces, gene expression profiles, or internet data traces lie on lower-dimensional nonlinear surfaces within their high-dimensional embedding spaces. Knowing the intrinsic dimension and relative entropy of these manifolds is important for discovering structure, classifying differences, or performing dimensionality reduction (compression). We first treat estimation of the intrinsic dimension and intrinsic entropy of high-dimensional data sets supported on Riemannian manifolds and varieties. The estimators are provably consistent and are constructed in O(n log n) time from minimal spanning graphs, such as the minimal spanning tree (MST) or the k-nearest-neighbor graph (kNNG), over the entries in the dataset. Then we will show how minimal graphs can be used to estimate several information divergence measures arising in high-dimensional pattern matching and coregistration of a sequence of images.
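To make the graph-based dimension estimate concrete, here is a minimal sketch in Python. It is not the speaker's implementation. It assumes the standard growth-rate result for minimal graphs: for n points sampled from a d-dimensional manifold, the total kNN graph length with edge lengths raised to a power gamma grows roughly like n^((d - gamma)/d), so the slope a of log(length) against log(n) gives d = gamma / (1 - a). The function names `knn_graph_length` and `dimension_from_lengths` are hypothetical, and the brute-force neighbor search runs in quadratic time rather than the O(n log n) quoted in the abstract.

```python
import math
import random


def knn_graph_length(points, k=1, gamma=1.0):
    """Sum of (distance ** gamma) from each point to its k nearest neighbours.

    Brute force for clarity; a spatial index would be needed for O(n log n).
    """
    total = 0.0
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        total += sum(d ** gamma for d in dists[:k])
    return total


def dimension_from_lengths(sizes, lengths, gamma=1.0):
    """Fit log(length) = a * log(n) + b by least squares, return gamma / (1 - a)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(length) for length in lengths]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return gamma / (1.0 - slope)


if __name__ == "__main__":
    # Illustrative data (an assumption, not from the talk): a flat 2-D sheet
    # embedded in 3-D space, so the intrinsic dimension is 2.
    rng = random.Random(0)
    sample = []
    for _ in range(400):
        u, v = rng.random(), rng.random()
        sample.append((u, v, u + v))
    sizes = [50, 100, 200, 400]
    lengths = [knn_graph_length(sample[:n], k=4) for n in sizes]
    print("estimated intrinsic dimension:", dimension_from_lengths(sizes, lengths))
```

On small samples the estimate is noisy; averaging over several random subsamples at each size is the usual remedy.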