|Stochastic Systems Group|
Dr. John W. Fisher III
AI Lab, MIT
Independent components analysis and blind source separation of linear mixtures are related problems that have seen renewed interest in the machine learning community stemming from work by Jutten and Herrault ('91) and Comon ('94). Subsequently Bell and Sejnowski ('95) presented an information theoretic method to the source separation spawning a series of related approaches. It turns out that Bell and Sejnowski, while formulated from an information theoretic perspective, is formally equivalent to a maximum likelihood approach to source separation (i.e. strong modeling assumptions are implicit in the method). I will spend the first part of the talk showing the link between the two and how many ICA algorithms can be viewed as approximations to an equivalent ML estimation problem. This perspective is useful as it exposes underlying assumptions and explains failure modes in many algorithms.
Partially motivated by this link I will discuss a new algorithm for the ICA problem based on a variant of efficient entropy estimators from the statistics literature. The key ideas guiding our approach are:
(1) Since ICA is, by definition, about maximizing statistical independence, we attempt to directly optimize a measure of statistical independence, rather than a surrogate for this measure.
(2) We avoid explicit estimation of probability densities as an intermediate step. Indeed, given the formulation of the objective function, density estimation (even implicitly) is entirely unnecessary.
In extensive testing, the method outperforms all current methods of which we are aware, including the recently published Kernel-ICA method. Like many previous methods, this algorithm directly minimizes the measure of departure from independence according to the estimated Kullback-Leibler divergence between the joint distribution and the product of the marginal distributions. The entropy estimator used to achieve this minimization is consistent and exhibits rapid convergence. It is also computationally simple, requiring only O(N log N) time to compute. The estimator is highly robust to outliers, leading to a method which uniformly outperforms the best current methods in robustness tests.
(joint work with Dr. Erik Miller, UC Berkeley)
Problems with this site should be emailed to firstname.lastname@example.org