Stochastic Systems Group
Doctoral Student, AI Lab, MIT
An increasing number of parameter estimation tasks involve the use of at least two data sources that may provide conflicting information about the parameters of interest. Examples include classification with labeled and unlabeled samples, prior selection for Bayesian estimation, and various data fusion tasks such as acoustic and language model fusion in speech recognition. Standard estimation algorithms used in this context assume, implicitly or explicitly, a fixed weighting of the data sources. We show that estimation can be very sensitive to the choice of such weighting, in the sense that small changes in source allocation can lead to a dramatic loss in performance. We demonstrate that such instability occurs at well-defined, data-dependent source allocations, and we introduce an algorithm that locates the critical allocations and provides a stable estimate by continuously tracing local maxima of the objective from full weight on one data source to full weight on the other. This homotopy continuation algorithm has general theoretical guarantees and properties that make it applicable not only to estimation from multiple sources, but also to discriminative learning, and even to finding equilibria of competitive (minimax) estimation problems, an example of which we give in the context of DNA binding motif discovery. We also explore the link between homotopy continuation and standard algorithms such as EM, a connection that reveals interesting properties of EM itself.
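The idea of tracing a local maximum while shifting weight between two sources can be illustrated with a small sketch. It is not the algorithm presented in the talk: it uses naive warm-started gradient ascent on a grid of allocations rather than a proper homotopy continuation method, and the two objectives `f1` and `f2`, the scalar parameter `theta`, and all function names are hypothetical choices made only for illustration. The combined objective is assumed to be J(theta, lam) = (1 - lam) * f1(theta) + lam * f2(theta), with `lam` the weight on the second source.

```python
def grad(theta, lam):
    """Gradient in theta of J = (1 - lam) * f1 + lam * f2 (toy objectives, assumed)."""
    g1 = -4.0 * theta * (theta ** 2 - 1.0)  # f1 = -(theta^2 - 1)^2, local maxima at -1 and +1
    g2 = -2.0 * (theta - 2.0)               # f2 = -(theta - 2)^2, single maximum at 2
    return (1.0 - lam) * g1 + lam * g2


def trace_path(theta0=-1.0, n_steps=100, step=0.01, max_iter=20000, tol=1e-10):
    """Follow a local maximum of J as lam moves from 0 to 1 on a uniform grid.

    Each allocation is solved by fixed-step gradient ascent started from the
    solution found at the previous allocation (warm start).
    """
    lams = [k / n_steps for k in range(n_steps + 1)]
    path = []
    theta = theta0
    for lam in lams:
        for _ in range(max_iter):
            g = grad(theta, lam)
            if abs(g) < tol:
                break
            theta += step * g
        path.append(theta)
    return lams, path


def critical_allocation(lams, path):
    """Return the midpoint of the grid interval with the largest jump in the estimate."""
    jumps = [abs(b - a) for a, b in zip(path, path[1:])]
    k = max(range(len(jumps)), key=jumps.__getitem__)
    return 0.5 * (lams[k] + lams[k + 1]), jumps[k]


lams, path = trace_path()
lam_crit, jump = critical_allocation(lams, path)
print(f"estimate at lam=0: {path[0]:.3f}, at lam=1: {path[-1]:.3f}")
print(f"largest jump {jump:.3f} near allocation lam ~ {lam_crit:.3f}")
```

In this toy problem the tracked maximum starts at theta = -1 and moves smoothly until the allocation reaches roughly 0.23, where that local maximum disappears and the estimate jumps to the other branch before settling at theta = 2. The jump is the kind of data-dependent critical allocation the abstract refers to: a small change in weighting near that point changes the estimate abruptly.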
Joint work with Tommi Jaakkola.