Nonparametric Belief Propagation


Nonparametric Belief Propagation
(NBP) is an inference algorithm for graphical models containing continuous, non-Gaussian random variables. NBP extends the popular class of particle filtering algorithms, which assume variables are related by a Markov chain, to general graphs. NBP's computations are based on efficient sampling algorithms which avoid the need to discretize high-dimensional spaces. Currently, NBP has been applied to several computer vision problems, but it also has many potential applications in other fields.

Algorithm

Belief Propagation

The Belief Propagation (BP) algorithm solves graphical inference problems via a series of local message-passing operations. For example, to compute the outgoing (red) message at left, the central node must combine all incoming messages (blue) with its local observation. In tree structured graphs, BP is exact, and messages can be interpreted as sufficient statistics. In graphs with cycles, BP is approximate, but has reasonable theoretical justifications and produces excellent empirical results in many applications. BP may be applied to graphs with discrete (messages are vectors) or Gaussian (messages are means and covariances) variables.


Products of Gaussian Mixtures

In graphical models with continuous, non-Gaussian variables, there is usually no tractable analytic form for the messages underlying BP. NBP addresses this problem by representing each message nonparametrically as a Gaussian mixture. If performed exactly, each message update would then produce a new product mixture with an exponentially large number of components (see diagram at right). NBP approximates this product mixture using an efficient Gibbs sampling procedure, allowing a natural tradeoff between statistical accuracy and computational efficiency.

 

Multiscale Methods for Efficient Sampling

In most NBP applications, the primary computational burden is the sampling underlying each message update. We have developed a pair of algorithms which use multiscale, KD-tree representations to sample from Gaussian mixture products far more efficiently than conventional methods. These algorithms can be tuned to sample to a user-specified degree of accuracy, and exhibit excellent empirical performance.


Publications

Collaborators

 
Erik Sudderth
esuddert mit.edu
 
Alexander Ihler
ihler mit.edu
 
William Freeman
billf ai.mit.edu
 
Alan Willsky
willsky mit.edu

 

 

 

Code

 
Kernel Density Estimation
(Matlab/C++)
History

 


Examples

Hand Tracking

Accurate visual detection and tracking of 3D articulated objects is a challenging problem with applications in human-computer interfaces, motion capture, and scene understanding. We have used NBP to develop a tracking algorithm which exploits the statistical structure of a kinematic hand model to improve robustness and efficiency.

We describe the hand using 16 rigid bodies constrained by revolute joints (top left). The tracking algorithm uses NBP to search for 3D configurations whose projections match image-based edge and color cues (top right). Given a single frame, NBP propagates information about likely configurations of each finger to infer overall hand position. Our graphical model of hand dynamics (lower right) allows NBP to track hand motion through image sequences. Two result videos (lower left) show projections of the marginal position estimates produced by our tracker. Note that the "Grasping" sequences shows some robustness to self-occlusion.

For further details, additional results, and references, please see our technical report presented at the 2004 CVPR Workshop on Generative Model Based Vision.

Rotation
Video
Sequence
Grasping
Video
Sequence
  Occlusion Video Sequence          
 

 

Facial Components

Component-based visual appearance models provide the basis for many of the best face detection and recognition algorithms. We have extended these approaches to model not only the appearance of individual facial features, but also the relationships between the location and appearance of different features.

Using manually aligned training images (top left), we built a nonparametric graphical prior model (top right) for the appearance and location of five different facial features. The model includes an outlier process to allow detection of occluded features. The results of using NBP to fit this model to two images are shown to the right. In both cases, feature positions are correctly identified (middle column). By observing the squinting eyes of the subject in the lower photo, and using relationships learned from the training data, NBP correctly infers (right column) that the hidden mouth is most likely smiling

For further details, additional results, and references, please see our CVPR 2003 paper.

Training data (including image at right) courtesy of AR Face Database.

 

Stereo Vision

Graphical models are often used to regularize the otherwise ill-posed problem of estimating the depth of three-dimensional objects from a pair of input images. If the range of possible inverse depths, or disparities, is discretized, discrete BP may be used to produce the depth estimates at lower left. The results for NBP (lower right) verify that its sampling updates closely approximate the true BP messages. Moreover, NBP may also be applied to higher dimensional problems (often arising from more sophisticated prior models) for which discretization is intractable.

For further details, additional results, and references, please see our LIDS technical report.

 

 

Stereo test data courtesy of Middlebury Stereo Database.

 

Left Input Image

True Disparity Map

Discrete BP Disparity Map

NBP Disparity Map

Last Updated