New Algorithm: Pyramid Matching Kernel

by JS

A kernel, briefly, is a function of two vectors that behaves like a dot product. You can think of it as sort of a generalized dot product. Kernels are very important in machine learning because they are a way of implicitly defining distance functions between objects.

An algorithm that needs to compare two objects can compute a bunch of features for each object (which form a vector), then use the kernel to compare the two feature vectors, and get something that “like” a distance between the objects. You can then determine things like class boundaries that reside in these “kernelized feature spaces.” As you can probably guess, things get abstract quickly when thinking about kernels, and new useful kernels are not discovered very often.

That’s one reason the pyramid matching kernel is so interesting.  Another is that it can deal with objects that have different numbers of computed features. This often happens in “bag-of-feature” approaches in machine learning. The pyramid matching kernel takes bags of features and computes a bunch of histograms of varying sizes over the feature sets. The kernel magic arises out of how these histograms are compared.

As an example consider three objects (e.g. images, audio, sensory input of some kind) each with different bags of 2-d features (click to enlarge):features

I generated feature bags one and two randomly. Feature bag three is a random subset of feature bag one, with each feature perturbed by a small random amount. This is a toy version of what feature bags for real classification tasks on rich inputs actually look like (the bags usually contain many more features, and each feature usually has a lot more than 2 dimensions), but the essentials are all here.

After projecting the features into histograms of many different bin sizes, the pyramid matching kernel computes the histogram intersections. The intuition here is that at small bin sizes, only feature bags whose features are often proximal will have similar looking histograms, and hence intersections that have high counts.

Bags one and two, generated uniformly at random, won’t have bins that share  count numbers, and this is precisely what we see:

f1f2

But even for small bin sizes, bags one and three, which are similar, should have more counts in common:f1f3There’s a lot more to pyramid kernel matching, which you can read about in CACM:

Efficiently Searching for Similar Images. K. Grauman. Invited article to appear in the Communications of the ACM, 2009.

Or in the original paper on the subject:

The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features. K. Grauman and T. Darrell. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Beijing, China, October 2005.

The Python implementation (bare bones and probably buggy) that I used to generate the plots is available here.

  • Digg
  • del.icio.us
  • Facebook
  • Google Bookmarks
  • Tumblr
  • Twitter