The next algorithm in my continuing series of short, hackable implementations of common machine learning algorithms is fitting a Gaussian mixture model through expectation maximization.
This example follows section 9.2 in Bishop’s PRML. You can think of this kind of EM as “soft” clustering. We assume that the data has clusters, and that the cluster that any particular data point belongs to is missing information. It is precisely this kind of hidden information that EM attempts to recover.
You can think of the algorithm as guessing the hidden cluster for each point, then, assuming that guess is correct, figuring out what the remaining distribution parameters should be. Then it guesses again and recomputes. Repeating this process often yields a useful estimate of the hidden parameters, as well as explaining the visible data.
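To make the guess-and-recompute loop concrete, here is a minimal sketch for a one-dimensional mixture in plain Python. It is not the code from the original post: the function names, the quantile-based initialization, the variance floor, and the synthetic two-cluster data are all assumptions made for illustration. The E step computes "soft" cluster memberships (responsibilities), and the M step re-estimates weights, means, and variances as if those memberships were correct.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, k=2, iters=100):
    """Fit a k-component 1D Gaussian mixture with plain EM (illustrative sketch)."""
    n = len(data)
    ordered = sorted(data)
    # Assumed initialization: means at evenly spaced quantiles,
    # a shared pooled variance, and equal mixing weights.
    means = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    grand_mean = sum(data) / n
    pooled_var = sum((x - grand_mean) ** 2 for x in data) / n
    variances = [pooled_var] * k
    weights = [1.0 / k] * k

    for _ in range(iters):
        # E step: soft-assign each point to every component.
        resp = []
        for x in data:
            joint = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint)
            resp.append([p / total for p in joint])

        # M step: re-estimate parameters, treating the soft assignments as given.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)  # assumed floor to avoid a collapsed component
            weights[j] = nj / n

    return weights, means, variances


# Synthetic data: two clusters whose labels are "hidden" from the algorithm.
rng = random.Random(1)
data = [rng.gauss(-2.0, 0.5) for _ in range(300)] + [rng.gauss(3.0, 1.0) for _ in range(300)]
weights, means, variances = em_gmm_1d(data)
print(weights, means, variances)
```

On data this well separated, the recovered means should land near the two cluster centers, but EM only finds a local optimum, so a different initialization can give a different answer on harder data.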
It’s been a while since I posted a new algorithm. I’ve been reading quite a bit on Monte Carlo methods, and in particular Markov chains. I came across some pseudocode for what the authors of Monte Carlo Statistical Methods call a 2D slice sampler.
Check it out!
Now I suppose the primary difficulty in defining a slice sampler is in finding good closed forms for the uniform sample bounds.
Here’s the result of running the code:

The sample distribution adheres fairly closely to the desired distribution.
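As a small illustration of the point about closed-form bounds, here is a sketch of a 2D slice sampler for a target where the bounds are easy to write down. This is not the code or the target from the post: the unnormalized standard normal density f(x) = exp(-x²/2), the function name, and the parameters are all assumptions chosen for illustration. For this f, the slice {x : f(x) ≥ u} is the interval [-b, b] with b = sqrt(-2 ln u), so both uniform draws are trivial.

```python
import math
import random


def slice_sample_std_normal(n_samples, x0=0.0, seed=0):
    """2D slice sampler for the unnormalized density f(x) = exp(-x^2 / 2)."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        # Vertical step: u ~ Uniform(0, f(x)).
        # 1 - rng.random() lies in (0, 1], so u is never exactly zero.
        u = (1.0 - rng.random()) * math.exp(-0.5 * x * x)
        # Horizontal step: x ~ Uniform over the slice {x : f(x) >= u} = [-b, b].
        bound = math.sqrt(-2.0 * math.log(u))
        x = rng.uniform(-bound, bound)
        samples.append(x)
    return samples


samples = slice_sample_std_normal(20000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, var)
```

The sample mean and variance should come out close to 0 and 1. For targets where the slice can’t be solved for in closed form, the horizontal step is where the real work goes.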
Here’s a figure that keeps mysteriously appearing in presentations.

It is a cartoon representation of model evidence (from Bishop’s Pattern Recognition and Machine Learning), but it often seems to be mistaken for Bayesian model comparison generally.
I’ve become increasingly convinced that I need to understand both applied and theoretical Bayesian inference. Since the department offers no courses on the subject (Engineering might, but that will have to wait for another semester), I’m collecting library books that deal (sometimes tangentially) with the subject.
The library has a lot of books that have one or more of the words Bayesian, statistics, inference or probability in the title.
I picked four at random to start:
1. Basic Principles and Applications of Probability Theory
2. Kendall’s Advanced Theory of Statistics, Volume 2B: Bayesian Inference
3. Bayesian Core: A Practical Approach to Computational Bayesian Statistics
4. Foundations of Modern Probability
We’ll see how it goes.
I’m clearing out my draft posts without actually trying to flesh them out. Anyway, here are some questions I’m thinking about. As you may be able to infer, I’m trying to teach myself statistics.
Natural conjugate priors: the prior has the same functional form as the likelihood. Is there a category-theoretic explanation of “natural” in this context? Things I’m trying to understand: exponential families, sufficient statistics, natural conjugate priors. Some websites:
OCW 1
OCW 2
OCW 3
OCW 4
OCW 5
OCW 6
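One standard example of the “same functional form” idea from the notes above is the Beta prior for a Bernoulli likelihood. The likelihood of s successes and f failures is proportional to θ^s (1-θ)^f, and a Beta(α, β) prior is proportional to θ^(α-1) (1-θ)^(β-1), so the posterior is again a Beta, with the counts simply added to the prior parameters. The sketch below is only an illustration: the function name, the prior values, and the coin-flip data are made up.

```python
def beta_bernoulli_update(alpha, beta, observations):
    """Return posterior Beta parameters after observing a list of 0/1 outcomes."""
    successes = sum(observations)          # sufficient statistic: number of ones
    failures = len(observations) - successes
    return alpha + successes, beta + failures


prior = (2.0, 2.0)                         # hypothetical prior pseudo-counts
flips = [1, 0, 1, 1, 0, 1, 1, 1]           # hypothetical data: 6 successes, 2 failures
posterior = beta_bernoulli_update(prior[0], prior[1], flips)
posterior_mean = posterior[0] / (posterior[0] + posterior[1])
print(posterior, posterior_mean)
```

Note that the update only needs the success and failure counts, not the individual flips, which is the sufficient-statistic part of the story.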