Chunlei Chen: Gaussian Mixture

EM of GMMs with GPU Acceleration (CUDA)

Sample Expectation Maximization; red Xs represent initial guessed means.

About:

This is a parallel implementation of the Expectation Maximization algorithm for Gaussian Mixture Models, designed to run on NVidia graphics cards supporting CUDA. On my machine*, it provides up to 170x performance increases versus a CPU reference version.
See the report for more information.
The interesting code is all in gpugaumixmod.h and gpugaumixmod_kernel.h. The reference CPU implementation is included in cpugaumixmod.h.
It can be integrated into any C program on a CUDA enabled system. Additionally, Matlab integration is provided in gmm.cu.

Expectation Maximization is a powerful method for converging to a local maximum. K-means clustering is a special case of expectation maximization of Gaussian clusters.

It was created originally as a term project for my Computational Statistics with Application to Bioinformatics class. See the report for more information.

E-mail me with any questions or comments!
*CUDA 2.1, GTX285, C2D E8400 @ 3GHZ, 4GB ram, Vista 64-bit

Downloads:

Project Report
Parallel EM/GMM Matlab/C++ Source (last update: 5/21/2009)
*bonus* Parallel IQAgent C++ Source (distribution sampler)

Implementation

The interesting code is all in gpugaumixmod.h and gpugaumixmod_kernel.h.
The reference CPU implementation is in gaumixmod.h, which is from Numerical Recipes 3.
It can be integrated into any C program on a CUDA-enabled system. Additionally,
Matlab Mex integration is provided in gmm.cu.

Compiling

You may have trouble compiling as-is, as the config files are set up to
run on my Windows Vista 64bit machine, but it's just a standard Cuda kernel
underneath so it should be portable. A precompiled Windows 64-bit version is
included.
See compile.m for the command I use to compile the CUDA/Mex files.
You'll need the CUDA drivers and toolkit from Nvidia:
http://www.nvidia.com/object/cuda_get.html You'll also want to install the MATLAB plug-in for CUDA: http://developer.nvidia.com/object/matlab_cuda.html">http://www.nvidia.com/object/cuda_get.html
You'll also want to install the MATLAB plug-in for CUDA:
http://developer.nvidia.com/object/matlab_cuda.html

Running

Once compiled, start off by running gmm_example in Matlab to see it in action.
See experiment1, experiment2, experiment3 for ready-to-run experiments

CUDA kernel performance

Updates:

Initial release May 6, 2009
Updated May 8, 2009:
-Fixed synchronization issues in kernel
-Cleaned up mex wrapper
-Cleaned up some potential memory issues
-Combined GpuGauMixmod and GauMixmod classes into single class
Updated May 10, 2009:
-Performance increases, doubling speed vs previous version.
Best case it performs 170x reference version speed
(16 dims, 16 clusters, 1000000 data points).
   *Unrolled some loop tails
   *Decomposed problem even more
   *Changed unnecessary double to float
   *Using fast exp and log functions now.
*Memcopy to GPU is somewhat lazier.
-Updated plotting and experiment code
-Uncombined GpuGauMixmod and GauMixmod classes again.
Updated May 21, 2009:
-Now handles simultaneous random restarts. Just make the
matrix passed to it 3-dimensional, with the 3rd dimension containing different
restarts.
*The step function will return a list of log likelihoods for the different restarts.
*The functions for querying response and mean/sigmas now take an optional second
   parameter for getting the data for the corresponding restart.

Chunlei Chen

Search This Blog

Thursday, March 24, 2011

Gaussian Mixture

Andrew's Insidious Plot