The chapter started with a review of the preceding chapters and a promise to discuss the standard statistical methods for classifying points. The first section, however, comprised a short homily on the merits of investigating the data by inspecting the numbers. Most of us who have worked with data collected by other people have cautionary tales of its unreliability and the general fecklessness of other folk. Let us make sure nobody speaks so of us. Let us, in fact, scrutinise all data in as many ways as possible, and approach it on the hypothesis that it has been collected by politicians. Let us in particular project the data down from dimension umpteen onto the computer screen, and inspect it as if it might be suffering from a social disease.
The next section was where I came good on the much more practical matter of how you did the sums. I showed how you could compute the ML gaussian for a set of data points in $\mathbb{R}^n$. If the data set doesn't look as if one category of data point is usefully describable by a single gaussian, I indicated that the EM algorithm can be used, after suitable initialisation, to get a (locally) ML gaussian mixture model. The idea is to take each cluster of points in one category and model the cluster as closely as possible by some number of gaussians.
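As a reminder of what the sums for the single gaussian amount to, here is a minimal sketch in plain Python. The function name `ml_gaussian` and the four sample points are made up for illustration and are not taken from the chapter. The one thing worth noticing is that the ML covariance divides by the number of points N, not by N - 1.

```python
def ml_gaussian(points):
    """Return the maximum-likelihood mean vector and covariance matrix.

    `points` is a list of equal-length tuples (hypothetical input format).
    The covariance divides by N, which is the ML estimate, not by N - 1.
    """
    count = len(points)
    dim = len(points[0])
    # Mean: the centroid of the data.
    mean = [sum(p[i] for p in points) / count for i in range(dim)]
    # Covariance: average of the outer products of the centred points.
    cov = [
        [
            sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / count
            for j in range(dim)
        ]
        for i in range(dim)
    ]
    return mean, cov


# Illustrative data only: the corners of a square in the plane.
sample = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
mean, cov = ml_gaussian(sample)
print(mean)  # [1.0, 1.0]
print(cov)   # [[1.0, 0.0], [0.0, 1.0]]
```

A mixture model fitted by EM repeats essentially this computation for each component, with every point weighted by how strongly that component claims it.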
The following section showed what to do when two clusters had been modelled by competing pdfs. It derived the Bayes Optimal Classifier rule. It also threw in, for good measure, a quick and dirty rationalisation for the use of neighbourhood counting methods, and a discussion of the Bayesian modelling of data by choosing a prior pdf and having the data convert us to a posterior pdf.
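For the simplest case, where every kind of misclassification is taken to cost the same, the rule reduces to picking the category whose prior times likelihood is largest. The sketch below assumes that equal-cost case and one-dimensional gaussian pdfs; the names `gaussian_pdf` and `bayes_classify`, and all the numbers, are hypothetical choices for illustration.

```python
import math


def gaussian_pdf(x, mean, variance):
    """Density of a one-dimensional gaussian at x."""
    norm = math.sqrt(2.0 * math.pi * variance)
    return math.exp(-((x - mean) ** 2) / (2.0 * variance)) / norm


def bayes_classify(x, models):
    """Pick the label maximising prior * likelihood.

    `models` maps label -> (prior, mean, variance). This assumes equal
    misclassification costs; unequal costs would reweight each term.
    """
    def score(label):
        prior, mean, variance = models[label]
        return prior * gaussian_pdf(x, mean, variance)

    return max(models, key=score)


# Illustrative models only: two unit-variance gaussians centred at -1 and +1.
equal_priors = {"a": (0.5, -1.0, 1.0), "b": (0.5, 1.0, 1.0)}
skewed_priors = {"a": (0.9, -1.0, 1.0), "b": (0.1, 1.0, 1.0)}

print(bayes_classify(0.5, equal_priors))   # b
print(bayes_classify(0.5, skewed_priors))  # a
```

The second call shows the prior doing its work: the point at 0.5 sits nearer the mean of category b, but a sufficiently lopsided prior hands it to category a anyway.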
Finally, I described how the Rissanen Stochastic Complexity/Minimum Description Length approach allows one to deal with the vexing question of the order of a model. I took some 18 points judiciously arranged in the interval [-2,2] and showed how, by considering the overhead of specifying the model, it was possible to justify formally the use of two gaussians instead of one. I described the alternative AIC rather briefly but gave you a rule of thumb which can be applied. And really finally, I wrote this summary, the exercises and the bibliography.