In the last chapter I sketched Rissanen's theory of Stochastic Complexity, as described in his book, Rissanen (1989), and in his papers (see the bibliography of the last chapter). As explained there, it offers a philosophically quite different approach to finding the `best' model to account for the data, and one which avoids some of the objections I have raised against the classical theory. In particular, it appears to solve elegantly the issue of how many gaussians one ought to have in a gaussian mixture model for a particular data set. (The answer may be `none', if the data set is small enough!)
As explained earlier, the general question of the order of a model arises in other settings too. At the simple level of fitting a polynomial to a set of data points, one must ask what degree of polynomial is appropriate. At one extreme the degree can be made so high that every point is fitted exactly; at the other we can assume that the polynomial is constant. The same problem arises in the case of Neural Nets, as we shall see in a later chapter.
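The two extremes of polynomial fitting are easy to see numerically. The following is a minimal sketch, not taken from the text: the data (a straight line plus noise), the sample size of 8 and the names `rss` and `degree` are all assumptions made for illustration. It fits polynomials of every degree from 0 to n-1 by least squares and records the residual sum of squares.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Hypothetical data: 8 points on a straight line plus gaussian noise.
rng = np.random.default_rng(0)
n = 8
x = np.linspace(-1.0, 1.0, n)
y = 0.5 * x + rng.normal(0.0, 0.2, n)

# Least-squares fit for each degree, from constant up to interpolating.
rss = []
for degree in range(n):
    p = Polynomial.fit(x, y, degree)
    rss.append(float(np.sum((y - p(x)) ** 2)))

for degree, r in enumerate(rss):
    print(f"degree {degree}: residual sum of squares = {r:.3e}")
```

The residual never goes up as the degree rises, and at degree n-1 the polynomial passes through every point, so the fit to the data alone cannot tell us when to stop.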
The gaussian mixture modelling problem, namely how many gaussians we should use, makes the issue quite stark. The log-likelihood of the data with respect to the best-fitting model increases as we increase the number of gaussians, until the degenerate case of non-invertible covariance matrices makes it infinite. But there is a price to be paid in the number of parameters needed to specify the growing number of gaussians. It is sensible to penalise the model by setting the gain in log-likelihood against the corresponding increase in the number of parameters, and to seek the model which minimises the combination of negative log-likelihood and parameter cost. But then the question arises: what should be the rate of exchange between log-likelihood and parameter count? Rissanen gives us a natural and plausible rate of exchange. In this section I shall give some examples of computations on fitting gaussians in one dimension to data; see the discussion in the last chapter for the framework of thought which justifies the procedure.
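To make the trade-off concrete, here is a minimal sketch of such a penalised fit in one dimension. Several things in it are assumptions rather than the procedure of the last chapter: the rate of exchange is taken to be (k/2) log n nats for k parameters and n data points, which is the leading-order form commonly quoted for Rissanen's criterion; a mixture of m gaussians in one dimension is counted as having 3m - 1 free parameters (m means, m variances, m - 1 independent weights); the fit is a plain EM iteration with a variance floor to keep the likelihood finite; and the data set and the names `fit_mixture_1d`, `n_params` and `penalised_score` are hypothetical.

```python
import numpy as np

def log_gauss(x, mu, var):
    """Log density of N(mu, var) at x; broadcasts over components."""
    return -0.5 * (np.log(2 * np.pi * var) + (x - mu) ** 2 / var)

def fit_mixture_1d(x, m, iters=200, var_floor=1e-3):
    """Plain EM for m gaussians in one dimension; returns the log-likelihood."""
    # Start with means at evenly spaced quantiles, a common variance
    # and equal weights.
    mu = np.quantile(x, (np.arange(m) + 0.5) / m)
    var = np.full(m, x.var())
    w = np.full(m, 1.0 / m)
    for _ in range(iters):
        # E-step: responsibilities, computed in the log domain for stability.
        logp = np.log(w) + log_gauss(x[:, None], mu, var)  # shape (n, m)
        r = np.exp(logp - np.logaddexp.reduce(logp, axis=1)[:, None])
        # M-step: re-estimate weights, means and variances.
        nk = r.sum(axis=0) + 1e-12  # guard against an empty component
        w = nk / nk.sum()
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        # Floor the variance so no gaussian collapses onto a single point.
        var = np.maximum(var, var_floor)
    logp = np.log(w) + log_gauss(x[:, None], mu, var)
    return float(np.logaddexp.reduce(logp, axis=1).sum())

def n_params(m):
    """Free parameters of m gaussians in 1-D: means, variances, weights."""
    return 3 * m - 1

def penalised_score(x, m):
    """Negative log-likelihood plus the assumed (k/2) log n penalty, in nats."""
    return -fit_mixture_1d(x, m) + 0.5 * n_params(m) * np.log(len(x))

# Hypothetical data: two well separated clusters of 200 points each.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-3.0, 1.0, 200), rng.normal(3.0, 1.0, 200)])

scores = {m: penalised_score(x, m) for m in (1, 2, 3, 4)}
best_m = min(scores, key=scores.get)
for m, s in scores.items():
    print(f"{m} gaussians: penalised score = {s:.1f}")
print("preferred number of gaussians:", best_m)
```

On data like these, going from one gaussian to two gains far more log-likelihood than the three extra parameters cost, so the score for two is much lower than the score for one. Beyond that point each extra gaussian costs a further (3/2) log n nats, and it must earn at least that much in log-likelihood to be preferred.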