
| data point | likelihood | log-likelihood |
|-----------:|-----------:|---------------:|
| -1.2 | 0.4293 | -0.845567 |
| -1.1 | 1.16699 | 0.154433 |
| -1.1 | 1.16699 | 0.154433 |
| -1.0 | 1.628675 | 0.487767 |
| -1.0 | 1.628675 | 0.487767 |
| -1.0 | 1.628675 | 0.487767 |
| -0.9 | 1.16699 | 0.154433 |
| -0.9 | 1.16699 | 0.154433 |
| -0.8 | 0.4293 | -0.845567 |
| 0.8 | 0.4293 | -0.845567 |
| 0.9 | 1.16699 | 0.154433 |
| 0.9 | 1.16699 | 0.154433 |
| 1.0 | 1.628675 | 0.487767 |
| 1.0 | 1.628675 | 0.487767 |
| 1.0 | 1.628675 | 0.487767 |
| 1.1 | 1.16699 | 0.154433 |
| 1.1 | 1.16699 | 0.154433 |
| 1.2 | 0.4293 | -0.845567 |

This gives the sum of the log-likelihoods as 0.78, which tells us that the model is pretty bad. I have taken natural logarithms here; converting from nats to bits, I get 1.1 bits.
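Since the arithmetic is easy to get wrong, here is a quick check of the sum in Python (the variable names are mine, not the text's):

```python
import math

# Likelihoods quoted in the table above (two-gaussian model), grouped by value:
likelihoods = (
    [0.4293] * 4        # x = -1.2, -0.8, 0.8, 1.2
    + [1.16699] * 8     # x = -1.1, -0.9, 0.9, 1.1 (two points each)
    + [1.628675] * 6    # x = -1.0, 1.0 (three points each)
)

log_lik_nats = sum(math.log(p) for p in likelihoods)
log_lik_bits = log_lik_nats / math.log(2)   # convert nats to bits

print(round(log_lik_nats, 2))   # 0.78
print(round(log_lik_bits, 1))   # 1.1
```

This confirms the 0.78 nats and 1.1 bits quoted in the text.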
The bit cost is therefore a total of

(18)(6) - 1.1 + (2)(6)(3)

bits, that is, 142.9.
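The three terms above are the raw data cost, the log-likelihood saving, and the model description cost. A one-line sketch of the same sum, with my own (hypothetical) names for the terms:

```python
n_points, bits_per_point = 18, 6                  # raw cost of sending the data
log_lik_bits = 1.1                                # saving from the two-gaussian model
n_gaussians, n_params, bits_per_param = 2, 3, 6   # cost of describing the model

total = (n_points * bits_per_point
         - log_lik_bits
         + n_gaussians * n_params * bits_per_param)
print(round(total, 1))   # 142.9
```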
The corresponding calculation for a single gaussian gives likelihoods
of around 0.22 for every data point, so the sum of the log-likelihoods
is about -27.25 nats. This works out at around
a 39.3 bit disadvantage. This should tell you
that a bad model is worse than the uniform model,
which is to say worse than sending the data. The
bit cost of using a single gaussian centred on
the
origin therefore works out at
(18)(6) + 39.3 + (1)(6)(3)

bits, that is, 165.3. This is bigger, so it is better to use two gaussians than one to model the data, but it is even better to just send the data at a cost of 108 bits. You should be prepared to believe that more data that fitted the model more convincingly would give a genuine saving. Even I think I could believe that, although, of course, I can't be sure I could. Better check my arithmetic; it has been known to go wrong in the past.
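Taking up that invitation, here is a sketch that checks the whole comparison, assuming (as the text says) that every likelihood under the single gaussian is about 0.22:

```python
import math

# One gaussian centred on the origin: likelihood ~ 0.22 at each of the 18 points.
log_lik_nats = 18 * math.log(0.22)          # about -27.25 nats
penalty_bits = -log_lik_nats / math.log(2)  # about 39.3 bits

raw_cost = 18 * 6                           # 108 bits: just send the data
one_gauss = raw_cost + penalty_bits + 1 * 3 * 6
two_gauss = raw_cost - 1.1 + 2 * 3 * 6

print(round(penalty_bits, 1))  # 39.3
print(round(two_gauss, 1))     # 142.9
print(round(one_gauss, 1))     # 165.3
```

Raw data (108) beats two gaussians (142.9), which beats one gaussian (165.3), exactly the ordering argued in the text.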