Next: Bibliography
Up: Decisions: Statistical methods
Previous: Summary of Chapter
- 1.
- Write a program to generate some random dots
in a more or less gaussian distribution on the
real line. Now modify it so that it generates a
two dimensional, approximately gaussian cluster,
so that if I specify the mean and covariance
matrix I get a suitable point set. There are good
and efficient ways of doing this, but yours can
be quick and dirty. Next compute the mean and
covariance of the data so generated. Does it bear
a passing resemblance to what you started out
with? Can you account for any discrepancies? Draw
the one standard deviation ellipse for your data.
- 2.
- Use the above program to generate a `gaussian
hand' as in Fig. 4.1. Try to use the EM
algorithm with random initialisation to find a
maximum likelihood fit for six gaussians. What
conclusions do you draw about the algorithm?
- 3.
- Get someone to use your program to generate
a mixture of gaussians and then try to fit the
mixture without knowing how many terms are in the
mixture. Use the AIC and the Rissanen Stochastic
Complexity measures to find the optimum number
of model elements. Try this with varying numbers
of points and varying numbers of mixture terms in
various dimensions. Do the results surprise you?
Depress you?
- 4.
- Photocopy an image of a real hand by putting
your palm on the photocopier, cut it out, paint
it black and scan it into a TIFF file. Use the EM
algorithm to represent the pixel array as a gaussian
mixture model. Can you find an initialisation
process which gives reasonable results regardless
of the orientation or position of the hand in
the image? Are the results worse than for the
gaussian hand?
- 5.
- The six quadratic forms specifying the components
of a hand comprise six points in $\mathbb{R}^{5}$.
Compute the centre and covariance matrix for these
points. This gives 5 numbers for the centre and
a further 15 numbers for the covariance matrix.
This makes a hand into a point in $\mathbb{R}^{20}$.
Repeat with a new handprint in the same location
and the same orientation, and with the same
handedness, i.e. stick to right hand prints, giving
a set of points in $\mathbb{R}^{20}$. Now fit a
gaussian to these points. Repeat by turning the
paper hand over so as to reverse its `handedness'.
Obtain a similar cluster for these mirror hands.
Now test to see if you can classify left hands and
distinguish them from right hands by looking to
see which cluster they fall closest to. How badly
do things go wrong if the fingers are sometimes
together? (With judicious and ingenious
initialisation, it is possible to get consistent
fits and reliable discrimination.)
- 6.
- Repeat with paper cut-outs of aeroplane
silhouettes, and see if you can distinguish between
a fighter and a passenger jet. Make them about the
same size, so it isn't too easy.
- 7.
- Try gaussian mixture modelling an image
of some Chinese characters. Can you use the same
trick of describing the gaussians by computing
gaussians for them (UpWriting) in order to tell
different characters apart?
- 8.
- Compute Zernike moments for some characters
of printed English text and collect examples
of the same character. Examine the clustering for
variants of the characters by projecting down onto
the computer screen (the programs with the book will
save you some coding if you have a workstation).
Compare the character `9' with a comma. Amend
the Zernike moments to give a size measure.
- 9.
- Construct a program which does printed Optical
Character Recognition. You will find problems
with characters joined together and characters
which are disconnected; you will find limited font
independence a problem. This is a lot of work
but very educational, and later exercises
will allow you to extend the scope of it considerably.
It should be possible to get moderately good
recognition results for clean, correctly oriented
text by several of the methods outlined in the
preceding chapters.
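As a possible starting point for Exercise 1, here is a minimal sketch in Python, using only the standard library. It is one quick and dirty route among many, not a prescribed solution: the function names are hypothetical, the covariance matrix is assumed to be symmetric and positive definite, and the sampler multiplies pairs of independent unit normals by a hand-written 2x2 Cholesky factor.

```python
import math
import random


def sample_gaussian_2d(mean, cov, n, rng):
    """Draw n points from a 2-D gaussian with the given mean and covariance."""
    (a, b), (_, c) = cov  # cov = [[a, b], [b, c]], assumed symmetric positive definite
    # Cholesky factor L (lower triangular, L L^T = cov), written out for the 2x2 case.
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(c - l21 * l21)
    points = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)  # independent unit normals
        points.append((mean[0] + l11 * z1, mean[1] + l21 * z1 + l22 * z2))
    return points


def mean_and_covariance(points):
    """Sample mean and (n - 1)-normalised sample covariance of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), [[sxx, sxy], [sxy, syy]]


def one_sigma_ellipse(cov):
    """Semi-axes and major-axis angle (radians) of the one standard deviation ellipse."""
    (a, b), (_, c) = cov
    half_trace = (a + c) / 2.0
    root = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    lam_major = half_trace + root             # larger eigenvalue of cov
    lam_minor = max(half_trace - root, 0.0)   # smaller eigenvalue, clipped against rounding
    angle = 0.5 * math.atan2(2.0 * b, a - c)  # direction of the major axis
    return math.sqrt(lam_major), math.sqrt(lam_minor), angle


if __name__ == "__main__":
    rng = random.Random(0)
    pts = sample_gaussian_2d((1.0, -2.0), [[2.0, 0.8], [0.8, 1.0]], 1000, rng)
    est_mean, est_cov = mean_and_covariance(pts)
    print("estimated mean:", est_mean)
    print("estimated covariance:", est_cov)
    print("one sigma ellipse (major, minor, angle):", one_sigma_ellipse(est_cov))
```

The ellipse is centred on the estimated mean; plotting it is left to whatever graphics library is to hand. Comparing the printed estimates with the parameters you specified, for several sample sizes, bears directly on the question about discrepancies.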
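For Exercises 2 and 3, the following sketch may help to get going. It is restricted to one dimension to keep it short, so it is an illustration under stated assumptions rather than a full answer. The function names, the variance floor and the restart scheme are all hypothetical choices. The AIC is written in the common convention 2k - 2 log L, and the second score is a common two-part approximation to stochastic complexity, (k/2) log n - log L; if the chapter's definitions are normalised differently, they should take precedence.

```python
import math
import random


def normal_pdf(x, mu, var):
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def mixture_log_likelihood(data, weights, means, variances):
    total = 0.0
    for x in data:
        density = sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
        total += math.log(density + 1e-300)  # guard against log(0)
    return total


def em_fit_1d(data, k, iters, rng):
    """One EM run for a k-term 1-D gaussian mixture from a random initialisation."""
    n = len(data)
    centre = sum(data) / n
    spread = sum((x - centre) ** 2 for x in data) / n
    weights = [1.0 / k] * k
    means = rng.sample(data, k)  # random initialisation: k data points as means
    variances = [spread] * k
    for _ in range(iters):
        # E step: responsibility of each component for each data point.
        resp = []
        for x in data:
            row = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            norm = sum(row) + 1e-300
            resp.append([r / norm for r in row])
        # M step: re-estimate weights, means and variances from the responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor is an assumption, stops a term collapsing
    return weights, means, variances, mixture_log_likelihood(data, weights, means, variances)


def best_fit(data, k, restarts, iters, rng):
    """Keep the highest-likelihood result over several random initialisations."""
    return max((em_fit_1d(data, k, iters, rng) for _ in range(restarts)),
               key=lambda fit: fit[3])


def free_parameters(k):
    return 3 * k - 1  # k means, k variances, k - 1 independent weights


def aic(log_lik, k):
    return 2.0 * free_parameters(k) - 2.0 * log_lik


def two_part_code_length(log_lik, k, n):
    return 0.5 * free_parameters(k) * math.log(n) - log_lik


if __name__ == "__main__":
    rng = random.Random(1)
    data = [rng.gauss(-5.0, 1.0) for _ in range(150)] + [rng.gauss(5.0, 1.0) for _ in range(150)]
    for k in range(1, 5):
        log_lik = best_fit(data, k, 3, 200, rng)[3]
        print(k, aic(log_lik, k), two_part_code_length(log_lik, k, len(data)))
```

Lower scores are better for both measures as written here. Running single fits with `em_fit_1d` instead of `best_fit` shows how much the answer can depend on the initialisation, which is the point of Exercise 2.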
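The bookkeeping in Exercise 5 can be sketched as follows. The sketch assumes, as an illustration only, that each gaussian component is recorded as the 5 numbers (two for its centre, three for the distinct entries of its symmetric 2x2 covariance matrix); the names `component_descriptor` and `upwrite` and the ordering of the entries are hypothetical. Six such descriptors then give a centre (5 numbers) and a symmetric 5x5 covariance matrix (15 distinct numbers), 20 numbers in all.

```python
def component_descriptor(mean, cov):
    """Flatten one 2-D gaussian into 5 numbers: centre, then distinct covariance entries."""
    return [mean[0], mean[1], cov[0][0], cov[0][1], cov[1][1]]


def upwrite(points):
    """Describe a set of d-dimensional points by its centre and covariance.

    Returns d + d(d + 1)/2 numbers: the centre followed by the upper triangle
    of the covariance matrix, row by row. The covariance is normalised by n
    here; dividing by n - 1 instead is an equally defensible choice.
    """
    n = len(points)
    d = len(points[0])
    centre = [sum(p[i] for p in points) / n for i in range(d)]
    cov = [[sum((p[i] - centre[i]) * (p[j] - centre[j]) for p in points) / n
            for j in range(d)]
           for i in range(d)]
    upper = [cov[i][j] for i in range(d) for j in range(i, d)]
    return centre + upper


if __name__ == "__main__":
    # Six made-up components standing in for the output of an EM fit to a hand.
    components = [
        component_descriptor((float(i), 2.0 * i), [[1.0 + i, 0.1 * i], [0.1 * i, 2.0]])
        for i in range(6)
    ]
    hand = upwrite(components)
    print(len(hand))  # 5 numbers for the centre plus 15 for the covariance
```

Each handprint then yields one such 20-component vector, and the left and right hand clusters are fitted to collections of these vectors.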
Mike Alder
9/19/1997