Next: Bibliography Up: Decisions: Statistical methods Previous: Summary of Chapter

Exercises

1.
Write a program to generate some random dots in a more or less gaussian distribution on the real line. Now modify it so that it generates a two-dimensional, approximately gaussian cluster, so that if I specify the mean and covariance matrix I get a suitable point set. There are good and efficient ways of doing this, but yours can be quick and dirty. Next compute the mean and covariance of the data so generated. Does it bear a passing resemblance to what you started out with? Can you account for any discrepancies? Draw the one standard deviation ellipse for your data.
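One quick and dirty route for Exercise 1 is sketched below, as a starting point rather than a model answer. It assumes a symmetric positive-definite 2x2 covariance matrix, factorises it by hand as a Cholesky product, and pushes independent standard normal pairs through the factor. The function names, the test mean and covariance, and the choice of the n - 1 divisor in the estimator are all assumptions made for illustration.

```python
import math
import random

def cholesky_2x2(cov):
    """Lower-triangular factor L with L L^T = cov, for cov = ((a, b), (b, c))."""
    (a, b), (_, c) = cov
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(c - l21 * l21)
    return l11, l21, l22

def sample_gaussian_2d(mean, cov, n, rng):
    """Draw n points as mean + L z, where z is a pair of independent N(0, 1) values."""
    l11, l21, l22 = cholesky_2x2(cov)
    points = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        points.append((mean[0] + l11 * z1, mean[1] + l21 * z1 + l22 * z2))
    return points

def mean_and_covariance(points):
    """Sample mean and sample covariance (n - 1 divisor) of a list of (x, y) pairs."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))

def one_sigma_ellipse(cov):
    """Semi-axes (major, minor) and major-axis angle in radians of the
    one standard deviation ellipse, from the eigen-decomposition of cov."""
    (a, b), (_, c) = cov
    half_trace = 0.5 * (a + c)
    radius = math.sqrt((0.5 * (a - c)) ** 2 + b * b)
    major = math.sqrt(half_trace + radius)
    minor = math.sqrt(max(half_trace - radius, 0.0))
    angle = 0.5 * math.atan2(2.0 * b, a - c)
    return major, minor, angle

# Illustrative values only; any positive-definite covariance will do.
rng = random.Random(0)
true_mean = (1.0, -2.0)
true_cov = ((2.0, 0.8), (0.8, 1.0))
pts = sample_gaussian_2d(true_mean, true_cov, 5000, rng)
est_mean, est_cov = mean_and_covariance(pts)
print(est_mean, est_cov, one_sigma_ellipse(est_cov))
```

Rerunning with a much smaller number of points makes the discrepancy between the specified and the estimated parameters easy to see.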

2.
Use the above program to generate a `gaussian hand' as in Fig. 4.1. Try to use the EM algorithm with random initialisation to find a maximum likelihood fit for six gaussians. What conclusions do you draw about the algorithm?
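For Exercise 2, a bare-bones EM loop for a two-dimensional gaussian mixture might look like the following sketch. It is not the code that accompanies the book. The random initialisation (means drawn from the data, unit covariances, equal weights), the small floor added to the covariance diagonal, and the two-cluster test data are all assumptions made here; six components and a gaussian hand can be substituted directly.

```python
import math
import random

def gauss_pdf_2d(p, mean, cov):
    """Density of a 2-D gaussian at p; cov is ((a, b), (b, c))."""
    (a, b), (_, c) = cov
    det = a * c - b * b
    dx, dy = p[0] - mean[0], p[1] - mean[1]
    # (p - mean)^T cov^{-1} (p - mean), using the explicit 2x2 inverse.
    q = (c * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))

def em_step(points, weights, means, covs, floor=1e-6):
    """One EM iteration. Returns the updated parameters and the
    log-likelihood of the parameters that were passed in."""
    k, n = len(weights), len(points)
    resp, loglik = [], 0.0
    for p in points:  # E-step: responsibility of each component for p
        dens = [weights[j] * gauss_pdf_2d(p, means[j], covs[j]) for j in range(k)]
        total = sum(dens) + 1e-300  # guard against log(0)
        loglik += math.log(total)
        resp.append([d / total for d in dens])
    new_w, new_m, new_c = [], [], []
    for j in range(k):  # M-step: responsibility-weighted mean and covariance
        nj = sum(r[j] for r in resp) + 1e-12
        mx = sum(r[j] * p[0] for r, p in zip(resp, points)) / nj
        my = sum(r[j] * p[1] for r, p in zip(resp, points)) / nj
        sxx = sum(r[j] * (p[0] - mx) ** 2 for r, p in zip(resp, points)) / nj
        syy = sum(r[j] * (p[1] - my) ** 2 for r, p in zip(resp, points)) / nj
        sxy = sum(r[j] * (p[0] - mx) * (p[1] - my) for r, p in zip(resp, points)) / nj
        new_w.append(nj / n)
        new_m.append((mx, my))
        # The floor on the diagonal is an ad hoc guard against collapse.
        new_c.append(((sxx + floor, sxy), (sxy, syy + floor)))
    return new_w, new_m, new_c, loglik

def fit_mixture(points, k, iterations, rng):
    """Random initialisation followed by a fixed number of EM steps."""
    weights = [1.0 / k] * k
    means = rng.sample(points, k)
    covs = [((1.0, 0.0), (0.0, 1.0))] * k
    history = []
    for _ in range(iterations):
        weights, means, covs, loglik = em_step(points, weights, means, covs)
        history.append(loglik)
    return weights, means, covs, history

# Illustrative data: two well-separated clusters, not a gaussian hand.
rng = random.Random(1)
data = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(200)]
data += [(rng.gauss(6.0, 1.0), rng.gauss(3.0, 0.5)) for _ in range(200)]
weights, means, covs, history = fit_mixture(data, 2, 30, rng)
print(history[0], history[-1], means)
```

Running it several times with different seeds and comparing the final log-likelihoods is one way to see how much the answer depends on where the algorithm starts.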

3.
Get someone to use your program to generate a mixture of gaussians, and then try to fit the mixture without knowing how many terms it contains. Use the AIC and the Rissanen Stochastic Complexity measures to find the optimum number of model elements. Try this with varying numbers of points and varying numbers of mixture terms in various dimensions. Do the results surprise you? Depress you?
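The bookkeeping for Exercise 3 can be sketched as follows. The AIC is taken in its usual form, twice the number of free parameters minus twice the maximised log-likelihood. For the stochastic complexity, the sketch assumes the common two-part approximation (negative log-likelihood plus half the parameter count times the log of the number of points), which may differ in detail from the form used in the chapter. The function names and the full-covariance parameter count are likewise assumptions.

```python
import math

def mixture_param_count(k, d):
    """Free parameters of a k-term gaussian mixture in d dimensions with full
    covariances: d for each mean, d(d+1)/2 for each covariance, k - 1 weights."""
    per_component = d + d * (d + 1) // 2
    return k * per_component + (k - 1)

def aic(loglik, n_params):
    """Akaike Information Criterion; smaller is better."""
    return 2.0 * n_params - 2.0 * loglik

def description_length(loglik, n_params, n_points):
    """Two-part code length in nats (an assumed approximation); smaller is better."""
    return -loglik + 0.5 * n_params * math.log(n_points)

def best_k(logliks_by_k, d, n_points, use_aic=True):
    """Pick the number of terms with the smallest score, given a dict mapping
    each candidate k to the maximised log-likelihood found for it."""
    def score(k):
        m = mixture_param_count(k, d)
        if use_aic:
            return aic(logliks_by_k[k], m)
        return description_length(logliks_by_k[k], m, n_points)
    return min(logliks_by_k, key=score)

print(mixture_param_count(6, 2))
```

The log-likelihoods fed to `best_k` would come from fitting each candidate number of terms to the same data; the two criteria penalise extra terms differently and need not agree.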

4.
Photocopy an image of a real hand by putting your palm on the photocopier, cut it out, paint it black and scan it into a tif file. Use the EM algorithm to represent the pixel array as a gaussian mixture model. Can you find an initialisation process which gives reasonable results regardless of the orientation or position of the hand in the image? Are the results worse than for the gaussian hand?
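One way to hand a pixel array to EM in Exercise 4 is to treat every dark pixel as a data point, as in the sketch below. It assumes the image has already been read into a list of rows of grey levels from 0 to 255 (reading the tif file itself is left to whatever library is to hand), and the threshold of 128 and the toy image are illustrative choices. The centroid is included because subtracting it is an obvious first step towards an initialisation that does not care where the hand sits.

```python
def dark_pixel_points(image, threshold=128):
    """Return (x, y) coordinates of pixels darker than threshold.
    x is the column index and y the row index, so y increases downwards."""
    points = []
    for y, row in enumerate(image):
        for x, grey in enumerate(row):
            if grey < threshold:
                points.append((float(x), float(y)))
    return points

def centroid(points):
    """Mean position of a non-empty list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

# A toy 4x4 'image' with a black 2x2 square, standing in for a scanned hand.
toy = [
    [255, 255, 255, 255],
    [255,   0,   0, 255],
    [255,   0,   0, 255],
    [255, 255, 255, 255],
]
toy_points = dark_pixel_points(toy)
print(toy_points, centroid(toy_points))
```

The resulting point list can be passed to any mixture-fitting routine that accepts a list of coordinate pairs.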

5.
The six quadratic forms specifying the components of a hand comprise six points in $\mathbb{R}^5$. Compute the centre and covariance matrix for these points. This gives 5 numbers for the centre and a further 15 numbers for the covariance matrix. This makes a hand into a point in $\mathbb{R}^{20}$. Repeat with a new handprint in the same location and the same orientation, and with the same handedness, i.e. stick to right hand prints, giving a set of points in $\mathbb{R}^{20}$. Now fit a gaussian to these points. Repeat by turning the paper hand over so as to reverse its `handedness'. Obtain a similar cluster for these mirror hands. Now test to see if you can classify left hands and distinguish them from right hands by looking to see which cluster they fall closest to. How badly do things go wrong if the fingers are sometimes together? (With judicious and ingenious initialisation, it is possible to get consistent fits and reliable discrimination.)
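The counting in Exercise 5 can be checked with a short sketch. It assumes each component is encoded as five numbers (the two coordinates of its centre and the three distinct entries of its symmetric 2x2 covariance), and it uses a divisor of n in the covariance; both are choices made here for illustration. The six input points are random placeholders, not measurements from a real hand.

```python
import random

def hand_descriptor(components):
    """Centre and upper-triangular covariance of a set of points in R^d,
    concatenated into one list of d + d(d+1)/2 numbers (20 when d = 5)."""
    n = len(components)
    d = len(components[0])
    centre = [sum(c[i] for c in components) / n for i in range(d)]
    cov_upper = []
    for i in range(d):
        for j in range(i, d):  # the matrix is symmetric, so keep j >= i only
            cov_upper.append(
                sum((c[i] - centre[i]) * (c[j] - centre[j]) for c in components) / n
            )
    return centre + cov_upper

# Placeholder data: six arbitrary points in R^5 standing in for six gaussians.
rng = random.Random(2)
six_components = [tuple(rng.uniform(-1.0, 1.0) for _ in range(5)) for _ in range(6)]
descriptor = hand_descriptor(six_components)
print(len(descriptor))
```

Because the centre and covariance are sums over the six points, the descriptor does not depend on the order in which the fitted gaussians happen to come out, which is convenient when EM returns them in an arbitrary order.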

6.
Repeat with paper cut-outs of aeroplane silhouettes, and see if you can distinguish between a fighter and a passenger jet. Make them about the same size, so it isn't too easy.

7.
Try gaussian mixture modelling an image of some Chinese characters. Can you use the same trick of describing the gaussians by computing gaussians for them (UpWriting) in order to tell different characters apart?

8.
Compute Zernike moments for some characters of printed English text and collect examples of the same character. Examine the clustering for variants of the characters by projecting down onto the computer screen (the programs with the book will save you some coding if you have a workstation). Compare the character `9' with a comma. Amend the Zernike moments to give a size measure.

9.
Construct a program which does printed Optical Character Recognition. You will find problems with characters joined together and characters which are disconnected; you will find limited font independence a problem. This is a lot of work but very educational, and later exercises will allow you to extend the scope of it considerably. It should be possible to get moderately good recognition results for clean text correctly oriented by several of the methods outlined in the preceding chapters.


Mike Alder
9/19/1997