
Dynamic Patterns

The above classification of learning systems into supervised and unsupervised, function fitting and clustering, although lacking in formal precision, is of some intuitive value. We are, of course, conducting a leisurely survey of the basic concepts at present, rather than getting down to the nitty-gritty computational details, because it is much easier to get the sums right when you can see what they are trying to accomplish.

The framework discussed so far, however, has concentrated on recognising things which just sit there and wait to be recognised; but many things change in time in distinctive ways. As an example, if we record the position and possibly the pressure of a stylus on a pad, we can try to work out what characters are being written when the user writes a memo to himself. This gives us a trajectory in two dimensions to classify. Or we might have an image of a butterfly and a bird captured on videotape, and wish to identify them, or, more pressingly, two kinds of aeroplane or missile to distinguish. In these cases, we have trajectories in $\mathbb{R}^2$ or possibly $\mathbb{R}^3$ as the objects to be recognised. A similar situation occurs when we recognise speech, or try to: the first thing that is done is to take the time sequence which gives the microphone output as a function of time and to perform some kind of analysis of its component frequencies, either by a hardware filter bank, by an FFT (Fast Fourier Transform) followed by some binning so as to give a software simulation of the hardware filter bank, or by relatively exotic methods such as Cepstral Coefficients or Linear Predictive Coding coefficients. All of these transform the utterance into a trajectory in some space $\mathbb{R}^n$, for n anywhere between 2 and 256. Distinguishing the word `yes' from the word `no' is then essentially similar to telling butterflies from birds, or Boeings from baseballs, on the basis of their trajectory characteristics.
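The FFT-and-binning front end just described can be sketched in a few lines. This is a minimal illustration rather than any particular recogniser's recipe: the frame length, hop size, number of bands, and sampling rate are assumptions chosen for the example, and the synthetic chirp stands in for a real utterance.

```python
import numpy as np

def signal_to_trajectory(signal, frame_len=256, hop=128, n_bins=16):
    """Turn a 1-D microphone signal into a trajectory in R^n_bins:
    slice it into overlapping frames, FFT each frame, and sum the
    power spectrum into n_bins bands -- a software 'filter bank'."""
    points = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * np.hanning(frame_len)
        power = np.abs(np.fft.rfft(frame)) ** 2
        # bin the power spectrum into n_bins roughly equal-width bands
        bands = np.array_split(power, n_bins)
        points.append([b.sum() for b in bands])
    return np.array(points)  # shape: (number of frames, n_bins)

# a one-second synthetic 'utterance' sampled at 8 kHz: a rising chirp
t = np.linspace(0, 1, 8000, endpoint=False)
utterance = np.sin(2 * np.pi * (200 + 800 * t) * t)
traj = signal_to_trajectory(utterance)
print(traj.shape)
```

Each row of `traj` is one point of the trajectory in $\mathbb{R}^{16}$; as the chirp's pitch rises, the energy migrates from the low bands to the higher ones.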

An even more primitive problem occurs when one is given a string of ASCII characters and has to assign provenance. For example, if I give you a large sample of Shakespearean text and a sample of Marlowe's writing, and then ask you to tell me which category a piece written by Bacon comes under, either or neither, then I am asking for a classification of sequences of symbols. One of the standard methods of doing Speech Recognition consists of chopping up the space of speech sounds into lumps (a process called vector quantisation in the official documents) and labelling each lump with a symbol. Then an utterance gets turned first into a trajectory through the space, and then into a sequence of symbols, as we trace which lump the trajectory is in at different times. Then we try to classify the symbol strings. This might seem, to the naive, a bizarre approach, but it might sound more impressive if we spoke of vector quantisation and Hidden Markov Models. In this form, it is more or less a staple of speech recognition, and is coming into favour in other forms of trajectory analysis.[*]
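The lumps-to-symbols step can be sketched as follows. In practice the lump centres (the codebook) would come out of a clustering procedure; here a small hand-made codebook in $\mathbb{R}^2$ is assumed purely for illustration. Each trajectory point is labelled with the symbol of the nearest centre, and consecutive repeats are collapsed so the utterance becomes a short symbol string.

```python
import numpy as np

def quantise(trajectory, codebook, symbols):
    """Label each trajectory point with the symbol of the nearest
    codebook vector (the centre of its 'lump'), collapsing runs of
    the same symbol into one occurrence."""
    out = []
    for point in trajectory:
        dists = np.linalg.norm(codebook - point, axis=1)
        sym = symbols[int(np.argmin(dists))]
        if not out or out[-1] != sym:
            out.append(sym)
    return "".join(out)

# an illustrative codebook of four lump centres in R^2
codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
symbols = "abcd"
traj = np.array([[0.1, 0.0], [0.9, 0.1], [0.8, 0.9], [0.1, 1.1]])
print(quantise(traj, codebook, symbols))  # -> abdc
```

Classifying the resulting strings, rather than the raw trajectories, is what makes Hidden Markov Model machinery applicable.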

The classification of trajectories, either in $\mathbb{R}^n$ or in some discrete alphabet space, will therefore also preoccupy us at later stages. Much work has been done on these in various areas: engineers wanting to clean up signals have developed adaptive filters which have to learn properties of the signal as the signal is transmitted, and statisticians and physicists have studied ways to clean up dirty pictures. Bayesian methods of updating models as data is acquired look very like skeletal models for learning, and we shall be interested in the extent to which we can use these ideas, because learning and adaptation are very much things that brains do, and are a part of getting better at recognising, classifying and, in the case of trajectories, predicting.
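The simplest instance of Bayesian updating as a skeletal learning model is worth seeing concretely. The sketch below, chosen for illustration and not drawn from the text, tracks a Bernoulli source with a Beta prior: each observation nudges the two counts, and the posterior mean is the current estimate of the success probability.

```python
from fractions import Fraction

def update(alpha, beta, observation):
    """One Bayesian update for a Bernoulli source under a
    Beta(alpha, beta) prior: a success bumps alpha, a failure
    bumps beta (the conjugate-prior counting rule)."""
    return (alpha + 1, beta) if observation else (alpha, beta + 1)

# start from the uniform Beta(1, 1) prior and learn one datum at a time
alpha, beta = 1, 1
for obs in [1, 1, 0, 1, 1]:
    alpha, beta = update(alpha, beta, obs)

# posterior mean estimate of the success probability after five observations
print(Fraction(alpha, alpha + beta))  # -> 5/7
```

The model literally gets better at predicting as data arrives, which is the sense in which Bayesian updating resembles learning.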


Mike Alder
9/19/1997