Michael D. Alder
September 19, 1997
Preface
Automation, the use of robots in industry, has not progressed with the speed that many had hoped it would. The forecasts of twenty years ago are looking fairly silly today: the fact that they were produced largely by journalists for the benefit of boardrooms of accountants and MBA's may have something to do with this, but the question of why so little has been accomplished remains.
The problems were, of course, harder than they looked to naive optimists. Robots have been built that can move around on wheels or legs, and robots of a sort are used on production lines for routine tasks such as welding. But a robot that can clear the table, throw the eggshells in with the garbage and wash up the dishes, instead of washing up the eggshells and throwing the dishes in the garbage, is still some distance off.
Pattern Classification, more often called Pattern Recognition, is the primary bottleneck in the task of automation. Robots without sensors have their uses, but they are limited and dangerous. In fact one might plausibly argue that a robot without sensors isn't a real robot at all, whatever the hardware manufacturers may say. But equipping a robot with vision is easy only at the hardware level. It is neither expensive nor technically difficult to connect a camera and frame grabber board to a computer, the robot's `brain'. The problem is with the software, or more exactly with the algorithms which have to decide what the robot is looking at; the input is an array of pixels, coloured dots, and the software has to decide whether this is an image of an eggshell or a teacup. This task, which human beings can master by age eight when they decode the firing of the different light receptors in the retina of the eye, is computationally very difficult, and we have only the crudest ideas of how it is done. At the hardware level there are marked similarities between the eye and a camera (although there are differences too). At the algorithmic level, we have only a shallow understanding of the issues.
Human beings are very good at learning a large amount of information about the universe and how it can be treated; transferring this information to a program tends to be slow if not impossible.
This has been apparent for some time, and a great deal of effort has been put into research into practical methods of getting robots to recognise things in images and sounds. The Centre for Intelligent Information Processing Systems (CIIPS), of the University of Western Australia, has been working in the area for some years now. We have been particularly concerned with neural nets and applications to pattern recognition in speech and vision, because adaptive or learning methods are clearly of great potential value. The present book has been used as a postgraduate textbook at CIIPS for a Master's level course in Pattern Recognition. The contents of the book are therefore oriented largely to image and to some extent speech pattern recognition, with some concentration on neural net methods.
Students who did the course for which this book was originally written also completed units in Automatic Speech Recognition Algorithms, Engineering Mathematics (covering elements of Information Theory, Coding Theory and Linear and Multilinear Algebra), Artificial Neural Nets, Image Processing, Sensors and Instrumentation and Adaptive Filtering. There is some overlap in the material of this book and several of the other courses, but it has been kept to a minimum. Examination for the Pattern Recognition course consisted of a sequence of four micro-projects which together made up one mini-project.
Since the students for whom this book was written had a variety of backgrounds, it is intended to be accessible. Since the major obstructions to further progress seem to be fundamental, it seems pointless to try to produce a handbook of methods without analysis. Engineering works well when it is founded on some well understood scientific basis, and it turns into alchemy and witchcraft when this is not the case. The situation at present in respect of our scientific basis is that it is, like the curate's egg, good in parts. We are solidly grounded at the hardware level. On the other hand, the software tools for encoding algorithms (C, C++, MATLAB) are fairly primitive, and our grasp of what algorithms to use is negligible. I have tried therefore to focus on the ideas and the (limited) extent to which they work, since progress is likely to require new ideas, which in turn requires us to have a fair grasp of what the old ideas are. The belief that engineers as a class are not intelligent enough to grasp any ideas at all, and must be trained to jump through hoops, although common among mathematicians, is not one which attracts my sympathy.
Instead of exposing the fundamental ideas in algebra (which in these degenerate days is less intelligible than Latin) I therefore try to make them plain in English.
There is a risk in this; the ideas of science or engineering are quite different from those of philosophy (as practised in these degenerate days) or literary criticism (ditto). I don't mean they are about different things; they are different in kind. Newton wrote `Hypotheses non fingo', which literally translates as `I do not make hypotheses', which is of course quite untrue: he made up some spectacularly successful hypotheses, such as universal gravitation. The difference between the two statements is partly in the hypotheses and partly in the fingo. Newton's `hypotheses' could be tested by observation or calculation, whereas the explanations of, say, optics, given in Lucretius' De Rerum Natura were recognisably `philosophical' in the sense that they resembled the writings of many contemporary philosophers and literary critics. They may persuade, they may give the sensation of profound insight, but they do not reduce to some essentially prosaic routine for determining if they are actually true, or at least useful. Newton's did. This was one of the great philosophical advances made by Newton, and it has been underestimated by philosophers since.
The reader should therefore approach the discussion about the underlying ideas with the attitude of irreverence and disrespect that most engineers, quite properly, bring to non-technical prose. He should ask: what procedures does this lead to, and how may they be tested? We deal with high level abstractions, but they are aimed always at reducing our understanding of something prodigiously complicated to something simple. This is quite different from Literary Criticism, for example, which is aimed at concealing the awful truth that this is all for people who like reading story books and arguing about them. Good clean fun, no doubt, but not, on the face of it, a serious activity for an adult.
It is necessary to make some assumptions about the reader and only fair to say what these are. I assume, first, that the reader has a tolerably good grasp of Linear Algebra concepts. The concepts are more important than the techniques of matrix manipulation, because there are excellent packages which can do the calculations if you know what to compute. After the work of the educational theorists, it would in any case be otiose to suppose in the reader any great degree of technical competence in Mathematics, so I don't.
I assume, second, a moderate familiarity with elementary ideas of Statistics, and of contemporary mathematical notation, such as any Engineer or Scientist will have encountered in a modern undergraduate course. I assume, finally, the kind of general exposure to computing terminology familiar to anyone who can read, say, Byte magazine, and also that the reader can program in PASCAL or C.
I do not assume that the reader is of the male sex. I use the pronoun `he' in referring to the reader because it saves a letter and is the convention for the generic case. The proposition that this will depress some women readers to the point where they will give up reading and go off and become subservient housewives does not strike me as sufficiently plausible to be worth considering further. The feminist claim `I would have been an original, creative, independent mind were it not for the subversive effect of peer group pressure' has never struck me as having the slightest credibility. Anybody who wanted to understand the world and was discouraged from doing so by other people's linguistic habits couldn't have cared much. In the past, there have been difficulties put in the way of girls and women going into science and engineering, but they were more substantive than people's choice of pronoun, and they aren't there any more.
This is intended to be a happy, friendly book. It is written in an informal, one might almost say breezy, manner, which might well irritate the humourless and those possessed of a conviction that intellectual respectability entails stuffiness. I used to believe that all academic books on difficult subjects were obliged for some mysterious reason to be oppressive, but a survey of the better writers of the past has shown me that this is in fact a contemporary habit and in my view a bad one. I have therefore chosen to abandon a convention which must drive intelligent people away from Science and Engineering and Mathematics in large numbers. The book has jokes, opinionated remarks and pungent value judgements in it, which might serve to entertain readers and keep them on their toes, so to speak. They may also irritate a few who believe that the pretence that the writer has no opinions should be maintained even at the cost of making the book boring. What this convention usually accomplishes is a sort of bland porridge which discourages critical thought about assumptions, and thought about fundamental assumptions is precisely what this area badly needs. So I make no apology for the occasional provocative judgement; argue with me if you disagree. The judgements are, of course, my own; CIIPS and I are not responsible for each other.
I am most grateful to my colleagues and students at the Centre for assistance in many forms; I have shamelessly borrowed their work as examples of the principles discussed herein. I must mention Dr. Chris deSilva with whom I have worked over many years, Miss Gek Lim whose energy and enthusiasm for Quadratic Neural Nets has enabled them to become demonstrably useful, and Professor Yianni Attikiouzel, director of CIIPS, without whom neither this book nor the course would have come into existence.