The belief of Minsky in the late seventies appeared to be that what was needed was an effective way of training multilayer perceptrons. The quick-fix committee nets of Nilsson were ignored for reasons that seem to be accidental. Training a single neuron by the method of Rosenblatt (and Widrow, and others) didn't appear to generalise in a way that people thought right and natural. Certainly, training the committee net was found experimentally not to work in every case, but the ad hoc fix of adding more units usually gave success. Just why was not entirely clear. And then came the idea of following the linear summation by a smooth approximation to the step function that had been used up to this point. This allowed one to use a little elementary calculus, together with a few minor fudges, to train nets of arbitrary complexity. Actually, what it really did was to give a rationale for doing what was being done anyway, but such is the power of myth that a belief was born that the new neural nets were the answer to the prayers that the Artificial Intelligentsia muttered to their heathen and doubtless robotic gods. The triumph of hope over experience seems to be an enduring feature of human existence.
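To make the point concrete, here is a minimal sketch, not a reconstruction of anybody's historical program: it replaces the hard step function by the smooth sigmoid 1/(1 + exp(-s)), whose derivative exists everywhere, so that the elementary calculus just mentioned yields a gradient-descent update rule for a single unit. The learning rate, number of passes and the AND example are arbitrary choices made purely for illustration.

```python
import math

def step(s):
    # The original hard-limiting unit: fires iff the weighted sum exceeds zero.
    # Shown only for contrast; it has no useful derivative to descend along.
    return 1.0 if s > 0 else 0.0

def sigmoid(s):
    # A smooth approximation to the step; differentiable everywhere.
    return 1.0 / (1.0 + math.exp(-s))

def train_single_unit(samples, lr=1.0, epochs=5000):
    # Gradient descent on squared error for one sigmoidal unit.
    # samples: list of (input_vector, target) pairs with targets 0 or 1.
    n = len(samples[0][0])
    w = [0.0] * n          # weights
    b = 0.0                # bias (the threshold with its sign reversed)
    for _ in range(epochs):
        for x, t in samples:
            s = sum(wi * xi for wi, xi in zip(w, x)) + b
            y = sigmoid(s)
            # dE/ds for E = (y - t)^2 / 2, using sigmoid'(s) = y(1 - y)
            delta = (y - t) * y * (1.0 - y)
            w = [wi - lr * delta * xi for wi, xi in zip(w, x)]
            b -= lr * delta
    return w, b

# Illustration: learn the AND function of two binary inputs.
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train_single_unit(data)
for x, t in data:
    s = sum(wi * xi for wi, xi in zip(w, x)) + b
    print(x, t, round(sigmoid(s), 2))
```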
The Back-Propagation algorithm then allows one to train multilayer perceptrons, the species of Neural Net to which you were introduced in the first chapter. They don't always converge to a solution, but it is found experimentally that adding a few extra units often works, just as with committee nets. This is not a coincidence. It is also found that the net can learn only rather simple patterns in a reasonable time, and that huge training times are needed when the data set is at all complicated. I shall discuss the algorithm a little later, and explain why it behaves as it does.
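For the impatient reader, here is a minimal sketch of back-propagation under the same assumptions as the previous fragment: one hidden layer of sigmoidal units, a single sigmoidal output, squared error and plain gradient descent, with XOR (a problem no single unit can learn) as the data set. The number of hidden units, the learning rate, the number of passes and the random seed are illustrative choices only; and, as remarked above, if training stalls in a poor minimum, adding a hidden unit or two often rescues it.

```python
import math, random

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def train_mlp(samples, n_hidden=3, lr=0.5, epochs=20000, seed=0):
    # One hidden layer of sigmoidal units and one sigmoidal output unit,
    # trained by gradient descent on squared error (back-propagation).
    random.seed(seed)
    n_in = len(samples[0][0])
    W1 = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [random.uniform(-1, 1) for _ in range(n_hidden)]
    W2 = [random.uniform(-1, 1) for _ in range(n_hidden)]
    b2 = random.uniform(-1, 1)
    for _ in range(epochs):
        for x, t in samples:
            # Forward pass.
            h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                 for row, b in zip(W1, b1)]
            y = sigmoid(sum(w * hi for w, hi in zip(W2, h)) + b2)
            # Backward pass: push the error derivative back through each layer.
            d_out = (y - t) * y * (1.0 - y)
            d_hid = [d_out * W2[j] * h[j] * (1.0 - h[j]) for j in range(n_hidden)]
            # Weight updates.
            W2 = [W2[j] - lr * d_out * h[j] for j in range(n_hidden)]
            b2 -= lr * d_out
            for j in range(n_hidden):
                W1[j] = [W1[j][i] - lr * d_hid[j] * x[i] for i in range(n_in)]
                b1[j] -= lr * d_hid[j]
    return W1, b1, W2, b2

def predict(x, W1, b1, W2, b2):
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    return sigmoid(sum(w * hi for w, hi in zip(W2, h)) + b2)

# XOR: not learnable by a single unit, but a small MLP manages it.
xor = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
params = train_mlp(xor)
for x, t in xor:
    print(x, t, round(predict(x, *params), 2))
```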
At the same time, the neurophysiologists were finding out what was being done and were less than enthusiastic about MLPs as models of anything inside anybody's head. The evidence against the core idea of neurons as adaptive weighted majority gates (which fired provided that the weighted sum of inputs exceeded some threshold) has been accumulating. While some may possibly work in this way, there seems to be much more than that going on in brains.
Other types of neural net were investigated around this time, and a number have survived. I shall deal with them rather briefly in later sections. Among them, the Kohonen net, the Adaptive Resonance net of Grossberg, the Probabilistic Neural Nets of Specht, the Hopfield nets and various other `architectures' such as the Neocognitron are going to get some attention.
Surveying the plethora of Neural Nets at the present time gives a strong sense of having entered a menagerie, and moreover one on which no taxonomist has been consulted. I shall try to reduce the sense of anomie and focus on what kinds of things these nets accomplish, rather than on drawing wiring diagrams, which appear to be the main conceptual tool offered to the perplexed student.
My collaborator of many years, Dr. Chris deSilva, has written an elegant account of certain of these nets, and has kindly given me permission to include a substantial article of his explaining how some of them work.