Next: Dynamic Patterns Up: Basic Concepts Previous: CART et al

Clustering: supervised v unsupervised learning

The reflective reader will, perhaps, have been turning to the not so silly question of how he or she tells men from women. Or, to put it another way: looking at the clusters of points in Fig. 1.2, suppose that instead of labelling one set as X points for males and the other as O points for females, we had just drawn unlabelled points as little black dots. Could a program have looked at the data and seen that there are two populations? It seems reasonable to suppose that the reader, with somewhat different sensory apparatus, has some internal way of representing human beings via neurons, or in wetware as we say in the trade, and that this shares with $\mathbb{R}^n$ the capacity for coding resemblance or similarity in terms of proximity. The dimension may then be a little higher for you, dear reader, but most of the problem survives.

There is, undeniably, a certain amount of overlap between the two clusters when we measure the weight and height, and indeed there would be some overlap on any system of measurement. It is still the case, however (despite the lobbying of those who for various reasons prefer not to be assigned an unambiguous sex), that it is possible to find measurement processes which do lead to fairly well defined clusters: a count of X and Y chromosomes, for example. Given two such clusters, the existence of the categories more or less follows.

One of the reasons for being unhappy with the neural net model we have described is that it is crucially dependent on the classification being given by some external agent. It would be nice if we had a system which could actually learn the fact that women and men are distinguishable categories by simply noticing that the data form two clusters.
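
To make the idea of simply noticing that the data form two clusters a little more concrete, here is a minimal sketch in Python. It uses the standard k-means procedure with two centres, which is one possible way of finding clusters and not necessarily the method developed later in this book. The height and weight figures are invented for illustration, and they are far better separated than real measurements would be; the function name `two_means` and the choice of starting centres are likewise assumptions of the sketch.

```python
# Invented, unlabelled (height in cm, weight in kg) measurements.
# No point carries an X or an O: the program sees only black dots.
points = [
    (160, 55), (178, 80), (162, 58), (182, 85), (158, 52),
    (175, 78), (165, 60), (180, 82), (163, 57), (185, 90),
]


def two_means(points, iterations=10):
    """Split unlabelled 2-D points into two groups (k-means with k = 2)."""
    # Assumed starting guess: the points with the smallest and largest height.
    centres = [min(points), max(points)]
    groups = ([], [])
    for _ in range(iterations):
        groups = ([], [])
        for p in points:
            # Squared Euclidean distance from p to each of the two centres.
            d = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centres]
            groups[0 if d[0] <= d[1] else 1].append(p)
        # Move each centre to the mean of its group (keep it if the group is empty).
        centres = [
            (sum(q[0] for q in g) / len(g), sum(q[1] for q in g) / len(g)) if g else c
            for g, c in zip(groups, centres)
        ]
    return centres, groups


centres, groups = two_means(points)
print("cluster centres:", centres)
print("cluster sizes:", len(groups[0]), len(groups[1]))
```

On this toy data the ten points split five and five, and nobody had to tell the program which points were which. Note that the sketch treats a centimetre and a kilogram as equally important when measuring proximity, which is itself an assumption about the measurement process.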

It has to be assumed that at some point human beings learn to classify without any immediate feedback from an external agent. Of course, kicking a neuron when it is wrong about a classification, and kicking a dog when it digs up the roses have much the same effect; the devastation is classified as `bad' in the mind of the dog, or at least, the owner of the rose bush hopes so. It is not too far fetched to imagine that there are some pain receptors which act on neurons responsible for classifying experiences as `good' and `bad' in a manner essentially similar to what happens in neural nets. But most learning is a more subtle matter than this; a sea anemone `learns' when the tide is coming in without getting a kick in the metaphorical pants. Mistaking a man for a woman or vice versa might be embarrassing, but it is hard to believe you learnt the difference between men and women by making many errors and then reducing the average embarrassment, which is how an artificial neuron of the classical type would do it.

Learning a value, a +1 or -1 for each point in some set of points, and then being asked to produce a rule or algorithm for guessing the value at some new point, is most usefully thought of as fitting a function defined on a space when we know its values only on a finite data set. The function in this case can take only binary values, $\pm 1$, but this is not in principle different from drawing a smooth curve (or surface) through a set of points. The diagram in Fig. 1.11 makes it clear, in one dimension, that we are just fitting a function to data.

Figure 1.11: A one dimensional pattern recognition problem solved by a neural net.
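
The function fitting view can be sketched in a few lines of Python. In one dimension, the simplest $\pm 1$ valued function is a step: one value on one side of a threshold $t$ and the other value beyond it. The sketch below finds such a step by brute force search over candidate thresholds, which is a deliberately crude stand-in for the training of a neural net and is not the procedure used to produce the figure. The data, the name `fit_threshold` and the orientation parameter `s` are all assumptions made for the illustration.

```python
# Invented one dimensional data: each point x comes with a value of +1 or -1.
xs = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
labels = [-1, -1, -1, -1, +1, +1, +1, +1]


def fit_threshold(xs, labels):
    """Fit a step function f(x) = s if x > t else -s to +1/-1 labelled points."""
    pts = sorted(xs)
    # Candidate thresholds: one below all the data, then the midpoints
    # between consecutive sorted data points.
    candidates = [pts[0] - 1.0] + [(a + b) / 2 for a, b in zip(pts, pts[1:])]
    best = None
    for t in candidates:
        for s in (+1, -1):  # s decides which side of t is given the value +1
            errors = sum(1 for x, y in zip(xs, labels) if (s if x > t else -s) != y)
            if best is None or errors < best[0]:
                best = (errors, t, s)
    errors, t, s = best
    return (lambda x: s if x > t else -s), t, s, errors


f, t, s, errors = fit_threshold(xs, labels)
print("threshold:", t, "orientation:", s, "training errors:", errors)
print("guess at the new point 2.2:", f(2.2))
print("guess at the new point 2.9:", f(2.9))
```

The fitted `f` is defined everywhere on the line, not just at the eight data points, which is the whole point: having seen the values on a finite set, we guess the value at a new point by evaluating the fitted function there.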

This perspective can be applied to the use of nets in control theory applications, where they are used to learn functions which are not just binary valued.

So Supervised Learning is function fitting, while Unsupervised Learning is cluster finding. Both are important things to be able to do, and we shall be investigating them throughout this book.


Mike Alder
9/19/1997