
Telling the guys from the gals

Suppose we take a large number of men and measure their height and weight. We plot the results of our measurements by putting a point on a piece of paper for each man measured. I have marked a cross on Fig. 1.2 for each man, in such a position that you can easily read off his weight and height. Well, you could if I had been so thoughtful as to provide gradations and units. Now I take a large collection of women and perform the same measurements, and I plot the results by marking a circle for each woman.


 
Figure 1.2: X is male, O is female, what is P?

The results indicated in Fig. 1.2 are plausible in that they show that on average men are taller and heavier than women, although there is a certain amount of overlap between the two samples. The diagram also shows that tall people tend to be heavier than short people, which seems reasonable. Now suppose someone gives us the point P and assures us that it was obtained by making the usual measurements, in the same order, on some person not previously measured. The question is, do we think that this last person, marked by a P, is male or female?

There are, of course, better ways of telling, but they involve taking other measurements; it would be indelicate to specify what crosses my mind, and I leave it to the reader to devise something suitable. If this is all the data we have to go on, and we have to make a guess, what guess would be most sensible?

If instead of only two classes we had a larger number, perhaps also having horses and giraffes to distinguish, the problem would not be essentially different. If instead of working in dimension 2 as a result of choosing to measure only two attributes of the objects (men, women and maybe horses and giraffes), we were in dimension 12 as a result of choosing to measure twelve attributes, again the problem would be essentially the same, although it would be impracticable to draw a picture. I say it would be essentially the same; well, it would be very different for a human being, who would have to make sense of lots of columns of numbers, but a computer program hasn't got eyes. The computer program has to be an embodiment of a set of rules which operates on a collection of columns of numbers, and the length of the column is not likely to be particularly vital. Any algorithm which will solve the two-class, two-dimensional case should also solve the k-class, n-dimensional case with only minor modifications.
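To make that last claim concrete, here is a minimal sketch of one very simple rule: assign P to the class whose average point (centroid) is closest. The text has not committed to any particular algorithm at this stage, so nearest-centroid is an assumption chosen purely for illustration, as are the function names `centroids` and `classify` and the height and weight figures, which are invented and not read off Fig. 1.2. Notice that nothing in the code mentions the number of classes or the number of measurements.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def centroids(samples):
    """Return {label: mean point} for a list of (label, point) pairs."""
    sums, counts = {}, {}
    for label, point in samples:
        acc = sums.setdefault(label, [0.0] * len(point))
        for i, x in enumerate(point):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {
        label: tuple(s / counts[label] for s in acc)
        for label, acc in sums.items()
    }


def classify(samples, p):
    """Guess the label of p: the class whose centroid is nearest to p."""
    cs = centroids(samples)
    return min(cs, key=lambda label: dist(cs[label], p))


# Invented (height in cm, weight in kg) measurements, for illustration only.
people = [
    ("male", (180, 82)), ("male", (175, 78)), ("male", (185, 90)),
    ("female", (162, 58)), ("female", (168, 63)), ("female", (158, 55)),
]
guess = classify(people, (170, 66))

# The same functions, unchanged, with three classes and three measurements.
animals = [
    ("a", (0, 0, 0)), ("a", (1, 1, 1)),
    ("b", (10, 10, 10)),
    ("c", (-10, 0, 5)),
]
guess3 = classify(animals, (9, 9, 9))
```

The rule is crude: straight-line distance depends on the units chosen for each measurement, and it ignores how spread out each class is. It is offered only to show that going from two classes in dimension 2 to k classes in dimension n required no change to the program.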


Mike Alder
9/19/1997