next up previous contents
Next: Neural Net Methods (Old Up: Decisions, decisions.. Previous: Decisions, decisions..

Metric Methods

One of the simplest methods is to find the closest point of the labelled set of points to the new point P, and assign to the new point whatever category the closest point has. So if (for the data set of guys and gals) the nearest point to P is an X, then we conclude that P should be a man. If a rationale is needed, we could argue that the measurement process is intended to extract important properties of the objects, and if we come out with values for the readings which are close together, then the objects must be similar. And if they are similar in respect of the measurements we have made, they ought, in any reasonable universe, to be similar in respect of the category they belong to as well. Of course it isn't clear that the universe we actually live in is the least bit reasonable.

Such a rationale may help us devise the algorithm in the first place, but it may also allow us to persuade ourselves that the method is a good one. Such means of persuasion are unscientific and frowned upon in all the best circles. There are better ways of ensuring that it is a good method, namely testing to see how often it gives the right answer. It is noteworthy that no matter how appealing to the intuitions a method may be, there is an ultimate test which involves trying it out on real data. Of course, rationales tend to be very appealing to the intuitions of the person who thought of them, and less appealing to others. It is, however, worth reflecting on rationales, particularly after having looked at a bit more data; sometimes one can see the flaws in the rationales, and devise alternative methods.

The metric method is easy to implement in complete generality for n measurements, we just have to go through the whole list of points where we know the category and compute the distance from the given point P. How do we do this? Well, the usual Euclidean distance $d({\bf x},{\bf y})$ between the vectors % latex2html id marker 667
$ \left(\begin{array}
{c} x^1 \\  x^2 \\  . \\  x^n 
 \end{array} \right) $ and % latex2html id marker 669
$ \left(\begin{array}
{c} y^1 \\  y^2 \\  . \\  y^n 
 \end{array} \right) $ is simply $ \sqrt{\sum_{i=1}^n (x^i - y^i)^2} $, which is easy to compute. Now we find that point x for which this distance from the new point P is a minimum. All that remains is to note its category. If anyone wants to know where the formula for the euclidean distance comes from in higher dimensions, it's a definition, and it gives the right answers in dimensions one, two and three. You have a better idea?


 
Figure 1.3: X is male, O is female, what is this P?
\begin{figure}
\vspace{8cm}
\special {psfile=patrecfig3.ps}\end{figure}

Reflection suggests some drawbacks. One is that we need to compute a comparison with all the data points in the set. This could be an awful lot. Another is, what do we do in a case such as Fig.1.3., above, where the new point P doesn't look as if it belongs to either category? An algorithm which returns `Haven't the faintest idea, probably neither' when asked if the P of Fig.1.3. is a man or a woman would have some advantages, but the metric method needs some modification before it can do this. It is true that P is a long way from the closest point of either category, but how long is a long way?

Exercise: Is P in Fig.1.3 likely to be (a) a kangaroo or (b) a pole vaulter's pole?

A more subtle objection would occur only to a geometer, a species of the genus Mathematician. It is this: why should you use the euclidean distance? What is so reasonable about taking the square root of the sum of the squares of the differences of the co-ordinates? Sure, it is what you are used to in two dimensions and three, but so what? If you had the data of Fig.1.4. for example, do you believe that the point P is, on the whole, `closer to' the X's or the O's?


 
Figure 1.4: Which is P closer to, the X's or the O's?
\begin{figure}
\vspace{8cm}
\special {psfile=patrecfig4.ps}\end{figure}

There is a case for saying that the X-axis in Fig.1.4. has been stretched out by something like three times the Y-axis, and so when measuring the distance, we should not give the X and Y coordinates the same weight. If we were to divide the X co-ordinates by 3, then P would be closer to the X's, whereas using the euclidean distance it is closer to the O's.

It can come as a nasty shock to the engineer to realise that there are an awful lot of different metrics (ways of measuring distances) on ${\fam11\tenbbb R}^2$, and the old, easy one isn't necessarily the right one to use. But it should be obvious that if we measure weight in kilograms and height in centimetres, we shall get different answers from those we would obtain if we measured height in metres and weight in grams. Changing the measuring units in the above example changes the metric, a matter of very practical importance in real life. There are much more complicated cases than this which occur in practice, and we shall meet some in later sections, when we go over these ideas in detail.

Remember that this is only the mickey-mouse, simple and easy discussion on the core ideas and that the technicalities will come a little later.


next up previous contents
Next: Neural Net Methods (Old Up: Decisions, decisions.. Previous: Decisions, decisions..
Mike Alder
9/19/1997