
Backtrack

This chapter started with the observation that segmentation is a fundamental issue which ought properly to be part of the classification process, and a little thought shows that the two matters can indeed be incorporated in a coherent way. If we extract structure from a postcode in which the stroke of the five is distinct from the body, then we can easily see that there will be a stroke cluster corresponding to the digit `5', provided only that lots of 5's have been offered to the system. Moreover, cases where the connection is made and cases where it is not quite closed will simply be variants of the same cluster of three strokes. Since the stroke triple consisting of all the strokes of the digit `5' will occur with relatively high frequency, we will get a cluster of these triples.

In effect, then, the system under discussion will automatically aggregate strokes, hierarchically and under the accumulation of data, into the sets which should become segments. We have disposed of the segmentation problem.
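To make the mechanism concrete, here is a minimal sketch in Python of frequency-driven chunking. It assumes each digit image has already been reduced to a tuple of stroke-cluster labels; the labels (`bar', `hook', `bowl' and so on), the counts and the threshold are entirely hypothetical.

from collections import Counter
from itertools import combinations

# Hypothetical postcode data: each entry is the set of stroke clusters
# found in one digit image ('bar', 'hook', 'bowl' together make a '5').
digits = [("bar", "hook", "bowl")] * 40 + [("loop", "tail")] * 25

# Count how often each triple of stroke clusters co-occurs in a digit.
triples = Counter()
for strokes in digits:
    for triple in combinations(sorted(strokes), 3):
        triples[triple] += 1

# Frequent triples become segments at the next level of the hierarchy.
segments = [t for t, n in triples.items() if n > 10]
print(segments)    # [('bar', 'bowl', 'hook')]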

Earlier, two simple data sets were exhibited which all existing classification systems get wrong, and it was asked whether we could find a generic `fix' which would do better than the naive clustering methods discussed thus far under the heading of static pattern classification.

The answer is gratifyingly positive. Consider first the chess-board pattern. At the lowest level of aggregation, we find that neighbourhoods in a small ball correlate with adjacent small balls, provided the diameter of each ball is small compared with the side of a square. The points are technically in $\mathbb{R}^3$, the last component being +1 or -1 according to the colour or type of the point. The entropic chunking will thus collect all the points of a square together for the next level of UpWrite.
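The following sketch shows this first level on synthetic data, assuming a 4 by 4 board of unit squares; the entropic chunking is approximated by grouping points according to the square they fall in, which is what the chunking reduces to once the balls are small relative to a square.

import numpy as np

rng = np.random.default_rng(0)

# Sample points over a 4x4 chess-board; the third coordinate is +1 or -1
# according to the colour of the square the point lands in.
xy = rng.uniform(0.0, 4.0, size=(2000, 2))
colour = np.where((np.floor(xy[:, 0]) + np.floor(xy[:, 1])) % 2 == 0, 1.0, -1.0)
points = np.column_stack([xy, colour])              # points in R^3

# First-level chunks: all points falling in the same square are grouped,
# since small neighbouring balls there agree in their third coordinate.
square = (np.floor(xy[:, 0]) + 4 * np.floor(xy[:, 1])).astype(int)
chunks = {i: points[square == i] for i in np.unique(square)}

# Each chunk is pure in colour, so it can be passed up as one segment.
assert all(len(np.unique(c[:, 2])) == 1 for c in chunks.values())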

The next level now represents the alternating pattern. In a suitable space of square regions, perhaps crudely represented by fourth order moments of each square in $\mathbb{R}^2$, we get alternating blocks of points from the base level. Taking any one such point and looking at a neighbourhood of it, we get very tight clustering as we go across the image: a point which is a cross keeps being surrounded along ranks and files by points of the opposite category, and along diagonals, if the neighbourhood is taken slightly bigger, by points of the same category. This structure is in effect learnt from other parts of the same data set. Using such a neighbourhood descriptor, the entire image is decomposed into iterates of this primitive. The system learns the process by which the image appears to have been generated, in the same way as the bag structure ABCDABCDABCD was learnt in the last chapter.
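The neighbourhood descriptor can be verified with a standalone sketch, assuming the first level has already reduced each square to its centroid and mean colour, a cruder code than the fourth order moments suggested above.

import itertools
import numpy as np

# Second-level points: one per square of a 4x4 board, as (x, y, colour).
upwrite = np.array([[i + 0.5, j + 0.5, 1.0 if (i + j) % 2 == 0 else -1.0]
                    for i, j in itertools.product(range(4), range(4))])

# The learnt neighbourhood relation: rank-and-file neighbours alternate
# in colour, while diagonal neighbours agree.
for (cx, cy, col), (ox, oy, ocol) in itertools.product(upwrite, repeat=2):
    dx, dy = abs(ox - cx), abs(oy - cy)
    if sorted([dx, dy]) == [0.0, 1.0]:      # rank or file neighbour
        assert col * ocol < 0               # opposite colour
    elif (dx, dy) == (1.0, 1.0):            # diagonal neighbour
        assert col * ocol > 0               # same colour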

Classifying a point in a region is done by DownWriting from the higher level description, which will fill the empty square with points that, although not present in fact, are deduced from the higher level structure. Now we can determine the category of the unknown point by looking at its neighbours: not the actual neighbours, but the virtual neighbours generated by the DownWrite. This is analogous to identifying the word profes5or as being close to the word professor, which has been seen before, DownWriting the word to its spelling (a rather trivial operation), and then identifying the symbol between s and o.
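A minimal sketch of this virtual-neighbour classification, assuming the top-level description of the chess-board has been learnt as the parity rule over unit squares; the DownWrite below simply regenerates points from that rule.

import numpy as np

rng = np.random.default_rng(1)

def downwrite_square(i, j, n=50):
    # DownWrite: generate virtual points for square (i, j), coloured
    # according to the learnt parity rule.
    xy = rng.uniform(0.0, 1.0, size=(n, 2)) + np.array([i, j])
    colour = 1.0 if (i + j) % 2 == 0 else -1.0
    return np.column_stack([xy, np.full(n, colour)])

def classify(query_xy, k=10):
    # Fill the query's square with virtual neighbours, then take a
    # majority vote over the k nearest of them.
    i, j = int(query_xy[0]), int(query_xy[1])
    virtual = downwrite_square(i, j)
    d = np.linalg.norm(virtual[:, :2] - query_xy, axis=1)
    nearest = virtual[np.argsort(d)[:k], 2]
    return 1.0 if nearest.mean() > 0 else -1.0

print(classify(np.array([2.3, 1.7])))   # square (2, 1) has odd parity: -1.0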

We are using the same machinery in both cases, merely with different interpretations of the terms `symbol' and `concatenate'. This point requires some mathematical sophistication not required of the reader of this book.

In the same way, the double spiral data set is a set of points in $\mathbb{R}^3$, the first two coordinates telling you where the datum is and the last telling you whether it is a nought or a cross. The local structure is quite different from that of the chess-board pattern, but the same machinery works. The set of noughts is locally described by a set of ellipses, and local groups of, say, three ellipses may be represented by giving their relative locations, or this information may be inferred from the canonical UpWrite via moments of the points in $\mathbb{R}^6$. The relationship will in fact change slightly, but predictably, as the data gets further from the centre; extrapolation is therefore possible. This is done automatically by simply inferring the structural `rule' and DownWriting to get the distribution of points which the top level coding will generate. Again, classification is accomplished by the same machinery: we look at the virtual neighbouring data obtained from the DownWrite.
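A minimal sketch of the local-ellipse coding for one arm of the spiral, assuming each chunk is a fixed window of consecutive points and the ellipse is fitted by first and second moments; the window size and noise level are arbitrary choices.

import numpy as np

rng = np.random.default_rng(2)

# One spiral arm of noughts: points along an Archimedean spiral, jittered.
t = np.linspace(0.5, 3.0 * np.pi, 300)
arm = np.column_stack([t * np.cos(t), t * np.sin(t)])
arm += rng.normal(scale=0.05, size=arm.shape)

# UpWrite: slide a window along the arm and code each chunk as an
# ellipse, i.e. its centroid and second-moment (covariance) matrix.
ellipses = [(arm[s:s + 30].mean(axis=0), np.cov(arm[s:s + 30].T))
            for s in range(0, len(arm) - 30, 30)]

# Consecutive ellipses change location and orientation slowly and
# predictably along the arm; inferring that rule and DownWriting it
# generates virtual points beyond the observed data, which is what
# licenses extrapolation and virtual-neighbour classification.
for (m0, _), (m1, _) in zip(ellipses, ellipses[1:]):
    print(np.round(np.linalg.norm(m1 - m0), 2))   # step between centroids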

