If we normalise into, say, an 11 by 9 array, we
can rewrite the characters into standard form.
Then we could, if desperate for ideas, take each
character as a point in $\mathbb{R}^{99}$. This is not
a good idea, although it has been done. The main
reason is that one
extra pixel in the wrong place can give vectors
which are totally unrelated: a single pixel
can shift a vertical line one pixel to the right
with no trouble at all. It would be nice
if a horizontal shift of a character by one pixel
column were to have minimal effect on the
placing of the point in $\mathbb{R}^{99}$. Another reason
it is a bad idea is that the dimension of the
space should be kept as small as possible. The
reasons for this are subtle; basically we want
to use our data to estimate model parameters,
and the bigger the ratio of the size of the data
set to the number of parameters we have to estimate,
the more we can feel we have genuinely
modelled something. When, as with some neural
net enthusiasts, there are more parameters than
data, we can only groan and shake our heads. It
is a sign that someone's
education has been sadly neglected. How much faith
would you have in the neural net B of
Fig. 1.7 being able to do a decent job of
predicting new points, based as it is on only
two data points? How would you expect it to perform
as more points are obtained? How much faith would
you have in the rightness of something like B
if the dimension were 99 instead of 2, more faith
or less?
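
The sensitivity to a one-pixel shift can be checked directly. The
following is a minimal sketch, not anything from the text: it assumes
binary pixels, a one-pixel-wide vertical stroke, and row-by-row
flattening of the 11 by 9 array, and the names vertical_stroke, flatten,
dot and squared_distance are made up for the illustration.

```python
# Hypothetical illustration: an 11-row by 9-column binary character grid,
# flattened row by row into a vector with 99 components.
ROWS, COLS = 11, 9


def vertical_stroke(col):
    """Return a ROWS x COLS grid holding a one-pixel-wide vertical line."""
    return [[1 if c == col else 0 for c in range(COLS)] for _ in range(ROWS)]


def flatten(grid):
    """Read the grid row by row into a single list of ROWS * COLS numbers."""
    return [pixel for row in grid for pixel in row]


def dot(u, v):
    """Ordinary inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


a = flatten(vertical_stroke(4))  # stroke in the middle column
b = flatten(vertical_stroke(5))  # the same stroke, one column to the right
blank = flatten([[0] * COLS for _ in range(ROWS)])  # empty image

print(len(a))                      # 99
print(dot(a, b))                   # 0
print(squared_distance(a, b))      # 22
print(squared_distance(a, blank))  # 11
```

Under these assumptions the two strokes are orthogonal as vectors, and
the shifted stroke lies further from the original than a completely
blank image does, which is not what one wants from a representation of
characters.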
In the first chapter I remarked that the image
on my computer screen could be regarded as a
point in a space of dimension nearly four million, and
that I didn't assert that this was a good idea.
Suppose you wanted to write a program which could
distinguish automatically between television
commercials and monster movies. Doing this by trying
to classify ten-second slices of images using
a neural net is something which might be contemplated
by a certain sort of depraved mind. It
would have to be pretty depraved, though. I shall
return to this issue later when discussing
model complexity.
Mike Alder
9/19/1997