In order to focus ideas, I shall consider again the simple case of triangles which are all equilateral and which differ only in the scale. I shall suppose that we have a standard such triangle with the centroid at the origin, and that by taking a range of scalars between 0.5 and 1.5 I have shrunk or enlarged the basic triangle to give a set of images, one for each scalar.
If we UpWrite the triangles to points in $\mathbb{R}^n$ in the standard way, we see that the set will constitute an embedding of the interval [0.5,1.5] in $\mathbb{R}^n$. The line will be curved rather than straight, and in practice we will get a finite point set on or close to the idealised embedding.
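The following is a minimal sketch of such a curve, not the UpWrite itself. It assumes, purely for illustration, that a filled triangle is UpWritten to its area and its three second-order area moments about the origin; the names `equilateral_triangle` and `upwrite` are hypothetical. Under that assumption the area grows as the square of the scalar and the second-order moments as its fourth power, so the sampled points lie on a curved path rather than a straight one.

```python
import math

def equilateral_triangle(scale=1.0):
    """Counter-clockwise vertices of an equilateral triangle with centroid at the origin."""
    angles = [math.pi / 2 + k * 2 * math.pi / 3 for k in range(3)]
    return [(scale * math.cos(a), scale * math.sin(a)) for a in angles]

def upwrite(polygon):
    """Assumed stand-in for the UpWrite: (area, m_xx, m_xy, m_yy) of a filled polygon.

    m_xx, m_xy, m_yy are the integrals of x*x, x*y and y*y over the polygon,
    computed edge by edge for counter-clockwise vertices.
    """
    area = m_xx = m_xy = m_yy = 0.0
    n = len(polygon)
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area += cross / 2
        m_xx += cross * (x0 * x0 + x0 * x1 + x1 * x1) / 12
        m_yy += cross * (y0 * y0 + y0 * y1 + y1 * y1) / 12
        m_xy += cross * (x0 * y1 + 2 * x0 * y0 + 2 * x1 * y1 + x1 * y0) / 24
    return (area, m_xx, m_xy, m_yy)

# One point per scalar in [0.5, 1.5]: a finite sample of the idealised embedding.
scales = [0.5 + 0.1 * k for k in range(11)]
curve = [upwrite(equilateral_triangle(s)) for s in scales]
```

Plotting the first two coordinates of `curve` against each other should show the bend directly; a different choice of features would give a different, but in general still curved, path.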
Doing the same thing with squares and hexagons will yield two more such curves. If the size of the standard object in all three cases is similar, then the curves will lie approximately on a space of `scaled planar objects'. Other objects, such as pentagons, heptagons, octagons and circles will also lie on such a space. This space is foliated by the curves representing the Lie group action. Given a new object, it is possible in principle to estimate the scaling curve upon which it lies, and to be able to recognise a scaled version of the object which has never been seen before. In other words, we may learn the transformation.
Less ambitious undertakings have been found to work quite satisfactorily: in an application involving the recognition of aircraft silhouettes, the space of each curve was found for two aircraft, and a hyperplane fitted to each of them. Scalings from [...] to [...] of the original size were used. A new silhouette of one of the aircraft was then presented to the system, and it was classified according to which of the two hyperplanes was closest. By this means it was found possible to obtain correct classification down to about [...] of the original size, at which point resolution issues came into play. The system was capable of learning the space and performing creditable extrapolations in order to achieve a classification.
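The nearest-hyperplane idea can be sketched in a toy setting. Everything specific here is an assumption made for illustration: two rectangles stand in for the aircraft, the hypothetical `features` function (area and perimeter) stands in for the UpWrite, and the training scales 0.6 to 1.4 are arbitrary. In the plane a hyperplane is simply a line, which is fitted by total least squares along the principal axis of each sampled curve.

```python
import math

def features(width, height, scale):
    """Hypothetical 2-D stand-in for the UpWrite: (area, perimeter) of a scaled rectangle."""
    return (width * height * scale ** 2, 2 * (width + height) * scale)

def fit_line(points):
    """Total-least-squares line through 2-D points: returns (mean, unit direction)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    cxx = sum((p[0] - mx) ** 2 for p in points) / n
    cyy = sum((p[1] - my) ** 2 for p in points) / n
    cxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Orientation of the principal axis of the 2x2 covariance matrix.
    angle = 0.5 * math.atan2(2 * cxy, cxx - cyy)
    return (mx, my), (math.cos(angle), math.sin(angle))

def distance_to_line(point, line):
    """Length of the component of (point - mean) perpendicular to the line direction."""
    (mx, my), (dx, dy) = line
    return abs((point[0] - mx) * dy - (point[1] - my) * dx)

def classify(point, lines):
    """Return the name of the fitted line closest to the point."""
    return min(lines, key=lambda name: distance_to_line(point, lines[name]))

# Two stand-in objects and an arbitrary training range of scales.
shapes = {"square": (1.0, 1.0), "bar": (4.0, 0.25)}
training_scales = [0.6 + 0.1 * k for k in range(9)]
lines = {
    name: fit_line([features(w, h, s) for s in training_scales])
    for name, (w, h) in shapes.items()
}

# A scale outside the training range: classification relies on extrapolation.
label = classify(features(4.0, 0.25, 0.5), lines)
```

Because each fitted line only approximates a curved path, the extrapolation in this toy degrades as the scale moves far outside the training range, quite apart from any resolution effects in real images.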
Other Lie groups occurring naturally are the group of shifts or translations, and the group of rotations. The Cartesian product of these spaces with a scaling space gives, for each planar object, a four dimensional manifold embedded in the top level space.
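To make the count of four dimensions concrete, here is a small sketch of the combined action on a planar object given as a list of points. The function name `act` and the order of operations (scale, rotate about the origin, then shift) are assumptions chosen for illustration; the four parameters are the two shift components, the rotation angle and the scale.

```python
import math

def act(points, tx=0.0, ty=0.0, theta=0.0, scale=1.0):
    """Scale the points, rotate them about the origin by theta, then shift by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(scale * (c * x - s * y) + tx, scale * (s * x + c * y) + ty)
            for x, y in points]

unit_square = [(0.5, 0.5), (-0.5, 0.5), (-0.5, -0.5), (0.5, -0.5)]

# One point of the four dimensional family generated by the unit square.
moved = act(unit_square, tx=1.0, ty=0.0, theta=math.pi / 2, scale=2.0)
```

UpWriting `act(unit_square, ...)` over a grid of the four parameters would give a finite sample of the manifold associated with the square.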
It is of course also possible to use invariant moments, but this presupposes that we know the appropriate group. If we contemplate the general situation, we see that this may be infeasible. For example, an image of a bar code printed on a can of beans is deformed in a way which is rather harder to specify. Similarly, if we consider the transformations to a trajectory in the speech space involved in (a) breathing over the microphone, (b) enlarging the vocal tract, (c) changing sex, we see that although we have group actions on the data, and although factoring out these group actions is highly desirable, an analytic description of the action is at least difficult and generally impossible to obtain.
A symmetric object will not generally give an embedding, but will give an immersion of the group. If, for example, we rotate a square, the result of UpWriting will not embed the circle group SO(2) in the top level space, because the path traced out will recur after one quarter of a circuit. We get a circle mapped into the space, but we trace it out four times. Given noise, quantization and finite sampling, the result may be four very close lobes, each close to what is topologically a circle. This gives us a means of identifying symmetries in the object.
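A minimal sketch of reading off a symmetry in this way follows. It assumes, for illustration only, that the object is a set of vertices and that the UpWrite is the list of third- and fourth-order moments of those vertices about the origin; `upwrite`, `symmetry_order` and `regular_polygon` are hypothetical names. The object is rotated through a full circuit in equal steps and we count how often the UpWritten point returns to where it started.

```python
import math

def rotate(points, theta):
    """Rotate planar points about the origin by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def upwrite(points):
    """Assumed UpWrite: all moments sum(x**p * y**q) of the vertex set with p + q = 3 or 4."""
    return [sum(x ** p * y ** (order - p) for x, y in points)
            for order in (3, 4) for p in range(order + 1)]

def symmetry_order(points, steps=360, tol=1e-9):
    """Count the rotations, out of `steps` equal steps, that return the UpWrite to its start."""
    start = upwrite(points)
    count = 0
    for k in range(steps):
        image = upwrite(rotate(points, 2 * math.pi * k / steps))
        if all(abs(a - b) < tol for a, b in zip(image, start)):
            count += 1
    return count

def regular_polygon(n):
    """Vertices of a regular n-gon inscribed in the unit circle."""
    return [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
            for k in range(n)]

square_order = symmetry_order(regular_polygon(4))
```

With moments only up to fourth order this sketch cannot resolve rotational symmetries of order higher than four, and with noisy data the exact tolerance test would have to be replaced by a search for near returns, corresponding to the close lobes described above.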
If we consider three dimensional rotations of an object such as an aircraft which is then projected onto a plane, as when we take a video image, we get, via the UpWrite process, a space which may not be a manifold, but which is likely to be one for almost all of the points. If the reader thinks of something like a circle with a diameter included, he will observe that the object is pretty much one dimensional, in that for most points, a sufficiently small neighbourhood of the point will give a set which looks like an interval in $\mathbb{R}$. The two points where the diameter meets the circle are singular, but there are only two of them. For most points of the UpWritten image, a small perturbation derived from a rotation in three dimensions will give another point which, if DownWritten to the image set, will give something qualitatively the same. Imagine an aeroplane or a cube `in general position'. For some views of the cube or the aeroplane, however, the image degenerates into something different from neighbouring views, as for example when the cube is seen along an axis orthogonal to a face and we see a square. There may be discontinuities in the UpWrite at such locations; nevertheless, the space is mainly a manifold, and it may be UpWritten as an entity in its own right. When this is done with distinct classes of objects, we recover at this level separated objects which may conveniently have the symbol designating them replaced by an alphabetic term, since the topology has become trivial. Thus we see that the case of strings and symbols in the usual sense may be recovered as a degenerate case of the more general topological one.