Given a gaussian distribution containing the quadratic form

$$Q(\mathbf{x}) = (\mathbf{x}-\boldsymbol{\mu})^{T}\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu}),$$

the Mahalanobis distance from a point $\mathbf{x}$ to the centre $\boldsymbol{\mu}$ is $\sqrt{Q(\mathbf{x})}$.
A Riemannian metric on the space is specified by assigning a positive definite, symmetric quadratic form to every point of the space. If you have a curve in the space, you can take a set of points along it, compute the Mahalanobis distance of each point from its predecessor, and add these up. Doing this with more and more points gives, in the limit, the length of the curve in the given metric. To actually compute the distance between two points in the given metric, take all curves joining them and find the length of the shortest.
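To make the polygonal approximation concrete, here is a minimal numpy sketch (the names `curve_length` and `metric`, and the covariance `Sigma`, are illustrative, not from the text). It uses a single fixed quadratic form, the inverse covariance of a gaussian; in a genuine Riemannian metric the form would be looked up afresh at each point along the curve.

```python
import numpy as np

def curve_length(points, metric):
    """Approximate the length of a curve, sampled as an array of points,
    by summing the Mahalanobis distance of each point from its
    predecessor under the given positive definite quadratic form."""
    steps = np.diff(points, axis=0)                 # successive displacements
    # length of each step: sqrt(dx^T M dx)
    seg = np.sqrt(np.einsum('ij,jk,ik->i', steps, metric, steps))
    return seg.sum()

# A straight line from the origin to (2, 0), sampled at 100 points,
# measured with the quadratic form of a gaussian stretched along X.
Sigma = np.array([[4.0, 0.0],
                  [0.0, 1.0]])
metric = np.linalg.inv(Sigma)
t = np.linspace(0.0, 1.0, 100)[:, None]
curve = t * np.array([2.0, 0.0])
print(curve_length(curve, metric))                  # 1.0: two units at std. dev. 2
```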
The assignment of a quadratic form to each point of a space constitutes an example of a tensor field, and classical physics is full of them. General Relativity would be impossible without them.
We are not going to obtain a Riemannian metric from a finite data set without a good deal of interpolation, but there are occasions when this might be done. In the case of fig. 1.4, for example, there are grounds for using the same quadratic form everywhere, which is equivalent to squashing the space along the X-axis until the ellipses fitting the data turn into circles, and then using the ordinary Euclidean distance.
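The "squashing" can be realised as a whitening transform; the sketch below assumes a covariance `Sigma` fitted to data like that of fig. 1.4 (the matrix here is invented for illustration). If $\Sigma = LL^{T}$ is a Cholesky factorisation, the map $\mathbf{x} \mapsto L^{-1}\mathbf{x}$ turns the fitted ellipses into circles, and Euclidean distance in the new coordinates equals Mahalanobis distance in the old.

```python
import numpy as np

# Covariance fitted to the data (invented for illustration):
# ellipses elongated along the X-axis.
Sigma = np.array([[4.0, 0.0],
                  [0.0, 1.0]])
L = np.linalg.cholesky(Sigma)          # Sigma = L @ L.T

def mahalanobis(x, y):
    d = np.linalg.solve(L, x - y)      # L^{-1}(x - y): the squashed displacement
    return np.linalg.norm(d)           # plain Euclidean length after squashing

x = np.array([2.0, 0.0])
y = np.array([0.0, 0.0])
print(mahalanobis(x, y))               # 1.0, where the Euclidean distance is 2.0
```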
It is generally quicker to compute Mahalanobis distances (if not to say them) than likelihoods, since we save an exponentiation, and simply computing the Mahalanobis distance to the various centres and choosing the smallest may be defended as (a) quick, (b) using a more defensible metric, and (c) making fewer compromising assumptions. If the determinants of the different forms are all the same, this will give the same answers anyway. And if they do not differ by very much, we can argue that we are kidding ourselves if we pretend they are really known accurately, so why not go for the quick and dirty?
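A sketch of the quick and dirty rule (the names `classify`, `centres` and `inv_covs` are hypothetical): assign a point to whichever centre is nearest in squared Mahalanobis distance, with no exponential and no determinant in sight. The two covariances below are chosen to have equal determinants, the case in which this rule agrees exactly with choosing the greater likelihood.

```python
import numpy as np

def classify(x, centres, inv_covs):
    """Assign x to the centre with the smallest squared Mahalanobis
    distance: no exponentiation and no determinant term needed."""
    d2 = [(x - m) @ P @ (x - m) for m, P in zip(centres, inv_covs)]
    return int(np.argmin(d2))

# Two made-up categories whose covariances have equal determinants.
centres  = [np.array([0.0, 0.0]), np.array([3.0, 0.0])]
inv_covs = [np.linalg.inv(np.array([[2.0, 0.0], [0.0, 0.5]])),
            np.linalg.inv(np.array([[0.5, 0.0], [0.0, 2.0]]))]
print(classify(np.array([1.0, 0.0]), centres, inv_covs))   # 0: nearer first centre
```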
It is not uncommon to take the logarithms of the likelihoods, which amounts, up to sign and an additive constant, to the squared Mahalanobis distance plus a correction term involving the determinant; since the logarithm is a monotone function, this gives precisely the same answers as the likelihood when we are simply choosing the more likely source. The same simplification may be introduced in the case of gaussian mixture models.
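Concretely, for an $n$-dimensional gaussian with centre $\boldsymbol{\mu}$ and covariance $\Sigma$,

$$\log p(\mathbf{x}) = -\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{T}\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu}) \;-\; \tfrac{1}{2}\log\det\Sigma \;-\; \tfrac{n}{2}\log 2\pi,$$

so choosing the greatest likelihood is the same as choosing the smallest value of the squared Mahalanobis distance plus $\log\det\Sigma$. When the determinants all agree, the correction term is the same for every centre and only the Mahalanobis distance matters, which justifies the quick and dirty rule above.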