
Dimension 2

In dimension two it is often convenient, in these days of nice computer graphics, to display the data and the ellipse together on a screen. This can be accomplished by finding a parametric representation of the ellipse.

To see how this may be done, shift back to the origin by subtracting ${ {\bf m}}$ from everything. We now have an ellipse given by

\begin{displaymath}
\{ {\bf x}\in \mathbb{R}^2 : {\bf x}^T {\bf V}^{-1} {\bf x}= 1\} \end{displaymath}

Now suppose ${\bf V}^{-1}$ has a square root ${\bf A}$ which, like ${\bf V}$ and ${\bf V}^{-1}$, is symmetric. Then we can write the above set as

\begin{displaymath}
\{ {\bf x}\in \mathbb{R}^2 : {\bf x}^T {\bf A}^T {\bf A}{\bf x}= 1\} \end{displaymath}

since $ {\bf A}^T {\bf A}= {\bf A}{\bf A}= {\bf V}^{-1} $.

Now if $ {\bf y}= {\bf A}{\bf x}$ we have $ {\bf y}^T {\bf y}= 1$, which means that ${\bf y}$ is on the unit circle, $S^1$, in $\mathbb{R}^2$. We can therefore describe the ellipse as

\begin{displaymath}
\{ {\bf x}\in \mathbb{R}^2 : {\bf y}= {\bf A}{\bf x}\ \& \ {\bf y}\in S^1 \} \end{displaymath}

or

\begin{displaymath}
\{ {\bf x}\in \mathbb{R}^2 : {\bf x}= {\bf A}^{-1} {\bf y}\ \& \ {\bf y}\in S^1 \} \end{displaymath}

or

\begin{displaymath}
\left\{ {\bf x}\in \mathbb{R}^2 : \exists \theta \in [0, 2\pi),\ {\bf x}= {\bf A}^{-1} \left( \begin{array}{c} \cos(\theta)\\ \sin(\theta)\end{array} \right) \right\} \end{displaymath}
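
As a quick check of the parametrisation, consider a worked case. The particular matrix ${\bf V}$ below is chosen purely for illustration and does not come from the text. Take

\begin{displaymath}
{\bf V}= \left( \begin{array}{cc} 4 & 0\\ 0 & 1\end{array} \right), \qquad {\bf A}= \left( \begin{array}{cc} \frac{1}{2} & 0\\ 0 & 1\end{array} \right), \qquad {\bf A}^{-1} = \left( \begin{array}{cc} 2 & 0\\ 0 & 1\end{array} \right) \end{displaymath}

so that ${\bf A}{\bf A}= {\bf V}^{-1}$. The parametrised point is then

\begin{displaymath}
{\bf x}= {\bf A}^{-1} \left( \begin{array}{c} \cos(\theta)\\ \sin(\theta)\end{array} \right) = \left( \begin{array}{c} 2\cos(\theta)\\ \sin(\theta)\end{array} \right) \end{displaymath}

and substituting back gives

\begin{displaymath}
{\bf x}^T {\bf V}^{-1} {\bf x}= \frac{(2\cos(\theta))^2}{4} + \sin^2(\theta) = 1 \end{displaymath}

as required: an ellipse with semi-axes 2 and 1 along the coordinate axes.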

It is easy to draw sets on a computer when they have been parametrised by a single real number. The following few lines of C code indicate how we can trace the path of a circle with time running from 0 to $2\pi $, or at least a discrete approximation to it:

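#include <math.h>

/* putpixel(x, y) is assumed to come from whatever graphics library is in use */
double time;
int int_time, x, y;

for (int_time = 0; int_time < 629; int_time++) {
   time = int_time / 100.0;             /* 100.0 forces floating-point division */
   x = (int)(200 + 100 * cos(time));
   y = (int)(200 - 100 * sin(time));    /* screen y grows downwards */
   putpixel(x, y);
}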
This will draw a circle centred at (200,200) with radius 100. The choice of 629 calls for some explanation, which I leave to the reader.

To draw the ellipse, note that $ {\bf A}^{-1}$ is simply $ {\bf V}^{\frac{1}{2}} $, which we have already seen how to compute. It operates on the points of the unit circle, stretching them in the right way to give a one standard deviation ellipse; adding ${\bf m}$ back to each point restores the original centre. If you want more than one standard deviation, the modification is straightforward.

The result of generating points in the plane according to a few gaussian distributions (well, more or less; it was faked, of course) and displaying them on a computer is shown in Fig. 4.1. Six gaussians were chosen so as to produce a ghostly hand, which may be seen with the eye of faith.


 
Figure 4.1: Simulated gaussians in $\mathbb{R}^2$.
\begin{figure}
\vspace{8cm}
\special {psfile=patrecfig4.1.ps}\end{figure}

The ellipses representing not one but three standard deviations are (mostly) drawn in the next diagram, Fig. 4.2, where they should satisfy even the most captious that they are doing something to represent the distribution of the points in the data set all on their own.


 
Figure 4.2: Ellipses drawn at three standard deviations for the above data.
\begin{figure}
\vspace{8cm}
\special {psfile=patrecfig4.2.ps}\end{figure}


Mike Alder
9/19/1997