next up previous contents
Next: Applications Up: Hopfield Networks Previous: The Network Equations

Theory of the Network

The theory of the network is an extension and formalization of the billiard table model described above. Instead of the surface of the billiard table, we consider an energy surface over the state space of the network.

The state of a network with N units at time t can be described by a vector $(\mu_1(t), \ldots , \mu_N(t))$, where $\mu_i(t)$ is the output of the ith unit at time t. We define the energy at time t to be

\begin{displaymath}
E(t) = - \frac{1}{2} \sum_i \sum_{j\neq i} w_{ij} 
\mu_i(t) \mu_j(t)
 + \sum_i \mu_i(t) \theta_i\end{displaymath}

where $\theta_i$ is the threshold of the ith unit.

The energy of the network is analogous to the potential energy of a physical system. In the billiard table model, it corresponds to the difference between the height of the surface of the table at a point and the height of the nearest pocket. The choice of the energy function makes the Hopfield network essentially the same as a Ising model for spin glasses, which makes it of great interest to physicists.

The operation of the Hopfield network may be summed up as energy minimization. The choice of the network weights ensures that minima of the energy function occur at (or near) points representing exemplar patterns. The formula used to update the states ensures that the energy of each state is no greater than the energy of the preceeding state. The network rolls down the energy surface to the nearest exemplar pattern.

There are two methods for updating the states of the units. In the sequential method, the states of the units are changed one at a time, in random order. In the simultaneous method, the states of the units are all changed at the same time, on the basis of their states after the previous update.

It is easy to show that the sequential updating method decreases the energy at each step. Suppose that at time t, the state of the kth unit is changed. If the change in the state of the ith unit is $\Delta\mu_i(t+1) = \mu_i(t+1)-\mu_i(t)$, then we have $\Delta\mu_i(t+1) = 0$for $i \neq k$, and $\Delta\mu_k(t+1) \neq 0$.

The change in the energy caused by the change in the state of the kth unit is

\begin{displaymath}
\Delta E = - \frac{1}{2} \sum_i \sum_j w_{ij}
 (\mu_i(t+1) \...
 ...\mu_i(t) 
\mu_j(t))
 + \sum_i \theta_i (\mu_i(t+1) - \mu_i(t)).\end{displaymath}

Adding and subtracting $\mu_i(t+1) \mu_j(t)$, and simplifying, we get

\begin{displaymath}
\Delta E = - \frac{1}{2} \sum_i \sum_j w_{ij}
 (\mu_i(t+1) \...
 ...Delta \mu_i(t+1) \mu_j(t))
 + \sum_i \theta_i \Delta\mu_i(t+1),\end{displaymath}

which can be reduced to

\begin{displaymath}
\Delta E = - \Delta\mu_k \left( \sum_i w_{ik} 
\mu_i - \theta_k \right),\end{displaymath}

using the facts that $\Delta\mu_i(t+1) = 0$ for $i \neq k$ and wij = wji.

The term in brackets is the same as the argument of the threshold function of the kth unit, so if its value is positive, the value of $\Delta\mu_k(t+1)$ is also positive, so $\Delta E$ is negative. If the value of the argument of the threshold function is negative, so is $\Delta\mu_k(t+1)$, and again $\Delta E$is negative. Hence each change of state brought about by the sequential updating method reduces the value of the energy function.

The simultaneous updating method also leads to a reduction in the energy at each stage, but this is more difficult to show.

Since the operation of the network ensures that its energy is reduced at each update, it follows that the network will tend towards states that are local minima of the energy function. The positions of these minima are determined by the weights wij. The formula used to determine the weights from the exemplar patterns ensures that local minima of the energy function will occur at positions corresponding to the exemplar patterns, provided the number of exemplars is small in comparison to the number of units. It is usually stated that the number of patterns that can be effectively stored in the network is of the order of 15 per cent of the number of units. This makes the Hopfield network relatively inefficient as a memory device.


next up previous contents
Next: Applications Up: Hopfield Networks Previous: The Network Equations
Mike Alder
9/19/1997