Next: Applications Up: Neocognitron Previous: The Network Equations

Training the Network

The network is trained one layer at a time. For each layer, a set of training patterns is defined, and the desired responses of the units in the layer to those patterns are determined. The patterns are presented one at a time, and the weights on the connections to the units in the layer are adjusted until the units respond as required.
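
The layer-at-a-time procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the threshold units in forward and the perceptron-style correction in train_layer are stand-ins chosen for brevity, not the update rule given by the network equations, and the names forward, train_layer and train_network are hypothetical. The sketch also assumes that the patterns for a layer are presented at the network input and passed through the layers already trained.

```python
def forward(layer, inputs):
    """Outputs of one layer: a unit gives 1 if its weighted sum is positive, else 0."""
    return [1 if sum(w * v for w, v in zip(unit, inputs)) > 0 else 0
            for unit in layer]


def train_layer(layer, patterns, targets, rate=1.0, max_epochs=100):
    """Adjust one layer's weights in place until every unit responds as required.

    The correction used here is a perceptron-style stand-in (an assumption),
    not the neocognitron's own rule. Returns True on success.
    """
    for _ in range(max_epochs):
        errors = 0
        for inputs, desired in zip(patterns, targets):
            actual = forward(layer, inputs)
            for unit, d, a in zip(layer, desired, actual):
                if d != a:
                    errors += 1
                    for i, v in enumerate(inputs):
                        unit[i] += rate * (d - a) * v
        if errors == 0:
            return True
    return False


def train_network(layers, training_sets):
    """Train the layers in order; training_sets[k] is (patterns, targets) for layer k."""
    for k, (patterns, targets) in enumerate(training_sets):
        # Pass each input pattern through the layers that are already trained.
        propagated = []
        for p in patterns:
            for earlier in layers[:k]:
                p = forward(earlier, p)
            propagated.append(p)
        if not train_layer(layers[k], propagated, targets):
            raise RuntimeError("layer %d did not reach its desired responses" % k)
    return layers
```

Each layer here is a list of units, and each unit is a list of weights. A call such as train_network(layers, [(patterns, targets_1), (patterns, targets_2)]) makes the point of the text concrete: a desired response has to be supplied for every unit in every layer before training can start.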

The training procedure differs from that of other networks in two ways. First, as mentioned above, the network is trained one layer at a time. The training is supervised, and is directed towards making the units in each layer behave in a predetermined way for each input pattern. This requires that the behaviour of all the units in all the layers be specified in advance, and it means that the network can only work effectively on the class of patterns on which it was trained. Training for the digits, for example, will require a different procedure from training for ten alphabetic characters.

Second, the training is replicated across the input array. This is most easily explained in relation to the US1 layer, in which twelve units correspond to each unit in the input (U0) layer. The intention of the training process is that the twelve units at every position respond to the same set of input features. To ensure this, changes to the weights are determined for a single position in the input array, say (x0,y0), and then applied to the weights at all positions in the next layer, namely at (x,y,z0) for all x and y. Because the units at every position then carry the same weights, the network is insensitive to variations in the position of the input pattern.
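
A small Python sketch may make the replication concrete. It assumes a single plane z0, a 3 by 3 receptive field, and a weighted sum in place of the real unit equations. The weight change is a simple additive stand-in rather than the network's own rule. The names place, patch, response, replicated_update and plane_output are hypothetical. The point illustrated is only that one set of weights, adjusted at a single position, is used at every position.

```python
FIELD = 3  # assumed receptive field size (hypothetical)


def place(feature, x, y, size=5):
    """Return a size x size image of zeros with `feature` copied in at (x, y)."""
    image = [[0] * size for _ in range(size)]
    for i, row in enumerate(feature):
        for j, v in enumerate(row):
            image[x + i][y + j] = v
    return image


def patch(image, x, y):
    """The FIELD x FIELD window of `image` whose top-left corner is (x, y)."""
    return [row[y:y + FIELD] for row in image[x:x + FIELD]]


def response(kernel, window):
    """Weighted sum over a window; a stand-in for the real unit equations."""
    return sum(k * v for krow, vrow in zip(kernel, window)
               for k, v in zip(krow, vrow))


def replicated_update(kernel, image, x0, y0, rate=1.0):
    """Determine a weight change from the window at (x0, y0) alone.

    The returned kernel is the single shared set of weights for plane z0,
    so the change applies to the units at (x, y, z0) for all x and y.
    """
    window = patch(image, x0, y0)
    return [[k + rate * v for k, v in zip(krow, vrow)]
            for krow, vrow in zip(kernel, window)]


def plane_output(kernel, image):
    """Responses of plane z0 at every position, all using the same kernel."""
    size = len(image) - FIELD + 1
    return [[response(kernel, patch(image, x, y)) for y in range(size)]
            for x in range(size)]
```

For example, if the kernel is updated once on a vertical bar placed at (0,0), and the same bar is then presented at (2,2), the strongest response in plane_output appears at (2,2) with the same value as before. The plane responds to the feature wherever it occurs.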

Once the network is trained, presentation of an input pattern should result in one of the units in the UC4 layer having the largest output value. This unit indicates which of the digits was presented as input.
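
Reading off the answer amounts to finding the UC4 unit with the largest output. The sketch below assumes, purely for illustration, that the UC4 outputs are available as a list and that the unit at index i stands for the digit i. The function name classify is hypothetical.

```python
def classify(uc4_outputs):
    """Return the index of the UC4 unit with the largest output value.

    Assumes unit i corresponds to digit i; on a tie the lowest index wins.
    """
    return max(range(len(uc4_outputs)), key=lambda i: uc4_outputs[i])
```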


Mike Alder
9/19/1997