Let w be a vector from the space R^N, where N is the sum of
the number of weights and the number of biases of the network. Let E be
the error function we want to minimize.
SCG differs from other CGMs in two points: first, each iteration computes
the step size from a second-order approximation of E, scaled by a parameter
lambda in the spirit of the Levenberg-Marquardt approach; second, it avoids the
time-consuming line search performed by other CGMs, instead raising and
lowering lambda, adjusting it at each iteration so that the approximation of the
Hessian remains positive definite. This is the main contribution of SCG to
both the fields of neural learning and optimization theory.
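The update scheme described above can be sketched in code. The following is a minimal, illustrative implementation of Møller's SCG iteration (the function names, the restart-free simplification, and the toy quadratic objective at the bottom are assumptions for the sketch, not part of the original text):

```python
import numpy as np

def scg(E, grad, w, max_iter=200, tol=1e-6):
    """Sketch of scaled conjugate gradient (Møller, 1993)."""
    sigma0 = 1e-4          # finite-difference step scale
    lam, lam_bar = 1e-6, 0.0
    r = -grad(w)           # steepest-descent direction
    p = r.copy()           # initial conjugate direction
    success = True
    delta = 0.0
    for _ in range(max_iter):
        if np.linalg.norm(r) < tol:
            break
        if success:
            # second-order information s ~ H p from a gradient difference
            sigma = sigma0 / np.linalg.norm(p)
            s = (grad(w + sigma * p) - grad(w)) / sigma
            delta = p @ s
        pp = p @ p
        # scale delta with lambda so the Hessian model acts positive definite
        delta = delta + (lam - lam_bar) * pp
        if delta <= 0:
            lam_bar = 2.0 * (lam - delta / pp)
            delta = -delta + lam * pp
            lam = lam_bar
        mu = p @ r
        alpha = mu / delta          # step size, no line search needed
        # comparison parameter: quality of the quadratic approximation
        Delta = 2.0 * delta * (E(w) - E(w + alpha * p)) / mu**2
        if Delta >= 0:              # successful step
            w = w + alpha * p
            r_new = -grad(w)
            lam_bar = 0.0
            success = True
            beta = (r_new @ r_new - r_new @ r) / mu
            p, r = r_new + beta * p, r_new
            if Delta >= 0.75:
                lam *= 0.25         # trust the quadratic model more
        else:                       # reject the step, retry with larger lambda
            lam_bar = lam
            success = False
        if Delta < 0.25:
            lam += delta * (1.0 - Delta) / pp
    return w

if __name__ == "__main__":
    # illustrative quadratic E(w) = 0.5 w^T A w - b^T w, minimum at A^{-1} b
    A = np.array([[3.0, 1.0], [1.0, 2.0]])
    b = np.array([1.0, 1.0])
    E = lambda w: 0.5 * w @ A @ w - b @ w
    grad = lambda w: A @ w - b
    print(scg(E, grad, np.zeros(2)))   # close to [0.2, 0.4]
```

On a quadratic objective the finite-difference product s is exact and the method behaves like ordinary conjugate gradient; on a real network E and grad would be the error function and its backpropagated gradient over the weight vector in R^N.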
SCG has been shown to be considerably faster than standard backpropagation and other CGMs [Mol93].