Network performance measures depend on the problem. If the network has to perform a classification task, it is common to report the percentage of correct classifications (or, equivalently, the percentage of misclassifications as the error). Quite high errors in the individual output activations can be tolerated, as long as each pattern still ends up in the right class. If the network has to match a smooth function, it may be most sensible to calculate the RMS error over all output units.
The most sensible way to progress is to save the output activations
together with target values for the test data and to write a little
program that does whatever testing is required. The files under
are just the ticket. Note that the output patterns are always
saved. The 'include output patterns' option actually means 'include
target (!) patterns'.
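
As a rough illustration of such a "little program", here is a minimal Python sketch of the two measures mentioned above. It assumes the output activations and target values have already been read from the saved file into lists of lists (one inner list per test pattern); the parsing step is left out because it depends on the file layout. The function names and the sample numbers are made up for the example, and the classification rule (the most active output unit wins) is an assumption that may not suit every network.

```python
import math


def classification_error(outputs, targets):
    """Percentage of patterns whose most active output unit
    differs from the most active unit in the target pattern.
    Assumes a winner-takes-all reading of the outputs."""
    wrong = 0
    for out, tgt in zip(outputs, targets):
        if out.index(max(out)) != tgt.index(max(tgt)):
            wrong += 1
    return 100.0 * wrong / len(outputs)


def rms_error(outputs, targets):
    """Root-mean-square error over all output units and all patterns."""
    squared = [
        (o - t) ** 2
        for out, tgt in zip(outputs, targets)
        for o, t in zip(out, tgt)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical test data: three patterns, two output units each.
outputs = [[0.6, 0.4], [0.3, 0.7], [0.55, 0.45]]
targets = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

print("classification error (%):", classification_error(outputs, targets))
print("RMS error:", rms_error(outputs, targets))
```

In the sample data the first pattern is classified correctly although its activations are far from the targets, which is why the percentage measure can tolerate quite high activation errors while the RMS error cannot.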