Algorithm from the Mitchell book p98, Table 4.2

- Begins by constructing a network with the desired number of hidden and output units and initializing all network weights to small random values.
- The main loop of the algorithm then repeatedly iterates over the training examples.
- For each training example, it applies the network to the example, calculates the error of the network output, computes the gradient of the error with respect to the weights, then updates all weights in the network.
- This is repeated until the network performs acceptably well.
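
The steps above can be sketched in plain Python. This is a minimal illustration, not a transcription of the table: it assumes one hidden layer, sigmoid units, squared error, and an update after every example. The function names, the OR training set, the learning rate `eta=0.5`, the weight range of ±0.05, and the stopping threshold `tol` are all choices made for this example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_network(n_in, n_hidden, n_out, rng):
    """Build the network with small random weights (bias weight stored last)."""
    def layer(n_units, n_inputs):
        return [[rng.uniform(-0.05, 0.05) for _ in range(n_inputs + 1)]
                for _ in range(n_units)]
    return {"hidden": layer(n_hidden, n_in), "output": layer(n_out, n_hidden)}


def forward(net, x):
    """Apply the network to one example; return (hidden outputs, network outputs)."""
    xb = list(x) + [1.0]  # constant 1.0 input feeds the bias weight
    h = [sigmoid(sum(w * v for w, v in zip(unit, xb))) for unit in net["hidden"]]
    hb = h + [1.0]
    o = [sigmoid(sum(w * v for w, v in zip(unit, hb))) for unit in net["output"]]
    return h, o


def update(net, x, t, eta):
    """One gradient step on a single example (x, t)."""
    h, o = forward(net, x)
    # Error term for each output unit: o(1 - o)(t - o).
    delta_o = [ok * (1 - ok) * (tk - ok) for ok, tk in zip(o, t)]
    # Error term for each hidden unit, using the output weights before they change.
    delta_h = [hj * (1 - hj) * sum(net["output"][k][j] * delta_o[k]
                                   for k in range(len(o)))
               for j, hj in enumerate(h)]
    xb, hb = list(x) + [1.0], h + [1.0]
    for k, unit in enumerate(net["output"]):
        for j in range(len(unit)):
            unit[j] += eta * delta_o[k] * hb[j]
    for j, unit in enumerate(net["hidden"]):
        for i in range(len(unit)):
            unit[i] += eta * delta_h[j] * xb[i]


def total_error(net, examples):
    """Half the summed squared error over all examples and output units."""
    return 0.5 * sum((tk - ok) ** 2
                     for x, t in examples
                     for tk, ok in zip(t, forward(net, x)[1]))


def train(examples, n_in, n_hidden, n_out, eta=0.5, max_epochs=5000, tol=0.05, seed=0):
    """Repeat passes over the examples until the error is acceptably low."""
    net = init_network(n_in, n_hidden, n_out, random.Random(seed))
    for _ in range(max_epochs):
        for x, t in examples:
            update(net, x, t, eta)
        if total_error(net, examples) < tol:  # assumed stopping rule
            break
    return net


# Hypothetical toy data: the logical OR of two inputs.
OR_EXAMPLES = [((0.0, 0.0), (0.0,)), ((0.0, 1.0), (1.0,)),
               ((1.0, 0.0), (1.0,)), ((1.0, 1.0), (1.0,))]
```

Calling `train(OR_EXAMPLES, n_in=2, n_hidden=2, n_out=1)` returns the trained weights, and `forward(net, x)[1]` gives the network output for an input `x`. "Performs acceptably well" is modeled here as a threshold on the training error; a held-out validation set would be a common alternative.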