Chapter 12: Combining Inductive and Analytical Learning
Using Prior Knowledge to Alter the Search Objective
- Make the network fit a combined function of the training
data and domain theory
TangentProp
- A training example must now be accompanied by its classification
  and one or more training derivatives (each derivative is taken with
  respect to a transformation of the input, such as a translation or
  rotation)
- Take a look at Figure 12.5
- The error term is now modified to
  E = Σ_i [ (target classification - actual classification)² +
            μ Σ_j (target derivative - actual derivative)² ]
  where the outer sum runs over the training examples and the inner
  sum over the transformations supplied for each example
- μ is a constant that determines the relative importance of fitting
  the classifications vs. fitting the derivatives
- See equation 12.1 (a minimal sketch of this error appears at the
  end of this section's notes)
- Table 12.4 compares the performance of TangentProp and Backpropagation when
TangentProp is supplied with the fact that the classification
of a digit is invariant with respect to both vertical and horizontal
translations
- A drawback of TangentProp is that it is not robust to errors in the
  prior knowledge, since μ is fixed in advance rather than adjusted to
  how accurate that knowledge actually is
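- A minimal sketch of the TangentProp error for one example, in
  Python: f_hat is the learned network, transform(x, alpha) applies a
  transformation (e.g., a horizontal translation), and the network's
  derivative at alpha = 0 is approximated by a finite difference (all
  names here are illustrative, not from the chapter):

      def tangent_prop_error(f_hat, x, target, target_deriv, transform,
                             mu=0.1, eps=1e-4):
          # Squared error on the classification itself
          class_term = (target - f_hat(x)) ** 2
          # Derivative of the learned network's output with respect to
          # the transformation parameter, approximated at alpha = 0 by
          # a central finite difference
          actual_deriv = (f_hat(transform(x, eps))
                          - f_hat(transform(x, -eps))) / (2 * eps)
          # mu trades off fitting classifications vs. derivatives
          return class_term + mu * (target_deriv - actual_deriv) ** 2

  Summing this quantity over all training examples (with the inner sum
  over all supplied transformations) gives equation 12.1; for a
  transformation the classification is invariant to, the target
  derivative is 0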
EBNN
- Explanation-Based Neural Network
- Computes the training derivatives itself (rather than requiring the
  user to supply them) by explaining each training example in terms of
  the domain theory and then extracting derivatives from that
  explanation
- Can vary the value of μ for each example
- Given: training data of the form <xi, f(xi)>
- Given: a domain theory represented by a set of previously trained
  neural networks
- Produce: a new neural network that approximates the target function f
- Take a look at Figure 12.7
- The previously learned neural networks are used to calculate the
  training derivatives: how the theory's prediction changes as each
  feature of the example is varied
- Derivatives with a large magnitude indicate a highly relevant
  feature
- If the domain theory explains an example well (its prediction is
  close to the observed value), μ is given a higher value; where the
  theory is inaccurate, μ is lowered
- Thus, EBNN accommodates imperfect domain theories (a sketch of this
  weighting appears after these notes)
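- A minimal sketch of the derivative extraction and per-example μ,
  assuming the domain theory is a single differentiable network
  theory_net with a companion gradient function theory_grad (e.g.,
  obtained by automatic differentiation); the helper names and the
  exact weighting formula are assumptions, not the book's:

      def ebnn_training_info(theory_net, theory_grad, x, f_x, c=1.0):
          # Explain the example: the domain theory's own prediction
          prediction = theory_net(x)
          # Extract training derivatives: how the theory's prediction
          # changes with each input feature; large magnitudes mark
          # highly relevant features
          target_derivs = theory_grad(x)
          # Weight mu per example: high when the theory explains the
          # example well (prediction close to the observed f_x), low
          # otherwise; c is an assumed normalizing constant
          mu = max(0.0, 1.0 - abs(f_x - prediction) / c)
          return target_derivs, mu

  The extracted derivatives and μ then feed a TangentProp-style error
  term when training the new network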
Using Prior Knowledge to Augment Search Operators
FOCL
- An extension of FOIL (chapter 10)
- Learns a set of first-order Horn clauses
- The domain theory is used to generate additional candidate
  specializations: FOCL operationalizes the domain theory into a
  clause body, which competes with FOIL's usual single-literal
  specializations (see the sketch after these notes)
- Take a look at Figure 12.8
- All learned rules must consist of operational features (predicates
  that can be evaluated directly on the training instances)
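- To make operationalization concrete, here is a runnable
  propositional sketch in Python (a simplified rendering of the
  chapter's Cup domain theory; the encoding and function name are
  illustrative): the target concept is expanded through the domain
  theory until only operational literals remain, producing one
  candidate clause body

      def operationalize(literal, domain_theory, operational):
          # Expand a nonoperational literal through the domain theory
          # until only operational literals (those directly evaluable
          # on instances) remain
          if literal in operational:
              return [literal]
          body = []
          for sub_literal in domain_theory[literal]:
              body.extend(
                  operationalize(sub_literal, domain_theory, operational))
          return body

      # Simplified, propositional version of the Cup domain theory
      theory = {
          "Cup": ["Stable", "Liftable", "OpenVessel"],
          "Stable": ["BottomIsFlat"],
          "Liftable": ["Graspable", "Light"],
          "OpenVessel": ["HasConcavity", "ConcavityPointsUp"],
      }
      operational = {"BottomIsFlat", "Graspable", "Light",
                     "HasConcavity", "ConcavityPointsUp"}

      print(operationalize("Cup", theory, operational))
      # -> ['BottomIsFlat', 'Graspable', 'Light',
      #     'HasConcavity', 'ConcavityPointsUp']

  FOCL prunes literals from such a body when doing so improves
  performance on the training examples, then scores the result against
  FOIL's candidates using the same information gain measure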