Backpropagation



Consider the following model:
  input
  layer     layer 1    layer 2  ...  layer l  ...  layer L

  v_1 ---> O ------\
                    O  ...          O             O ---> o_1
  v_2 ---> O -----/
   .                O  ...          O             O ---> o_2
   .                .                              .
   .                .                              .
  v_N ---> O ------ O  ...          O             O ---> o_M
where there are N inputs, v_i, L layers, N_l nodes in each layer l, M outputs, o_i, and M target output values, t_i. Each node i in layer l has a linear activation function, s_i^l = A_i^l' v, where A_i^l is a weight vector, and a transfer function, y_i^l = f_i^l(s_i^l). Note that the output of a node in a hidden layer is denoted y rather than o.
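The forward pass through this model can be sketched in code. This is a minimal illustration under stated assumptions, not part of the original notes: each layer is represented by a weight matrix whose rows are the weight vectors A_i^l, the linear activation is the matrix-vector product, and the transfer function f is applied elementwise (all function and variable names here are illustrative).

```python
import numpy as np

def forward(v, weights, f):
    """Sketch of the forward pass described above (illustrative names).

    v       : input vector of length N
    weights : list of L matrices; layer l's matrix has one row A_i^l
              per node i in that layer
    f       : transfer function, applied elementwise
    """
    y = v
    for A in weights:
        s = A @ y   # linear activation s_i^l = A_i^l' y^(l-1)
        y = f(s)    # transfer function  y_i^l = f(s_i^l)
    return y        # outputs o_i of the final layer L

# Example: N = 3 inputs, one hidden layer of 2 nodes, M = 1 output,
# with tanh as the (assumed) transfer function.
rng = np.random.default_rng(0)
weights = [rng.standard_normal((2, 3)), rng.standard_normal((1, 2))]
o = forward(np.array([1.0, 0.5, -0.5]), weights, np.tanh)
print(o.shape)
```

Note that the loop carries the hidden-layer outputs y forward as the inputs to the next layer, matching the distinction above between hidden outputs y and network outputs o.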

Now define the following: For an output node,