Fully Connected

Forward

  • $z_i^t=\sum_j w_{ij}S^{(t-1)}_j$
  • $S^t_i=g(z^t_i)$, where $g$ is the activation function
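
The forward pass above can be sketched in plain Python. This is only an illustration: the names `forward` and `sigmoid`, the nested-list layout `w[i][j]`, and the choice of a sigmoid for $g$ are assumptions, not part of the notes.

```python
import math

def sigmoid(x):
    # Assumed activation g; any differentiable function would do.
    return 1.0 / (1.0 + math.exp(-x))

def forward(w, s_prev, g=sigmoid):
    """One fully connected layer.

    w[i][j] is the weight from input unit j to output unit i,
    s_prev[j] is S^{(t-1)}_j. Returns (z, s) with z[i] = z^t_i, s[i] = S^t_i.
    """
    # z_i = sum_j w_ij * S^{(t-1)}_j
    z = [sum(w_ij * s_j for w_ij, s_j in zip(row, s_prev)) for row in w]
    # S_i = g(z_i)
    s = [g(z_i) for z_i in z]
    return z, s

# Hypothetical numbers: 2 inputs, 3 outputs.
w = [[0.5, -1.0], [2.0, 0.0], [0.0, 0.0]]
s_prev = [1.0, 0.5]
z, s = forward(w, s_prev)
```

Returning `z` alongside `s` is a deliberate choice: the backward pass needs $z^t_i$ to evaluate $g'(z^t_i)$.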

Backward

We already know ${\delta E\over\delta S^t_i}$ from the child layer (the layer that takes $S^t$ as input), so we can compute:

  • ${\delta E\over\delta z^t_i}={\delta S^t_i\over\delta z^t_i}{\delta E\over\delta S^t_i}
    =g'(z^t_i){\delta E\over\delta S^t_i}
    $
  • ${\delta E\over\delta S^{(t-1)}_i}
    =\sum_j{\delta z^t_j\over \delta S^{(t-1)}_i}{\delta E\over\delta z^{t}_j}
    =\sum_j{w_{ji}}{\delta E\over\delta z^{t}_j}
    $
  • ${\delta E\over\delta w_{ij}}
    ={\delta z^t_i\over \delta w_{ij}}{\delta E\over \delta z^t_i}
    ={S_j^{(t-1)}}{\delta E\over \delta z^t_i}
    $
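
The three backward formulas can be sketched the same way. This is a minimal illustration under stated assumptions: the function names, the sigmoid activation, the nested-list weight layout, and the toy error $E={1\over 2}\sum_i (S^t_i)^2$ (used only to obtain a concrete ${\delta E\over\delta S^t_i}$) are all hypothetical choices, not taken from the notes.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_prime(x):
    # g'(z) for the assumed sigmoid activation
    y = sigmoid(x)
    return y * (1.0 - y)

def forward(w, s_prev):
    z = [sum(w_ij * s_j for w_ij, s_j in zip(row, s_prev)) for row in w]
    return z, [sigmoid(z_i) for z_i in z]

def backward(w, s_prev, z, dE_ds):
    """Given dE/dS^t (from the child layer), return the three gradients."""
    n_out, n_in = len(w), len(s_prev)
    # dE/dz_i = g'(z_i) * dE/dS_i
    dE_dz = [sigmoid_prime(z[i]) * dE_ds[i] for i in range(n_out)]
    # dE/dS^{(t-1)}_i = sum_j w_ji * dE/dz_j  (passed on to the parent layer)
    dE_ds_prev = [sum(w[j][i] * dE_dz[j] for j in range(n_out))
                  for i in range(n_in)]
    # dE/dw_ij = S^{(t-1)}_j * dE/dz_i
    dE_dw = [[s_prev[j] * dE_dz[i] for j in range(n_in)]
             for i in range(n_out)]
    return dE_dz, dE_ds_prev, dE_dw

def loss(w, s_prev):
    # Toy error E = 1/2 * sum_i S_i^2, so dE/dS_i = S_i (assumption).
    _, s = forward(w, s_prev)
    return 0.5 * sum(s_i ** 2 for s_i in s)

# Hypothetical numbers: 2 inputs, 2 outputs.
w = [[0.5, -1.0], [2.0, 0.3]]
s_prev = [1.0, 0.5]
z, s = forward(w, s_prev)
dE_dz, dE_ds_prev, dE_dw = backward(w, s_prev, z, dE_ds=s)
```

A common sanity check is to compare `dE_dw[i][j]` against a finite-difference estimate, obtained by evaluating `loss` with `w[i][j]` nudged up and down by a small step; the two should agree to several decimal places.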