Why is the partial derivative of $a_3$ w.r.t. $w_3$ equal to $f_2$? If I missed anything, please let me know which slides I can refer back to. Thank you!
motoole2
In this case, we want to compute the partial derivative of the loss with respect to $w_3$ (which is not the same as $f_2$, to be clear). Therefore, the last partial derivative in the chain will be computed with respect to $w_3$.
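To illustrate why that last factor comes out as $f_2$: assuming the layer has the usual form $a_3 = w_3 f_2 + b_3$ (an assumption here, since the thread does not restate the definition from the slides), differentiating with respect to $w_3$ treats $f_2$ and $b_3$ as constants, so $\partial a_3 / \partial w_3 = f_2$. A rough numerical check of this, with hypothetical function names and made-up values:

```python
def pre_activation(w3, f2, b3=0.0):
    # Assumed layer form (not confirmed by the thread): a3 = w3 * f2 + b3
    return w3 * f2 + b3


def numerical_partial_w3(w3, f2, b3=0.0, eps=1e-6):
    # Central finite difference in w3, holding f2 and b3 fixed.
    plus = pre_activation(w3 + eps, f2, b3)
    minus = pre_activation(w3 - eps, f2, b3)
    return (plus - minus) / (2 * eps)


# Made-up values purely for illustration.
w3, f2, b3 = 0.7, 1.5, -0.2
grad = numerical_partial_w3(w3, f2, b3)

# The estimate should match f2, not w3.
print(grad, f2)
```

Under that assumed form, the estimated derivative tracks $f_2$ regardless of the value chosen for $w_3$, which is what the last term in the chain is expressing.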