tpbui

What is the intuition behind moving in the opposite direction of the gradient? From the next few slides, I understand that the gradient gives the change in the loss function per unit change in each weight. Since we want to minimize the loss function, we need to move in the opposite direction to cancel out the change.

motoole2

The gradient of a function points in the direction of steepest ascent. For example, in this slide, we could represent the landscape using a function $f(x,y)$. The gradient, given by $\nabla f = [\partial f/\partial x, \partial f/\partial y] = [u, v]$, is a 2D direction along which $f(x,y)$ increases fastest. To first order, a small step $\epsilon d$ changes $f$ by about $\epsilon \, \nabla f \cdot d$, which is most negative when $d$ points along $-\nabla f$. In our case, we want to minimize the function, so we step in the direction opposite the gradient (i.e., in the steepest descent direction).
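
To make this concrete, here is a minimal sketch of stepping against the gradient. The bowl-shaped function $f(x,y) = x^2 + 2y^2$, the starting point, the step size, and the function names are all assumptions chosen for illustration; none of them come from the slide.

```python
def f(x, y):
    # Hypothetical landscape: a simple bowl with its minimum at (0, 0).
    return x**2 + 2 * y**2


def grad_f(x, y):
    # Analytic gradient [df/dx, df/dy] = [u, v] of the bowl above.
    return (2 * x, 4 * y)


def gradient_descent(x, y, step_size=0.1, num_steps=50):
    # Repeatedly step opposite the gradient and record f after each step.
    history = [f(x, y)]
    for _ in range(num_steps):
        u, v = grad_f(x, y)
        x -= step_size * u  # move against df/dx
        y -= step_size * v  # move against df/dy
        history.append(f(x, y))
    return x, y, history


# Start away from the minimum (assumed starting point).
x_min, y_min, history = gradient_descent(3.0, -2.0)
print(x_min, y_min, history[0], history[-1])
```

With this step size, the value of $f$ drops at every step and the point approaches $(0, 0)$. Flipping the sign of the update (stepping along the gradient) would instead make $f$ grow, which is the steepest ascent behavior described above. A step size that is too large can overshoot the minimum, so the choice of 0.1 here only suits this particular function.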