What is the intuition behind moving in the opposite direction of the gradient? From the next few slides, I understand that the gradient is the change in the loss function per unit change in each weight. Since we want to minimize the loss function, we need to move in the opposite direction to cancel out the change.
motoole2
The gradient of a function points in the direction of steepest ascent. For example, in this slide, we could represent the landscape using a function $f(x,y)$. The gradient, given by $[\partial f/\partial x, \partial f/\partial y] = [u, v]$, is a 2D direction along which the value of $f(x,y)$ increases fastest. In our case, we want to minimize the function, so we step in the direction opposite to the gradient (i.e., in the steepest descent direction).
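To make this concrete, here is a minimal sketch in Python. The landscape $f(x,y) = x^2 + 2y^2$, the starting point, and the step size `lr` are all assumptions picked for illustration; they are not taken from the slides. It compares one small step along the gradient with one small step against it, then repeats the downhill step a number of times.

```python
def f(x, y):
    # Hypothetical bowl-shaped landscape, chosen only for illustration.
    return x**2 + 2 * y**2

def grad_f(x, y):
    # Analytic gradient [df/dx, df/dy] = [u, v] of the f above.
    return 2 * x, 4 * y

def step(x, y, lr, sign):
    # sign = +1 steps along the gradient (ascent),
    # sign = -1 steps against it (descent).
    u, v = grad_f(x, y)
    return x + sign * lr * u, y + sign * lr * v

def descend(x, y, lr=0.1, steps=50):
    # Repeatedly step against the gradient (plain gradient descent).
    for _ in range(steps):
        x, y = step(x, y, lr, -1)
    return x, y

x0, y0 = 1.0, 1.0   # assumed starting point
lr = 0.1            # assumed step size

up = step(x0, y0, lr, +1)
down = step(x0, y0, lr, -1)

print("start:            ", f(x0, y0))
print("along gradient:   ", f(*up))      # larger than the start value
print("against gradient: ", f(*down))    # smaller than the start value
print("after 50 descent steps:", f(*descend(x0, y0)))
```

For this particular function and step size, the step against the gradient lowers $f$ and the step along it raises $f$. Note that this only holds for a small enough step: the gradient describes the local slope, so a very large step can overshoot the minimum.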