I understand that a big $C$ allows the hyperplane to 'wiggle' a lot, giving it more flexibility. However, I'm confused as to why this means a small margin.
mpotoole
@raymondx It's the opposite, actually! A small $C$ allows the hyperplane to wiggle (producing larger margins). Let me explain.
Consider the extreme case where $C = 0$. In this scenario, the solution to our optimization problem is $\xi_i = \infty$ and $w = 0$. This is because the objective no longer depends on the value of $\xi_i$, so the slack is free: making $\xi_i$ arbitrarily large (effectively infinite) satisfies every constraint and lets $w$ shrink all the way to 0.
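To make the $C = 0$ case concrete, here is a sketch that assumes the slide uses the standard soft-margin objective. The thread does not restate it, so the exact notation (including the bias term $b$) is an assumption:

$$
\min_{w,\,b,\,\xi}\ \tfrac{1}{2}\|w\|^2 + C\sum_i \xi_i
\quad \text{s.t.} \quad y_i\,(w^\top x_i + b) \ge 1 - \xi_i,\quad \xi_i \ge 0 .
$$

Setting $C = 0$ drops the slack term, leaving only $\tfrac{1}{2}\|w\|^2$, which is minimized at $w = 0$; the constraints then only require $\xi_i \ge 1 - y_i b$. Since the margin width is $2/\|w\|$, it grows without bound as $w \to 0$, which is why a small $C$ goes with a large margin.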
When $C = \infty$, every $\xi_i$ is forced to 0 (otherwise, the objective would also have a value of $\infty$), so no slack is allowed. The optimization formulation then reduces to the one shown on this slide.
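The two extremes can also be checked numerically. Below is a minimal sketch, not a real SVM solver: it assumes the standard soft-margin objective $\tfrac{1}{2}w^2 + C\sum_i \xi_i$ with $\xi_i = \max(0,\ 1 - y_i(w x_i + b))$, uses a made-up 1-D dataset, and minimizes by brute-force grid search. The data and the helper names `objective`, `fit`, and `margin` are all hypothetical.

```python
# Toy 1-D data (made up for illustration): the positive point at x = -1
# sits close to the negative class.
X = [-3.0, -2.0, -1.0, 2.0, 3.0]
Y = [-1, -1, 1, 1, 1]


def objective(w, b, C):
    # At the optimum, each slack equals the hinge loss:
    # xi_i = max(0, 1 - y_i * (w * x_i + b)).
    slack = sum(max(0.0, 1.0 - y * (w * x + b)) for x, y in zip(X, Y))
    return 0.5 * w * w + C * slack


def fit(C):
    # Brute-force grid search over (w, b); fine for a toy problem only.
    best = None
    for i in range(0, 151):            # w in [0, 3], step 0.02
        w = i / 50
        for j in range(-200, 201):     # b in [-4, 4], step 0.02
            b = j / 50
            val = objective(w, b, C)
            if best is None or val < best[0]:
                best = (val, w, b)
    return best[1], best[2]


def margin(w):
    # Margin width is 2 / |w|; it is unbounded when w = 0.
    return float("inf") if w == 0 else 2.0 / abs(w)


for C in (0.1, 100.0):
    w, b = fit(C)
    print(f"C={C}: w={w:.2f}, b={b:.2f}, margin width={margin(w):.2f}")
```

With $C = 100$ the search lands on $w = 2$, $b = 3$: no slack is used and the margin (width 1) is squeezed between $x = -2$ and $x = -1$, matching the hard-margin picture. With $C = 0.1$ it settles on a smaller $w$, so the margin is wider, and it pays for that with nonzero slack.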