Previous | Next --- Slide 194 of 217
Back to Lecture Thumbnails
ADog

I'm a little confused about the input to the weighting function k. I have a few questions:

  1. Is this assuming that the pixel coordinates $x_n$ are centered around (0,0)?
  2. If the answer to the question above is yes, then it would also be equivalent to do $y + x_n$ instead of $y - x_n$? If the answer to above is no, why are we doing $y - x_n$?
  3. I remember in class today that the point of the bandwidth is to control/weight what range of pixels are searched. But if our goal is to find the location s.t. p(y) and q are the most similar, why isn't there a bandwidth for q in the previous slide? Doesn't the absence of the h in the previous slide mean that even if all the pixels were exactly aligned between p(y) and q, they would still be different because the distances are weighted differently, which could introduce error?

Thanks!

motoole2
  1. Not quite. $x_n$ are pixel coordinates. In the previous slide, it was implicitly assumed that the center of the target image was $(0,0)$. But here, $x_n$ does not necessary need to be centered around $(0,0)$.

  2. To be clear, we're weighing pixel values according to a function $k(r)$ where $r$ is proportional to the distance between two pixel coordinates $x_n$ and $y$. $y - x_n$ is correct here; $y + x_n$ would not be correct (doesn't measure distance).

  3. There are two things to be tuned here. There's the definition of the profile $k$ itself, and there's also the value used for $h$. Both have an impact here, and are chosen according to (i) the size of the object being tracked and (ii) the speed of the object moving.

Let's take a quick look at this slide, describing our kernel density estimate (KDE) of our objective function. If $h$ is too small, then our estimate of the PDF is going to be too "peaky", and there's a good chance of getting stuck in local minima. If $h$ is too large, then our estimate of the PDF is going to be too smooth, resulting in less accurate tracking.