Computer Vision (CMU 16-385)

This page contains lecture slides and recommended readings for the Fall 2020 offering of 16-385.

(Overview of computer vision)
(Image transformations, point image processing, linear shift-invariant image filtering, convolution, image gradients)
Basic reading:
(Image downsampling, aliasing, Gaussian image pyramid, Laplacian image pyramid, Fourier series, frequency domain, Fourier transform, frequency-domain filtering, sampling)
Basic reading:
Additional reading:
(Finding boundaries, line fitting, line parameterization, Hough transform, Hough circles)
Basic reading:
(Visualizing quadratics, Harris corner detector, multi-scale detection)
Basic reading:
(Designing feature descriptors, MOPS descriptor, GIST descriptor, Histogram of Textons descriptor, HOG descriptor, SIFT)
Basic reading:
(2D transformations, projective geometry, classification of 2D transformations, determining unknown 2D transformations)
Basic reading:
Additional reading:
  • Hartley and Zisserman, "Multiple View Geometry in Computer Vision", Cambridge University Press 2004. A comprehensive treatment of all aspects of projective geometry relating to computer vision, and also a very useful reference for the second part of the class.
  • Richter-Gebert, "Perspectives on projective geometry", Springer 2011. A beautiful, thorough, and very accessible mathematics textbook on projective geometry (available online for free from CMU's library).
(Panoramas, image homographies, computing with homographies, direct linear transform (DLT), random sample consensus (RANSAC))
Basic reading:
Additional reading:
(Pinhole camera, accidental pinholes, camera matrix)
Basic reading:
Additional reading:
(Review of camera matrix, perspective, other camera models, pose estimation)
Basic reading:
(Triangulation, epipolar geometry, essential matrix, fundamental matrix, 8-point algorithm)
Basic reading:
(Revisiting triangulation, disparity, stereo rectification, stereo matching, improved stereo matching)
Basic reading:
(Appearance phenomena, measuring light and radiometry, reflectance and BRDF)
Basic reading:
  • Szeliski textbook, Section 2.2
  • Steven Gortler, "Foundations of 3D Computer Graphics", Chapter 21. This book has a great introduction to radiometry, reflectance, and their use for image formation.
(Notes about radiometry, the n-dot-l model, photometric stereo, uncalibrated photometric stereo, generalized bas-relief ambiguity, shape from shading)
Basic reading:
  • Szeliski textbook, Section 2.2
  • Steven Gortler, "Foundations of 3D Computer Graphics", Chapter 21. This book has a great introduction to radiometry, reflectance, and their use for image formation.
(Imaging sensor primer, color sensing in cameras, in-camera image processing pipeline, radiometric calibration)
Basic reading:
  • Szeliski textbook, Section 2.3
  • Michael Brown, "Understanding the In-Camera Image Processing Pipeline for Computer Vision", CVPR 2016. A very detailed discussion of issues relating to color photography and color management; slides are available online.
  • Nine Degrees Below. An amazing resource for color photography, reproduction, and management.
(Introduction to learning-based vision, image classification, bag-of-words, K-means clustering, classification, K-nearest neighbors, naive Bayes, support vector machines)
Basic reading:
(Perceptron, neural networks, training perceptrons, gradient descent, backpropagation, stochastic gradient descent)
Basic reading (No standard textbooks yet!):
(Some notes on optimization, convolutional neural networks, training ConvNets)
Basic reading (No standard textbooks yet!):
(Intro to vision for video, optical flow, constant flow, Horn-Schunck flow)
Basic reading:
(Motion magnification using optical flow, image alignment, Lucas-Kanade alignment, Baker-Matthews alignment, inverse alignment, KLT tracking, mean-shift tracking, modern trackers)
Basic reading:
(Segmentation, image as a graph, shortest graph paths and intelligent scissors, GrabCut)
Basic reading:
(Computational cameras, computational displays, light transport matrices)