The dark side of the code...: Programming Computer Vision with Python

What a great book!!!!!

I know I'm not impartial because I like computer vision, but this book really is a great supplement to the computer vision classes you might take in college. The choice of language is also a good one, both for the number of programmers who use it and for how fluently the resulting programs read. It should be said that this book is for intermediate students who want to try some more advanced computer vision topics that specific classes usually don't cover. If you are a mathematically minded person who likes formulas and proofs and wants to know more about the theory behind an example, it would be better to complement this book with others (Multiple View Geometry comes to mind, for example) that explain the underlying theory in much more detail.

Below you can find a brief explanation of each chapter in the book:

Chapter 1 - Basic Image Handling and Processing
This chapter simply explains what functionality is available for handling images in Python using the PIL library. It also introduces the other libraries used: NumPy, SciPy, and Matplotlib.

Chapter 2 - Local Image Descriptors
After a brief explanation of what local image descriptors are, the chapter explains the Harris corner detector and the SIFT descriptor. The chapter's example builds a graph of the relations between geotagged images retrieved from Panoramio.

Chapter 3 - Image to Image Mappings
In this chapter, homographies and affine transformations are explained. The Direct Linear Transformation (DLT) algorithm for finding a homography from corresponding points is covered. For affine transformations, a faster way to find the matrix from fewer correspondences is described. The chapter's example guides the reader through building a panorama from multiple images. Alongside this example, the RANSAC and Delaunay triangulation algorithms are briefly explained.
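
To give an idea of what a homography does once you have it, here is a minimal sketch of my own (not code from the book). It maps a 2D point through a 3x3 matrix using homogeneous coordinates. The function name `apply_homography` and the two example matrices are made up for illustration, and it uses plain Python lists instead of NumPy to stay self-contained.

```python
def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography H given as nested lists."""
    x, y = point
    # Lift the point to homogeneous coordinates (x, y, 1) and multiply by H.
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the last coordinate to get back to image coordinates.
    return (xh / w, yh / w)


# Hypothetical example matrices, chosen only to make the arithmetic easy.
# A pure translation is also an affine transformation (last row is 0, 0, 1).
H_translate = [[1.0, 0.0, 10.0],
               [0.0, 1.0, -5.0],
               [0.0, 0.0, 1.0]]

# A projective warp: the last row is no longer (0, 0, 1),
# so the division by w actually changes the result.
H_projective = [[1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0],
                [0.5, 0.0, 1.0]]

print(apply_homography(H_translate, (2.0, 3.0)))   # (12.0, -2.0)
print(apply_homography(H_projective, (2.0, 4.0)))  # (1.0, 2.0)
```

Estimating H from point correspondences (the DLT part) needs a linear solver such as an SVD, which is where NumPy comes in.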

Chapter 4 - Cameras and Augmented Reality
A really interesting chapter on modeling cameras through their projection properties and using these properties to create an augmented reality application. Camera calibration is also explained. The chapter closes with an augmented reality example that uses PyOpenGL to render a model in the image, correctly aligned to a reference image.
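
The camera model boils down to projecting a 3D point X to the image as x = K [R | t] X. Below is a small sketch of that formula, written by me for this post and not taken from the book. The function name `project` and all the numbers (focal length 800, principal point at 320, 240) are assumptions picked for illustration.

```python
def project(K, R, t, X):
    """Project a 3D point X with a pinhole camera x = K [R | t] X."""
    # Move the point from world coordinates into the camera frame.
    Xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    # Apply the intrinsic calibration matrix K.
    u = [sum(K[i][j] * Xc[j] for j in range(3)) for i in range(3)]
    if u[2] == 0:
        raise ValueError("point lies in the plane of the camera center")
    # Normalize the homogeneous image point.
    return (u[0] / u[2], u[1] / u[2])


# Hypothetical intrinsics: focal length 800 px, principal point (320, 240).
K = [[800.0, 0.0, 320.0],
     [0.0, 800.0, 240.0],
     [0.0, 0.0, 1.0]]
# Camera at the world origin, looking down the z axis.
R = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 0.0]

print(project(K, R, t, (0.0, 0.0, 5.0)))    # (320.0, 240.0), the principal point
print(project(K, R, t, (0.5, -0.25, 2.0)))  # (520.0, 140.0)
```

A point on the optical axis lands on the principal point, which is a quick sanity check for any K you calibrate.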

Chapter 5 - Multiple View Geometry
A chapter about epipolar geometry and 3D reconstruction from multiple images, and the different matrices that describe the relation between views. It also explains the theory behind stereo imaging systems and concludes with an example of 3D reconstruction.

Chapter 6 - Clustering Images
This chapter introduces some clustering methods (such as K-means) and explains how to use them to find relations between groups of images. Techniques such as hierarchical and spectral clustering are explained and used to divide images into related groups.
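
For readers who have never seen K-means, this is roughly how it works. It is a bare-bones sketch of my own and not the SciPy-based code the book relies on. The helper names `assign` and `kmeans`, the toy 2D points (stand-ins for real image feature vectors) and the hand-picked starting centroids are all assumptions for illustration.

```python
def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    labels = []
    for p in points:
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
        labels.append(dists.index(min(dists)))
    return labels


def kmeans(points, centroids, iterations=10):
    """Plain K-means: alternate assignment and centroid update steps."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        labels = assign(points, centroids)
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                # New centroid is the mean of the points assigned to it.
                updated.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                # Keep an empty cluster's centroid where it was.
                updated.append(old)
        if updated == centroids:
            break  # converged
        centroids = updated
    return centroids, assign(points, centroids)


# Toy data: two well-separated groups of 2D points.
points = [(0.0, 0.0), (0.0, 2.0), (2.0, 0.0), (2.0, 2.0),
          (10.0, 10.0), (10.0, 12.0), (12.0, 10.0), (12.0, 12.0)]
centroids, labels = kmeans(points, [(0.0, 0.0), (12.0, 12.0)])
print(centroids)  # [(1.0, 1.0), (11.0, 11.0)]
print(labels)     # [0, 0, 0, 0, 1, 1, 1, 1]
```

With real images you would first turn each image into a feature vector and then cluster those vectors the same way.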

Chapter 7 - Searching Images
Finding an image quickly is a problem that has to be solved in any application with a huge knowledge base, and sometimes dimensionality reduction techniques such as PCA alone are not enough. In this chapter, techniques borrowed from text mining are used to find "visual words" inside images that can easily be looked up in a "vocabulary".

Chapter 8 - Classifying Image Content
Classification is the process of deciding which class a sample belongs to, based on some measurement taken on it. Starting with an explanation of the K-nearest neighbors algorithm, the chapter goes on to describe how to use dense SIFT features as the classification measure, together with Bayes and SVM classifiers. The chapter ends with an interesting example on solving sudokus after classifying all the given digits.
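
The K-nearest neighbors idea fits in a few lines. The sketch below is my own illustration and not the book's classifier. The function name `knn_classify`, the labels "a" and "b", and the toy 2D features (where a real pipeline would use something like dense SIFT vectors) are all made up.

```python
from collections import Counter


def knn_classify(train, query, k=3):
    """Label a query vector by majority vote among its k nearest training samples.

    train is a list of (feature_vector, label) pairs.
    """
    def squared_distance(item):
        features, _ = item
        return sum((a - b) ** 2 for a, b in zip(features, query))

    # Keep the k training samples closest to the query.
    neighbors = sorted(train, key=squared_distance)[:k]
    # Count the labels among those neighbors and return the most frequent one.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Toy training set with two classes.
train = [((1.0, 1.0), "a"), ((1.5, 2.0), "a"), ((2.0, 1.0), "a"),
         ((8.0, 8.0), "b"), ((9.0, 8.5), "b"), ((8.5, 9.5), "b")]

print(knn_classify(train, (1.2, 1.4)))  # a
print(knn_classify(train, (8.7, 8.9)))  # b
```

Using an odd k with two classes avoids tied votes.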

Chapter 9 - Image Segmentation
Segmenting an image consists of separating it into regions that are likely to have a similar meaning (e.g. background vs. foreground). The techniques used are graph cuts, segmentation with clustering, and variational methods.

Chapter 10 - OpenCV
A very straightforward chapter about the functionality provided by the OpenCV library.

To conclude, this book really is great. As someone interested in computer vision, I found that it explains the newest and coolest methods that can be used to build powerful and appealing applications in the era of powerful smartphones and huge, freely available image datasets.

Friday, August 24, 2012

Programming Computer Vision with Python - Book Review
