Jehoshaphat I. Abu


100 Days Of ML Code — Day 040

Aug 18, 2018


Recap From Day 039

In day 039, we continued with video features: Fiducials. We learned that “Fiducial is an object placed in the field of view of an imaging system which appears in the image produced, for use as a point of reference or a measure. It may be either something placed into or on the imaging subject, or a mark or set of marks in the reticle of an optical instrument.”

Today, we will continue from where we left off in day 039.

Video Features Continued

Haar Cascades

“Object Detection using Haar feature-based cascade classifiers is an effective object detection method proposed by Paul Viola and Michael Jones in their paper, “Rapid Object Detection using a Boosted Cascade of Simple Features” in 2001. It is a machine learning based approach where a cascade function is trained from a lot of positive and negative images. It is then used to detect objects in other images.” — OpenCV

A Haar cascade is basically a classifier used to detect, in a source image or video, the object it has been trained on. It is trained by superimposing the positive image over a set of negative images. Training is generally done on a server and in several stages. Better results are obtained by using high-quality images and increasing the number of stages for which the classifier is trained.

The OpenCV library comes with a very nice method for object detection called Haar cascades, which allows us to find certain types of objects in images without using any fiducials. It comes with a Haar cascade for finding human faces, for instance, which we can find really useful.

The algorithm needs a lot of positive images (images of faces) and negative images (images without faces) to train the classifier. Then we need to extract features from them. For this, the Haar features shown in the image below are used. Each feature is a single value obtained by subtracting the sum of pixels under the white rectangle from the sum of pixels under the black rectangle.
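To make the "black minus white" idea concrete, here is a minimal sketch in plain Python. It computes rectangle sums with an integral image (summed-area table), the trick the Viola-Jones paper uses to make this fast. The function names, the (top, left, height, width) rectangle layout, and the tiny 4x4 "image" are all made up for illustration and are not OpenCV's internals.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column.

    ii[r][c] holds the sum of all pixels above row r and left of column c.
    """
    height, width = len(img), len(img[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for r in range(height):
        row_sum = 0
        for c in range(width):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, rect):
    """Sum of pixels inside rect = (top, left, height, width), using 4 lookups."""
    top, left, height, width = rect
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_two_rect(ii, black, white):
    """Two-rectangle Haar feature: sum under black minus sum under white."""
    return rect_sum(ii, black) - rect_sum(ii, white)


# Hypothetical 4x4 grayscale patch: dark band on top, bright band below.
image = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [200, 200, 200, 200],
    [200, 200, 200, 200],
]
ii = integral_image(image)

# Black rectangle over the dark top half, white rectangle over the bright bottom half.
feature_value = haar_two_rect(ii, black=(0, 0, 2, 4), white=(2, 0, 2, 4))
print(feature_value)
```

A patch with no contrast between the two rectangles gives a value of zero, while a strong dark-over-bright pattern gives a value with a large magnitude. That single number is what the classifier thresholds.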


If we want to count how many people are standing in front of a camera, or use the position of someone’s face in front of the camera to control something, a Haar cascade is a great method. It’s fast enough to use for real-time control, and it doesn’t require any calibration or tuning.
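As a rough sketch of that idea, the helper below turns a list of detected face boxes into a head count and a horizontal position we could use as a control value. It assumes the boxes come as (x, y, w, h) tuples in pixels, which is the form OpenCV's detectMultiScale returns. The function name and the choice to follow the largest face are my own assumptions.

```python
def face_control_signal(boxes, frame_width):
    """Return (face_count, x_position) for one video frame.

    x_position is the horizontal center of the largest face, scaled to
    the range 0.0 (left edge) to 1.0 (right edge), or None if no face is found.
    """
    # len() is used instead of "not boxes" so this also works with NumPy arrays.
    if len(boxes) == 0:
        return 0, None
    # Assume the largest box is the person closest to the camera.
    x, y, w, h = max(boxes, key=lambda box: box[2] * box[3])
    return len(boxes), (x + w / 2) / frame_width


# Two hypothetical detections in a 640-pixel-wide frame.
count, position = face_control_signal([(100, 50, 80, 80), (400, 60, 40, 40)], 640)
print(count, position)
```

The position could then drive something like a slider or a servo, but how it gets mapped to a control depends entirely on the application.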

Haar cascades are actually created using the AdaBoost classifier. They are first constructed by putting together a training set of many examples of the object we want to identify, for example, many different human faces. Then AdaBoost applies a particular kind of weak learner that simply compares pixels in one region with pixels in one or two other regions.

Through many iterations in which different variations of weak learners are tried, AdaBoost ends up learning which types of patterns are useful for identifying the object. For example, it’ll learn that a weak learner that looks for a narrow, darker stripe on top of a brighter, lower stripe tends to be useful for identifying faces, because our deeper-set eyes tend to have more shadows than our cheeks.


Likewise, it’ll learn that a weak learner that looks for a block of brighter pixels to the left of a block of darker pixels won’t be too helpful, since people don’t tend to have one half of their face darker than the other.
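Here is a small sketch of how AdaBoost could end up preferring the eye-stripe pattern over the left-right pattern. It is a bare-bones AdaBoost with threshold "stumps" as weak learners, written in plain Python. The eight samples are invented: the first number stands for an eye-stripe contrast and the second for a left-right contrast. Real Haar cascade training works on thousands of features and images, so treat this only as an illustration of the selection step.

```python
import math

# Invented toy data: (eye_stripe_contrast, left_right_contrast).
# Label +1 means face, -1 means not a face.
X = [(0.8, 0.1), (0.7, -0.2), (0.9, 0.3), (0.2, -0.1),   # faces
     (0.1, 0.2), (-0.2, -0.3), (0.3, 0.0), (0.0, 0.25)]  # non-faces
y = [1, 1, 1, 1, -1, -1, -1, -1]


def stump_predict(x, feature, threshold, polarity):
    """Weak learner: compare a single feature value with a threshold."""
    return polarity if x[feature] > threshold else -polarity


def best_stump(X, y, weights):
    """Try every feature, threshold and polarity; keep the lowest weighted error."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({x[feature] for x in X}):
            for polarity in (1, -1):
                error = sum(w for x, label, w in zip(X, y, weights)
                            if stump_predict(x, feature, threshold, polarity) != label)
                if best is None or error < best[0]:
                    best = (error, feature, threshold, polarity)
    return best


def adaboost(X, y, rounds):
    """Return a list of (alpha, feature, threshold, polarity) stumps."""
    weights = [1 / len(X)] * len(X)
    ensemble = []
    for _ in range(rounds):
        error, feature, threshold, polarity = best_stump(X, y, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, feature, threshold, polarity))
        # Raise the weight of misclassified samples, lower the rest, then normalize.
        weights = [w * math.exp(-alpha * label * stump_predict(x, feature, threshold, polarity))
                   for x, label, w in zip(X, y, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of all the stumps."""
    score = sum(alpha * stump_predict(x, f, t, p) for alpha, f, t, p in ensemble)
    return 1 if score >= 0 else -1


ensemble = adaboost(X, y, rounds=2)
print([feature for _, feature, _, _ in ensemble])  # feature index chosen in each round
```

On this toy data, both rounds should pick feature 0 (the eye-stripe contrast), because no threshold on the left-right contrast separates faces from non-faces nearly as well.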

We could of course train our own Haar cascade to recognize different objects if we had enough data points. Luckily, we can find pre-trained Haar cascades for faces and a few other human features on the internet, including some packaged for us in OpenCV.
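For instance, a minimal sketch of using the pre-trained frontal face cascade could look like this. It assumes the opencv-python package is installed, which exposes the bundled cascade files through cv2.data.haarcascades. The function name, the default parameter values, and the "people.jpg" file name are my own placeholders, not something from the OpenCV docs.

```python
FACE_CASCADE_FILE = "haarcascade_frontalface_default.xml"


def detect_faces(image_path, scale_factor=1.1, min_neighbors=5):
    """Return a list of (x, y, w, h) face boxes found in the image file."""
    # Imported here so the module still loads where opencv-python is not installed.
    import cv2

    # Load the pre-trained cascade that ships with the opencv-python package.
    cascade = cv2.CascadeClassifier(cv2.data.haarcascades + FACE_CASCADE_FILE)

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)

    # The cascade works on grayscale intensities.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=scale_factor,
                                     minNeighbors=min_neighbors)
    return [tuple(int(v) for v in box) for box in faces]


if __name__ == "__main__":
    # "people.jpg" is a placeholder; point this at any photo with faces in it.
    for box in detect_faces("people.jpg"):
        print(box)
```

Raising min_neighbors usually gives fewer false detections at the cost of missing some faces, so these values are a starting point to tune rather than recommended settings.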

It’s good to know that you’re still here. We’ve come to the end of day 040. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey. And until next time, remain legendary.


