Python + OpenCV Face Detection: Principles and Examples Explained

About OpenCV

OpenCV (Open Source Computer Vision Library) is an open-source computer vision library from Intel. It consists of a set of C functions and a small number of C++ classes, and implements many general-purpose algorithms in image processing and computer vision.

OpenCV provides a cross-platform mid- and high-level API of more than 300 C functions. It does not depend on other external libraries, although it can use some of them. OpenCV is free for both non-commercial and commercial use. It also provides hardware access, so it can read from a camera directly, and it includes a simple GUI (graphical user interface) system: highgui. We will use it to build a face detection program.

OpenCV's Python wrapper

OpenCV itself is written in C/C++. To use it from another language, we can wrap its dynamic link library. Fortunately, there are many such wrappers for Python; this article uses CVtypes.

In fact, many Python packages come from third parties. PIL (Python Imaging Library), for example, is an image processing package implemented in C and wrapped for Python. These wrappers let you use the underlying APIs just like Python's built-in functions.

Face Detection Principle

Face detection is a form of object detection, which mainly involves two steps:

1. Gather probability statistics on the target object to learn some of its features, and build a detection model from them.
2. Match the model against the input image. If there is a match, output the matched area; otherwise do nothing.

Computer Vision

A computer's visual system is quite different from the human eye, but there are similarities. The human eye sees an object when light reflected from it stimulates the photoreceptor cells, and the optic nerve then forms an image of the object in the brain. What a computer sees through a camera is much simpler: a set of matrices of numbers. These numbers indicate the intensity of the light coming from the object; the camera's photoelectric elements convert the light signal into a digital signal and quantize it into a matrix.

Drawing the conclusion "this is a face" from these numbers is a fairly complex matter. The physical world is colorful, and color images in a computer are generally built up from several color channels. An RGB image, for example, has a red channel (Red), a green channel (Green), and a blue channel (Blue). Each of the three channels is a grayscale image: if each point is represented by 8 bits, one channel can represent 2^8 = 256 gray levels. Combined, the three channels can represent 3*8 = 24 bits of color, which is what is commonly called 24-bit true color.
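The arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the helper names `pack_rgb` and `unpack_rgb` are made up for this example and are not part of OpenCV or CVtypes.

```python
# Bit depth of an RGB image with 8 bits per channel.
BITS_PER_CHANNEL = 8
CHANNELS = 3

levels_per_channel = 2 ** BITS_PER_CHANNEL    # gray levels in one channel
bits_per_pixel = CHANNELS * BITS_PER_CHANNEL  # bits for one color pixel
total_colors = 2 ** bits_per_pixel            # distinct 24-bit colors


def pack_rgb(r, g, b):
    """Pack three 8-bit channel values into one 24-bit integer (hypothetical helper)."""
    for value in (r, g, b):
        if not 0 <= value < levels_per_channel:
            raise ValueError("channel value out of range: %r" % (value,))
    return (r << 16) | (g << 8) | b


def unpack_rgb(pixel):
    """Split a 24-bit integer back into its (r, g, b) channel values."""
    return (pixel >> 16) & 0xFF, (pixel >> 8) & 0xFF, pixel & 0xFF
```

Here `levels_per_channel` is 256 and `bits_per_pixel` is 24, matching the figures in the text.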

Processing such images directly is a complex task, so the color image is first converted to a grayscale image. This reduces the amount of data (for an RGB image, to 1/3 of the original) and can also remove some noise. The contrast of the grayscale image is then increased, so that its dark parts become darker and its bright parts become brighter. After this treatment, the image is easier for the algorithm to recognize.
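A rough sketch of the grayscale conversion in plain Python follows. The article does not say which formula is used; the weights 0.299, 0.587 and 0.114 below are the commonly used luma weights and should be read as an assumption. The image representation (a list of rows of `(r, g, b)` tuples) and the function names are also chosen only for illustration.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples into rows of single gray values (0..255).

    Assumption: weighted sum with the common luma weights; the article does
    not state the exact formula.
    """
    gray = []
    for row in rgb_image:
        gray.append([int(round(0.299 * r + 0.587 * g + 0.114 * b))
                     for (r, g, b) in row])
    return gray


def value_count(image, values_per_pixel):
    """Number of stored values for an image with the given values per pixel."""
    return sum(len(row) for row in image) * values_per_pixel


rgb = [[(255, 255, 255), (0, 0, 0)],
       [(255, 0, 0), (0, 0, 255)]]
gray = to_grayscale(rgb)
```

For this 2x2 image the color version stores 12 values and the grayscale version stores 4, which is the 1/3 reduction mentioned above.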

Haar feature cascade table

OpenCV uses a cascade table of Haar features for object detection; the table contains boosted classifiers. First, a classifier is trained on the Haar features of a set of samples, which yields a cascade of boosted classifiers. Training uses two kinds of samples:

1. Positive samples, that is, samples of the target to be detected
2. Negative samples, that is, any other images

First, the sample images are scaled to the same size, which is called normalization, and then statistics are gathered from them. Once the classifier is built, it can be used to detect regions of interest in an input image. The input image is generally larger than the samples, so a search window is moved across it. To find targets of different sizes, the classifier can change its size proportionally, which may mean scanning the input image several times.
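Moving a search window across an image can be sketched as a simple generator. This is not how OpenCV implements the scan internally; the function name, the square window and the fixed `step` parameter are assumptions made for the example.

```python
def sliding_windows(image_width, image_height, window_size, step):
    """Yield (x, y, size) for every square window that fits inside the image.

    Hypothetical helper: windows are visited row by row, moving `step`
    pixels at a time.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    for y in range(0, image_height - window_size + 1, step):
        for x in range(0, image_width - window_size + 1, step):
            yield (x, y, window_size)


# A 4x4 image scanned with a 2x2 window and a step of 2 gives four windows.
positions = list(sliding_windows(4, 4, 2, 2))
```

Each yielded window would then be handed to the classifier. A window larger than the image yields nothing.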

What is a cascade classifier? It is a large classifier formed by chaining several simple classifiers; each window being examined passes through the classifiers in turn, and a window that passes all of them is judged to be a target region. For efficiency, the strictest classifier can be placed at the front of the cascade, which reduces the number of matches that have to be performed.

The basic classifier takes Haar features as input and outputs 0 or 1: 0 means no match, and 1 means a match.
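The cascade idea can be illustrated with a minimal sketch in which each stage is a function returning 0 or 1 and a window is rejected as soon as one stage returns 0. The two "features" used here (mean brightness and contrast) are stand-ins invented for the example, not real Haar features, and the thresholds are arbitrary.

```python
def make_threshold_stage(feature, threshold):
    """Build a basic classifier: 1 if feature(window) >= threshold, else 0."""
    def stage(window):
        return 1 if feature(window) >= threshold else 0
    return stage


def run_cascade(stages, window):
    """Pass the window through each stage in turn.

    Returns (is_target, stages_evaluated). Evaluation stops at the first
    stage that outputs 0, so rejected windows cost fewer evaluations.
    """
    evaluated = 0
    for stage in stages:
        evaluated += 1
        if stage(window) == 0:
            return False, evaluated
    return True, evaluated


# Hypothetical stand-in features on a flat list of gray values.
def mean_brightness(window):
    return sum(window) / float(len(window))


def contrast(window):
    return max(window) - min(window)


stages = [make_threshold_stage(mean_brightness, 50),
          make_threshold_stage(contrast, 100)]
```

With these stages, `run_cascade(stages, [10, 20, 30])` is rejected after a single evaluation, while `run_cascade(stages, [0, 100, 200])` passes both stages.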

Haar features

Edge features, four types
Line features, eight types
Center-surround features, two types

As mentioned before, an image in a computer is a matrix of numbers. Taking edge feature (a) as an example, when scanning the image to be detected the program first calculates the grayscale value x of the entire window, then the grayscale value y of the black part of the rectangle, and then the value (x - 2y). This value is compared with x; if the ratio falls within a certain range, the area currently being scanned matches edge feature (a), and scanning continues.
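The (x - 2y) calculation can be sketched directly on a small matrix. Several things here are assumptions: the layout of the feature (white left half, black right half), the function names, and the acceptance range, none of which the article specifies. Real implementations typically also avoid summing pixels one by one.

```python
def region_sum(gray, x, y, width, height):
    """Sum of the gray values inside a rectangle of a row-major matrix."""
    return sum(sum(row[x:x + width]) for row in gray[y:y + height])


def edge_feature(gray, x, y, width, height):
    """Return (total, total - 2 * black) for one window.

    Assumed layout: the right half of the window is the black rectangle.
    total corresponds to x in the text, black to y.
    """
    total = region_sum(gray, x, y, width, height)
    half = width // 2
    black = region_sum(gray, x + half, y, width - half, height)
    return total, total - 2 * black


def matches_edge_feature(gray, x, y, width, height, low, high):
    """True if (x - 2y) / x lies in [low, high]; the range is hypothetical."""
    total, value = edge_feature(gray, x, y, width, height)
    if total == 0:
        return False
    return low <= value / float(total) <= high


bright_left = [[200, 200, 10, 10],
               [200, 200, 10, 10]]
flat = [[100, 100, 100, 100],
        [100, 100, 100, 100]]
```

For `bright_left` the window sum is 840 and the black half sums to 40, so x - 2y is 760, a ratio of about 0.9. For `flat` the value is 0, so a range such as 0.5 to 1.0 accepts the first window and rejects the second.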

A more detailed description of this algorithm is beyond the scope of this article, and more information can be obtained from the reference resources.

Non-fixed size object detection

Because detection runs on a video stream, we are unlikely to know the size of the target in advance, so the cascade table classifier must be able to grow (or shrink) proportionally. When a small window has moved across the entire image without finding a target, we can adjust the size of the classifier and continue, until a target is detected or the window is comparable in size to the image.
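The sequence of window sizes tried during such a scan can be sketched as follows. The function name is hypothetical, and the rounding and stopping rule are simplifications chosen for the example; the default factor of 1.2 simply mirrors the scale factor passed to the detector later in this article.

```python
def window_sizes(min_size, image_width, image_height, scale_factor=1.2):
    """List the square window sizes to try, growing by scale_factor each pass.

    Stops once the window no longer fits inside the image.
    """
    if scale_factor <= 1.0:
        raise ValueError("scale_factor must be greater than 1")
    limit = min(image_width, image_height)
    sizes = []
    size = float(min_size)
    while int(round(size)) <= limit:
        rounded = int(round(size))
        if not sizes or rounded != sizes[-1]:  # skip duplicates after rounding
            sizes.append(rounded)
        size *= scale_factor
    return sizes


sizes = window_sizes(50, 100, 100, 1.2)
```

For a 100x100 image and a 50-pixel starting window this gives the sizes 50, 60, 72 and 86; the next size would be larger than the image.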

Step 1: Image preprocessing

After obtaining a frame (an image) from the camera, we first need to preprocess it:
1. Convert the image from RGB mode to grayscale
2. Perform histogram equalization on the grayscale image

These two steps are very simple in OpenCV:

image_size = cv.cvGetSize(image)  # Get the original image size
grayscale = cv.cvCreateImage(image_size, 8, 1)  # Create an empty 8-bit, single-channel image
cv.cvCvtColor(image, grayscale, cv.CV_BGR2GRAY)  # Convert to grayscale
storage = cv.cvCreateMemStorage(0)  # Create a new storage area for later use
cv.cvClearMemStorage(storage)
cv.cvEqualizeHist(grayscale, grayscale)  # Histogram equalization of the grayscale image
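To show what histogram equalization does to the pixel values, here is a plain-Python sketch of the textbook method based on the cumulative histogram. It is meant as an illustration only and is not guaranteed to reproduce the exact output of cvEqualizeHist; the function name and the list-of-rows image format are assumptions.

```python
def equalize_histogram(gray, levels=256):
    """Spread the gray values of an image over the full 0..levels-1 range.

    gray is a list of rows of integers in 0..levels-1.
    """
    pixels = [value for row in gray for value in row]
    total = len(pixels)
    if total == 0:
        return []

    # Histogram and cumulative histogram (CDF).
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Only one gray level is present; nothing to stretch.
        return [list(row) for row in gray]

    scale = (levels - 1) / float(total - cdf_min)
    lookup = [max(0, int(round((c - cdf_min) * scale))) for c in cdf]
    return [[lookup[value] for value in row] for row in gray]


low_contrast = [[100, 110],
                [120, 130]]
stretched = equalize_histogram(low_contrast)
```

The four closely spaced input values are spread out to 0, 85, 170 and 255, so the dark parts become darker and the bright parts brighter.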

Step 2: Detect and mark the target

In OpenCV, the face detection model is already available as an XML file that contains the training results of the classifier described above, so we can skip building the cascade table and simply load this file. With the cascade table loaded, we only need to pass the image to be detected and the cascade table to OpenCV's object detection algorithm to get back a set of detected faces.

# Detect objects
cascade = cv.cvLoadHaarClassifierCascade('haarcascade_frontalface_alt.xml',
                                         cv.cvSize(1, 1))
# 1.2 is the scale factor between scans, 2 is the minimum number of neighbors
faces = cv.cvHaarDetectObjects(grayscale, cascade, storage, 1.2, 2,
                               cv.CV_HAAR_DO_CANNY_PRUNING,
                               cv.cvSize(50, 50))  # Set the minimum face size to 50*50 pixels
if faces:
    print 'face detected here', cv.cvGetSize(grayscale)
    for i in faces:
        cv.cvRectangle(image, cv.cvPoint(int(i.x), int(i.y)),
                       cv.cvPoint(int(i.x + i.width), int(i.y + i.height)),
                       cv.CV_RGB(0, 255, 0), 1, 8, 0)  # Draw a green rectangle

Step 3: Draw the video window using highgui

highgui.cvNamedWindow('camera', highgui.CV_WINDOW_AUTOSIZE)
highgui.cvMoveWindow('camera', 50, 50)
highgui.cvShowImage('camera', detimg)

As can be seen, the OpenCV API is quite clear, and the Python wrapper keeps the code very concise. Let's look at the result of running the program:

Since the video stream is dynamic, the program's entry point can use an infinite loop. On each iteration, we read a frame from the video and pass it to the face detection module, which marks the frame (if it contains a face) and returns it. The main program then takes the returned frame and updates the display window.
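The structure of that main loop can be sketched independently of any particular library by passing the three steps in as functions. All four parameter names are hypothetical placeholders for the camera read, the detection module, the window update and an exit check. The two exit conditions are additions for the sketch, since the article itself describes a plain infinite loop.

```python
def run_detection_loop(read_frame, detect_faces, show_frame, should_stop):
    """Read, detect, display, repeat. Returns the number of frames processed.

    read_frame()   -> next frame, or None when the stream ends (assumption)
    detect_faces() -> the frame with any detected faces marked
    show_frame()   -> updates the display window
    should_stop()  -> True when the user asks to quit (assumption)
    """
    processed = 0
    while True:
        frame = read_frame()
        if frame is None:
            break
        marked = detect_faces(frame)
        show_frame(marked)
        processed += 1
        if should_stop():
            break
    return processed


# Stand-ins that make the sketch runnable without a camera.
frames = iter(["frame1", "frame2", "frame3"])
shown = []
count = run_detection_loop(lambda: next(frames, None),
                           lambda frame: frame + " (marked)",
                           shown.append,
                           lambda: False)
```

In the real program the stand-ins would be replaced by the capture, detection and highgui calls shown in the steps above.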

Other features of OpenCV

Laplacian edge detection

def laplaceTransform(image):
    laplace = None
    colorlaplace = None
    planes = [None, None, None]
    image_size = cv.cvGetSize(image)
    if not laplace:
        # Allocate one 8-bit plane per color channel plus the working buffers
        for i in range(len(planes)):
            planes[i] = cv.cvCreateImage(image_size, 8, 1)
        laplace = cv.cvCreateImage(image_size, cv.IPL_DEPTH_16S, 1)
        colorlaplace = cv.cvCreateImage(image_size, 8, 3)
    # Split the image into its three channels
    cv.cvSplit(image, planes[0], planes[1], planes[2], None)
    for plane in planes:
        cv.cvLaplace(plane, laplace, 3)  # Laplacian with aperture size 3
        cv.cvConvertScaleAbs(laplace, plane, 1, 0)  # Absolute value, back to 8 bits
    # Merge the processed channels back into one color image
    cv.cvMerge(planes[0], planes[1], planes[2], None, colorlaplace)
    colorlaplace.origin = image.origin
    return colorlaplace
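What happens to a single channel can be sketched in plain Python. This version uses the standard 4-neighbor Laplacian kernel, which is an assumption and not necessarily the exact kernel cvLaplace applies with aperture size 3. Border pixels are simply left at 0, and the absolute value is clamped to 255 to imitate the conversion back to 8 bits.

```python
def laplacian_abs(plane):
    """Apply a 4-neighbor Laplacian to a row-major gray plane.

    Returns a new plane holding min(255, |Laplacian|) for interior pixels
    and 0 on the border (a simplification for this sketch).
    """
    height = len(plane)
    width = len(plane[0]) if height else 0
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            value = (plane[y - 1][x] + plane[y + 1][x] +
                     plane[y][x - 1] + plane[y][x + 1] -
                     4 * plane[y][x])
            out[y][x] = min(255, abs(value))
    return out


flat = [[50, 50, 50],
        [50, 50, 50],
        [50, 50, 50]]
bump = [[10, 10, 10],
        [10, 20, 10],
        [10, 10, 10]]
```

A flat region gives 0 everywhere, while the single brighter pixel in `bump` produces a response of 40 at the center, which is why the operator highlights edges and isolated details.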

Result:

CVtypes comes with an example of a histogram of image color spaces:

Conclusion

OpenCV is very powerful and provides implementations of a large number of algorithms. The content covered in this article is only a small part of computer vision. Readers could label the collected faces to recognize specific individuals, or move face detection onto the network to achieve remote monitoring. Imagine making an originally lifeless machine appear to think through our own ideas and actions; that is very interesting in itself.
