OpenCV-Python: OCR of handwritten data using SVM | 56

Goal

In this chapter, we will revisit handwritten digit recognition, this time using SVM instead of kNN.

Recognizing handwritten digits

In kNN, we used the pixel intensities directly as the feature vector. This time we will use the Histogram of Oriented Gradients (HOG) as the feature vector.

Before computing the HOG, we deskew each image using its second-order moments. So we first define a function deskew(), which takes a digit image and deskews it. Here is the deskew() function:

def deskew(img):
    m = cv.moments(img)
    if abs(m['mu02']) < 1e-2:
        return img.copy()
    skew = m['mu11']/m['mu02']
    M = np.float32([[1, skew, -0.5*SZ*skew], [0, 1, 0]])
    img = cv.warpAffine(img,M,(SZ, SZ),flags=affine_flags)
    return img

The following figure shows the deskew function applied to an image of the digit zero. The left image is the original, and the right image is the deskewed result.
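To make the quantity that deskew() relies on more concrete, here is a minimal sketch that computes the central moments mu11 and mu02 directly in NumPy and derives the same skew ratio. The helper names central_moments and skew_of are hypothetical and only serve this illustration; the article itself reads these values from cv.moments.

```python
import numpy as np

def central_moments(img):
    """Return (mu11, mu02) of a 2-D intensity image (hypothetical helper)."""
    img = np.asarray(img, dtype=np.float64)
    ys, xs = np.mgrid[:img.shape[0], :img.shape[1]]
    m00 = img.sum()
    x_bar = (xs * img).sum() / m00          # intensity-weighted centroid, x
    y_bar = (ys * img).sum() / m00          # intensity-weighted centroid, y
    mu11 = ((xs - x_bar) * (ys - y_bar) * img).sum()
    mu02 = ((ys - y_bar) ** 2 * img).sum()
    return mu11, mu02

def skew_of(img):
    """Skew ratio mu11/mu02, with the same small-mu02 guard as deskew()."""
    mu11, mu02 = central_moments(img)
    if abs(mu02) < 1e-2:
        return 0.0
    return mu11 / mu02

# A vertical stroke has no slant; a diagonal stroke shifts one pixel in x per row.
upright = np.zeros((20, 20))
upright[2:18, 10] = 1.0
slanted = np.eye(20)

print(skew_of(upright), skew_of(slanted))
```

The ratio measures how far the stroke drifts horizontally per row, which is the shear that the affine matrix in deskew() is built to undo.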

Next, we have to find the HOG descriptor of each cell. For that, we compute the Sobel derivatives of each cell in the X and Y directions, and then find the magnitude and direction of the gradient at each pixel. The gradient direction is quantized to 16 integer values. The cell is divided into four sub-squares. For each sub-square, we calculate a 16-bin histogram of direction, weighted by magnitude. So each sub-square gives a vector of 16 values, and the four vectors (one per sub-square) together form a feature vector of 64 values. This is the feature vector we use to train on our data.

def hog(img):
    gx = cv.Sobel(img, cv.CV_32F, 1, 0)
    gy = cv.Sobel(img, cv.CV_32F, 0, 1)
    mag, ang = cv.cartToPolar(gx, gy)
    bins = np.int32(bin_n*ang/(2*np.pi))    # quantize angles to bin indices 0..bin_n-1
    bin_cells = bins[:10,:10], bins[10:,:10], bins[:10,10:], bins[10:,10:]
    mag_cells = mag[:10,:10], mag[10:,:10], mag[:10,10:], mag[10:,10:]
    hists = [np.bincount(b.ravel(), m.ravel(), bin_n) for b, m in zip(bin_cells, mag_cells)]
    hist = np.hstack(hists)     # hist is a 64-element vector (4 sub-squares x 16 bins)
    return hist
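The binning step can be tried in isolation. The sketch below skips the Sobel filtering and uses synthetic magnitude and angle arrays as stand-ins for the output of cv.cartToPolar (an assumption made purely for illustration), then applies the same quantization, quadrant split, and magnitude-weighted np.bincount as hog().

```python
import numpy as np

bin_n = 16  # number of orientation bins, as in the article

# Synthetic stand-ins for the gradient of one 20x20 cell (illustrative values).
ang = np.zeros((20, 20))          # gradient direction in radians, all 0 ...
mag = np.ones((20, 20))           # ... with unit magnitude
ang[10:, :10] = np.pi / 2         # bottom-left sub-square: direction pi/2
mag[10:, :10] = 2.0               # and twice the magnitude

# Quantize each angle in [0, 2*pi) to a bin index in 0..bin_n-1.
bins = (bin_n * ang / (2 * np.pi)).astype(np.int32)

# Same four 10x10 sub-squares, in the same order, as in hog().
bin_cells = bins[:10, :10], bins[10:, :10], bins[:10, 10:], bins[10:, 10:]
mag_cells = mag[:10, :10], mag[10:, :10], mag[:10, 10:], mag[10:, 10:]

# np.bincount sums the weights (magnitudes) that fall into each bin.
hists = [np.bincount(b.ravel(), weights=m.ravel(), minlength=bin_n)
         for b, m in zip(bin_cells, mag_cells)]
feature = np.hstack(hists)

print(feature.shape)
print(feature[0], feature[16 + 4])
```

Each group of 16 consecutive entries in feature belongs to one sub-square, so index 16 + 4 is bin 4 of the second sub-square.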

Finally, as in the previous chapter, we start by splitting the big dataset into individual cells. For every digit, 250 cells are reserved for training and the remaining 250 for testing. The complete code is as follows; you can also download it from here:

#!/usr/bin/env python
import cv2 as cv
import numpy as np
SZ=20
bin_n = 16 # Number of bins
affine_flags = cv.WARP_INVERSE_MAP|cv.INTER_LINEAR
def deskew(img):
    m = cv.moments(img)
    if abs(m['mu02']) < 1e-2:
        return img.copy()
    skew = m['mu11']/m['mu02']
    M = np.float32([[1, skew, -0.5*SZ*skew], [0, 1, 0]])
    img = cv.warpAffine(img,M,(SZ, SZ),flags=affine_flags)
    return img
def hog(img):
    gx = cv.Sobel(img, cv.CV_32F, 1, 0)
    gy = cv.Sobel(img, cv.CV_32F, 0, 1)
    mag, ang = cv.cartToPolar(gx, gy)
    bins = np.int32(bin_n*ang/(2*np.pi))    # quantize angles to bin indices 0..bin_n-1
    bin_cells = bins[:10,:10], bins[10:,:10], bins[:10,10:], bins[10:,10:]
    mag_cells = mag[:10,:10], mag[10:,:10], mag[:10,10:], mag[10:,10:]
    hists = [np.bincount(b.ravel(), m.ravel(), bin_n) for b, m in zip(bin_cells, mag_cells)]
    hist = np.hstack(hists)     # hist is a 64-element vector (4 sub-squares x 16 bins)
    return hist
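# Load the sheet of digits as a grayscale image
img = cv.imread('digits.png',0)
if img is None:
    raise Exception("we need the digits.png image from samples/data here !")
# Split the sheet into 50 rows x 100 columns of SZ x SZ cells
cells = [np.hsplit(row,100) for row in np.vsplit(img,50)]
# First half is trainData, remaining is testData
train_cells = [ i[:50] for i in cells ]
test_cells = [ i[50:] for i in cells]
# Training features: deskew every cell, then compute its HOG vector
deskewed = [list(map(deskew,row)) for row in train_cells]
hogdata = [list(map(hog,row)) for row in deskewed]
trainData = np.float32(hogdata).reshape(-1,bin_n*4)
# Labels: 250 samples for each digit 0..9, as a column vector
responses = np.repeat(np.arange(10),250)[:,np.newaxis]
# Configure and train a linear C-SVC
svm = cv.ml.SVM_create()
svm.setKernel(cv.ml.SVM_LINEAR)
svm.setType(cv.ml.SVM_C_SVC)
svm.setC(2.67)
svm.setGamma(5.383)    # gamma has no effect with the linear kernel
svm.train(trainData, cv.ml.ROW_SAMPLE, responses)
svm.save('svm_data.dat')
# Test features are built the same way as the training features
deskewed = [list(map(deskew,row)) for row in test_cells]
hogdata = [list(map(hog,row)) for row in deskewed]
testData = np.float32(hogdata).reshape(-1,bin_n*4)
# predict() returns (retval, results); results holds one label per row
result = svm.predict(testData)[1]
# Accuracy in percent (the test labels have the same layout as responses)
mask = result==responses
correct = np.count_nonzero(mask)
print(correct*100.0/result.size)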
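The split and the label vector are easier to follow with the shapes written out. The sketch below replaces digits.png with a blank placeholder array and assumes the sheet layout implied by the code above: 50 rows by 100 columns of 20x20 cells, with each digit occupying 5 consecutive rows.

```python
import numpy as np

SZ = 20
# Blank placeholder with the assumed layout of digits.png (not the real image).
img = np.zeros((50 * SZ, 100 * SZ), dtype=np.uint8)

# 50 row strips, each cut into 100 cells of SZ x SZ pixels.
cells = [np.hsplit(row, 100) for row in np.vsplit(img, 50)]

# Left half of every row for training, right half for testing.
train_cells = np.array([row[:50] for row in cells])
test_cells = np.array([row[50:] for row in cells])

# 250 labels per digit, as a column vector.
labels = np.repeat(np.arange(10), 250)[:, np.newaxis]

# Flattening row by row puts cell (r, c) at index r*50 + c. Under the assumed
# layout, row r shows digit r // 5, so the first label of each row must match.
row_digit = np.arange(50) // 5
labels_match_rows = np.array_equal(labels[::50, 0], row_digit)

print(train_cells.shape, test_cells.shape, labels.shape, labels_match_rows)
```

Because the training and test halves share this ordering, the same label vector can be reused to score the test predictions.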

This particular technique gives nearly 94% accuracy. You can try different values for the various SVM parameters to check whether higher accuracy is possible. Or you can read technical papers in this area and try to implement them.
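When comparing parameter settings, a single accuracy figure hides which digits are being confused. One possible aid, sketched here in plain NumPy, is a confusion matrix built from the true labels and the predicted labels. The function name confusion_matrix and the toy labels are hypothetical; in the program above the inputs would be responses and result.

```python
import numpy as np

def confusion_matrix(true_labels, predicted, n_classes=10):
    """cm[i, j] counts samples of true class i predicted as class j (hypothetical helper)."""
    t = np.asarray(true_labels).ravel().astype(np.int64)
    p = np.asarray(predicted).ravel().astype(np.int64)
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    np.add.at(cm, (t, p), 1)      # unbuffered add, so repeated (i, j) pairs accumulate
    return cm

# Toy data shaped like the article's arrays: integer column of labels,
# float32 column of predictions (as returned by svm.predict).
true_toy = np.array([0, 0, 1, 1, 2, 2])[:, np.newaxis]
pred_toy = np.float32([0, 1, 1, 1, 2, 0])[:, np.newaxis]

cm = confusion_matrix(true_toy, pred_toy, n_classes=3)
accuracy = np.trace(cm) / cm.sum()
per_class = np.diag(cm) / cm.sum(axis=1)

print(cm)
print(accuracy, per_class)
```

Off-diagonal entries show which pairs of digits are mixed up, which can suggest whether the features or the SVM parameters need attention.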

Additional resources

Histograms of Oriented Gradients Video: https://www.youtube.com/watch?v=0Zib1YEE4LU

Practice

The OpenCV samples include digits.py, which makes some improvements to the above method to get better results. It also contains the reference. Read it and make sure you understand it.

Welcome to pioneer AI blog: http://panchuang.net/

OpenCV official document in Chinese: http://woshicver.com/

Welcome to pioneer blog Resource Hub: http://docs.panchuang.net/

Posted by gurjit on Sun, 29 Mar 2020 00:05:02 -0700
