Training custom SVM to use with HOGDescriptor in OpenCV
I wrote a child class of CvSVM to extract the primal form after a linear SVM is trained. Positive samples are labeled 1 and negative samples are labeled -1. Strangely, I had to put a negative sign in front of the alphas and leave the sign of rho unchanged in order to get correct results from HOGDescriptor.
LinearSVM.h
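#ifndef LINEAR_SVM_H_
#define LINEAR_SVM_H_

#include <vector>
#include <opencv2/core/core.hpp>
#include <opencv2/ml/ml.hpp>

// Linear CvSVM that can export its primal form for HOGDescriptor.
class LinearSVM: public CvSVM {
public:
    // Fills support_vector with the primal weights followed by rho.
    void getSupportVector(std::vector<float>& support_vector) const;
};

#endif /* LINEAR_SVM_H_ */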
LinearSVM.cc
#include "LinearSVM.h"

void LinearSVM::getSupportVector(std::vector<float>& support_vector) const {
    int sv_count = get_support_vector_count();
    const CvSVMDecisionFunc* df = decision_func;
    const double* alphas = df[0].alpha;
    double rho = df[0].rho;
    int var_count = get_var_count();

    // Accumulate w = -sum_r alpha_r * sv_r (note the negated alphas).
    // assign() also clears any values the caller left in the vector.
    support_vector.assign(var_count, 0.0f);
    for (int r = 0; r < sv_count; r++) {
        float myalpha = static_cast<float>(alphas[r]);
        const float* v = get_support_vector(r);
        for (int j = 0; j < var_count; j++, v++) {
            support_vector[j] += (-myalpha) * (*v);
        }
    }
    // Append rho, sign unchanged, as the bias term.
    support_vector.push_back(static_cast<float>(rho));
}
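The accumulation in getSupportVector can be restated in a few lines of plain Python, which makes the sign handling easier to experiment with. This is only a sketch: the function name, the negate_alphas flag, and the toy numbers are assumptions made for illustration and are not part of OpenCV.

```python
def primal_from_support_vectors(support_vectors, alphas, rho, negate_alphas=True):
    """Collapse a linear SVM's support vectors into one weight vector.

    Returns [w_0, ..., w_{n-1}, bias]: the weights followed by one bias term,
    mirroring what the C++ code above hands to HOGDescriptor.
    negate_alphas is a hypothetical switch for the sign flip described above.
    """
    n = len(support_vectors[0])
    sign = -1.0 if negate_alphas else 1.0
    w = [0.0] * n
    for alpha, sv in zip(alphas, support_vectors):
        for j in range(n):
            w[j] += sign * alpha * sv[j]
    # rho is appended with its sign unchanged, as in the C++ version.
    return w + [rho]


# Toy data: two 3-dimensional support vectors (made-up values).
svs = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
alphas = [0.5, -0.25]
detector = primal_from_support_vectors(svs, alphas, rho=0.75)
print(detector)
```

If the detector fires on the wrong class, toggling negate_alphas is a quick way to test whether the sign convention is the culprit.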
OpenCV: how to use HOGDescriptor::detectMultiScale() with custom SVM?
I no longer have access to the original code.
To get around the issue, I wrote my own multiscale detector, which was less work than getting the primal SVM form.
My suggestion to people with similar issues now is to try upgrading to OpenCV 3.x.
OpenCV 3.4.1 Get Primal Form of Custom Trained Linear SVM HoG detectMultiScale
It turns out the answer is in the OpenCV sample train_HOG.cpp on GitHub.
It looks like this:
#include <cstring>
#include <vector>
#include <opencv2/core.hpp>
#include <opencv2/ml.hpp>

using namespace cv;
using namespace cv::ml;
using namespace std;

/// Get the SVM detector in HOG format
vector<float> getSVMDetector(const Ptr<SVM>& svm)
{
    // get the support vectors
    Mat sv = svm->getSupportVectors();
    const int sv_total = sv.rows;
    // get the decision function
    Mat alpha, svidx;
    double rho = svm->getDecisionFunction( 0, alpha, svidx );

    // a linear SVM is expected to expose a single compressed support vector
    CV_Assert( alpha.total() == 1 && svidx.total() == 1 && sv_total == 1 );
    CV_Assert( (alpha.type() == CV_64F && alpha.at<double>(0) == 1.) ||
               (alpha.type() == CV_32F && alpha.at<float>(0) == 1.f) );
    CV_Assert( sv.type() == CV_32F );

    // copy the weights, then append -rho as the bias term
    vector< float > hog_detector( sv.cols + 1 );
    memcpy( &hog_detector[0], sv.ptr(), sv.cols*sizeof( hog_detector[0] ) );
    hog_detector[sv.cols] = (float)-rho;
    return hog_detector;
}
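To see why the last element is -rho, it helps to look at how such a detector vector would be applied to one window. The sketch below assumes the usual linear form, a dot product of the weights with the window's descriptor plus the trailing bias, compared against a threshold. The function name and the numbers are made up for illustration, and this is not OpenCV's actual code.

```python
def hog_window_score(detector, descriptor):
    """Linear score for one window: dot(weights, descriptor) + bias."""
    weights, bias = detector[:-1], detector[-1]
    if len(weights) != len(descriptor):
        raise ValueError("descriptor length must match the weight count")
    return sum(w * x for w, x in zip(weights, descriptor)) + bias


# Made-up support vector and rho, packed the way getSVMDetector packs them.
support_vector = [0.5, -1.0, 2.0]
rho = 1.0
detector = support_vector + [-rho]

descriptor = [2.0, 1.0, 1.0]   # toy 3-element stand-in for a HOG descriptor
score = hog_window_score(detector, descriptor)
print(score, score > 0.0)      # the window counts as a hit above the threshold
```

With this layout the score equals the SVM decision value sv . x - rho, so a single threshold on the score reproduces the classifier's decision.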
How to train/use the HOGDescriptor class in OpenCV
A HOG descriptor is easy to implement, so you can write your own code to compute it. See http://smsoftdev-solutions.blogspot.com/2009/08/integral-histogram-for-fast-calculation.html.
It describes a fast implementation of HOG. Once you have the HOG features of all the training images, you can train an SVM in OpenCV. Training with a Gaussian kernel has produced good results.
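As a rough idea of what writing your own code involves, here is a minimal sketch of the core step: a gradient orientation histogram for a single cell. It is not the integral-histogram method from the link, and it leaves out the bin interpolation and block normalization that a full HOG uses. The function name, the 9 unsigned bins, and the toy cell are assumptions for illustration.

```python
import math


def cell_histogram(cell, bins=9):
    """Unsigned-gradient orientation histogram for one grayscale cell.

    `cell` is a list of rows of pixel intensities. Border pixels are skipped
    so that central differences are always defined.
    """
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, len(cell) - 1):
        for x in range(1, len(cell[0]) - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            # Fold the direction into [0, 180) so opposite gradients share a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The final "% bins" guards against rounding up to exactly 180.
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


# A 4x4 cell with a vertical edge: intensity jumps from 0 to 10, left to right.
cell = [[0, 0, 10, 10]] * 4
hist = cell_histogram(cell)
print(hist)
```

A full descriptor would concatenate such histograms over all cells in the window and normalize them in overlapping blocks before passing them to the SVM.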
Apply HOG+SVM Training to Webcam for Object Detection
You already have three of the most important pieces at your disposal. hoggify creates a list of HOG descriptors, one for each image. Note that the expected input for computing the descriptor is a grayscale image, and the descriptor is returned as a 2D array with 1 column, which means that each element in the HOG descriptor has its own row. However, you are using np.squeeze to remove the singleton column and produce a 1D numpy array instead, so we're fine here. You would then use list_to_matrix to convert the list into a numpy array. Once you do this, you can use svmClassify to finally train your data. This assumes that you already have your labels in a 1D numpy array. After you train your SVM, you would use the SVC.predict method, which takes input HOG features and classifies whether each image belongs to a chair or not.
Therefore, the steps you need to do are:
1. Use hoggify to create your list of HOG descriptors, one per image. It looks like the input x is a prefix to whatever you called your chair images, while z denotes the total number of images you want to load in. Remember that range is exclusive of the ending value, so you may want to add a + 1 after int(z) (i.e. int(z) + 1) to ensure that you include the end. I'm not sure if this is the case, but I wanted to throw it out there.
    x = '...' # Whatever prefix you called your chairs
    z = 100 # Load in 100 images for example
    lst = hoggify(x, z)
2. Convert the list of HOG descriptors into an actual matrix:
    data = list_to_matrix(lst)
3. Train your SVM classifier. Assuming you already have your labels stored in labels, where a value of 0 denotes not a chair and 1 denotes a chair, and it is a 1D numpy array:
    labels = ... # Define labels here as a numpy array
    clf = svmClassify(data, labels)
4. Use your SVM classifier to perform predictions. Assuming you have a test image you want to test with your classifier, you will need to apply the same processing steps as you did with your training data. I'm assuming that's what hoggify does, where you can specify a different x to denote different sets to use. Specify a new variable xtest to specify this different directory or prefix, as well as the number of images you need, then use hoggify combined with list_to_matrix to get your features:
    xtest = '...' # Define new test prefix here
    ztest = 50 # 50 test images
    lst_test = hoggify(xtest, ztest)
    test_data = list_to_matrix(lst_test)
    pred = clf.predict(test_data)
   pred will contain an array of predicted labels, one for each test image that you have. If you want, you can see how well your SVM did with the training data. Since you already have this at your disposal, just use data again from step #2:
    pred_training = clf.predict(data)
   pred_training will contain an array of predicted labels, one for each training image.
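To turn the predicted labels into a single number, one option is to compute the fraction that match the known labels. The sketch below uses plain Python lists and made-up values. The accuracy helper is hypothetical and not part of the code from the question.

```python
def accuracy(predicted, expected):
    """Fraction of predictions that match the expected labels."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("need at least one label")
    correct = sum(1 for p, e in zip(predicted, expected) if p == e)
    return correct / len(expected)


# Made-up example: 1 denotes a chair, 0 denotes not a chair.
labels = [1, 0, 1, 1, 0]
pred_training = [1, 0, 0, 1, 0]
print(accuracy(pred_training, labels))
```

A high score on the training data only shows that the SVM fits what it has already seen, so the same check on the test predictions is the more informative one.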
If you ultimately want to use this with a webcam, the process would be to use a VideoCapture object and specify the ID of the device that is connected to your computer. Usually there's only one webcam connected to your computer, so use the ID of 0. Once you do this, the process would be to use a loop, grab a frame, convert it to grayscale as HOG descriptors require a grayscale image, compute the descriptor, then classify the image.
Something like this would work, assuming that you've already trained your model and you've created a HOG descriptor object from before:
import cv2

# Assumes clf (the trained SVM) and hog (a cv2.HOGDescriptor) already exist
cap = cv2.VideoCapture(0)
dim = 128 # For HOG
while True:
    # Capture the frame
    ret, frame = cap.read()
    if not ret:
        break # Stop if the camera did not return a frame
    # Show the image on the screen
    cv2.imshow('Webcam', frame)
    # Convert the image to grayscale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Convert the image into a HOG descriptor
    gray = cv2.resize(gray, (dim, dim), interpolation = cv2.INTER_AREA)
    features = hog.compute(gray)
    features = features.reshape(1, -1) # Reshape so that the feature is in a single row
    # Predict the label
    pred = clf.predict(features)
    # Show the label on the screen
    print("The label of the image is: " + str(pred))
    # Pause for 25 ms and keep going until you push q on the keyboard
    if cv2.waitKey(25) == ord('q'):
        break
cap.release() # Release the camera resource
cv2.destroyAllWindows() # Close the image window
The above process reads in an image, displays it on the screen, converts the image to grayscale so we can compute its HOG descriptor, ensures that the data is in a single row compatible with the SVM you trained, and then predicts its label. We print this to the screen, and we wait for 25 ms before we read in the next frame so we don't overload your CPU. You can quit the program at any time by pushing the q key on your keyboard. Otherwise, this program will loop forever. Once we finish, we release the camera resource back to the computer so that it can be made available for other processes.