Plotting data from an svm fit - hyperplane
You wrote:
I used svm to find a hyperplane best fit regression
But according to:
Call:
svm(formula = q ~ ., data = data, kernel = "linear")
Parameters:
SVM-Type: C-classification
you are doing classification.
So, first of all decide what you need: to classify or to fit regression, from ?svm
, we see:
type: ‘svm’ can be used as a classification machine, as a
regression machine, or for novelty detection. Depending of
whether ‘y’ is a factor or not, the default setting for
‘type’ is ‘C-classification’ or ‘eps-regression’,
respectively, but may be overwritten by setting an explicit
value.
As I believe you didn't change the parameter type
from its default value, you are probably solving classification
, so, I will show how to visualize this for classification.
Let's assume there are 2
classes, generate some data:
> require(e1071) # for svm()
> require(rgl) # for 3d graphics.
> set.seed(12345)
> seed <- .Random.seed
> t <- data.frame(x=runif(100), y=runif(100), z=runif(100), cl=NA)
> t$cl <- 2 * t$x + 3 * t$y - 5 * t$z
> t$cl <- as.factor(ifelse(t$cl>0,1,-1))
> t[1:4,]
x y z cl
1 0.7209039 0.2944654 0.5885923 -1
2 0.8757732 0.6172537 0.8925918 -1
3 0.7609823 0.9742741 0.1237949 1
4 0.8861246 0.6182120 0.5133090 1
Since you want kernel='linear'
the boundary must be w1*x + w2*y + w3*z - w0
- hyperplane.
Our task divides to 2 subtasks: 1) to evaluate equation of this boundary plane 2) draw this plane.
1) Evaluating the equation of boundary plane
First, let's run svm()
:
> svm_model <- svm(cl~x+y+z, t, type='C-classification', kernel='linear',scale=FALSE)
I wrote here explicitly type=C-classification
just for emphasis we want do classification.scale=FALSE
means that we want svm()
to run directly with provided data without scaling data (as it does by default). I did it for future evaluations that become simpler.
Unfortunately, svm_model
doesn't store the equation of boundary plane (or just, normal vector of it), so we must evaluate it. From svm-algorithm we know that we can evaluate such weights with following formula:
w <- t(svm_model$coefs) %*% svm_model$SV
The negative intercept is stored in svm_model
, and accessed via svm_model$rho
.
2) Drawing plane.
I didn't find any helpful function plane3d
, so, again we should do some handy work. We just take grid of pairs (x,y)
and evaluate the appropriate value of z
of the boundary plane.
detalization <- 100
grid <- expand.grid(seq(from=min(t$x),to=max(t$x),length.out=detalization),
seq(from=min(t$y),to=max(t$y),length.out=detalization))
z <- (svm_model$rho- w[1,1]*grid[,1] - w[1,2]*grid[,2]) / w[1,3]
plot3d(grid[,1],grid[,2],z) # this will draw plane.
# adding of points to the graphics.
points3d(t$x[which(t$cl==-1)], t$y[which(t$cl==-1)], t$z[which(t$cl==-1)], col='red')
points3d(t$x[which(t$cl==1)], t$y[which(t$cl==1)], t$z[which(t$cl==1)], col='blue')
We did it with rgl
package, you can rotate this image and enjoy it :)
How to plot svm hyperplane with only one feature
It's an interesting problem. On the surface it's very simple — one feature means one dimension, hence the hyperplane has to be 0-dimensional, i.e. a point. Yet what scikit-learn gives you is a line. So the question is really how to turn this line into a point.
I've spent about an hour looking for answers in the documentation for scikit-learn, but there is simply nothing on 1-d SVM classifiers (probably because they are not practical). So I decided to play with the sample code below to see if I can figure out the answer:
from sklearn import svm
n_samples = 100
X = np.concatenate([np.random.normal(0,0.1,n_samples), np.random.normal(10,0.1,n_samples)]).reshape(-1,1)
y = np.array([0]*n_samples+[1]*n_samples)
clf = svm.LinearSVC(max_iter = 10000)
clf.fit(X,y)
slope = clf.coef_
intercept = clf.intercept_
print(slope, intercept)
print(-intercept/slope)
X
is the array of samples such that the first 100 points are sampled from N(0,0.1), and the next 100 points are sampled from N(10,0.1). y
is the array of labels (100 of class '0' and 100 of class '1'). Intuitively it's clear the hyperplane should be halfway between 0 and 10.
Once you fit the classifier, you find out the intercept is about -0.96 which is nowhere near where the 0-d hyperplane (i.e. a point) should be. However, if you take y=0
and back-calculate x
, it will be pretty close to 5. Now try changing the means of the distributions that make up X, and you will find that the answer is always -intercept/slope
. That's your 0-d hyperplane (point) for the classifier.
So to visualize, you merely need to plot your data on a number line (use different colours for the classes), and then plot the boundary obtained by dividing the negative intercept by the slope. I'm not sure how to plot a number line, but you can always resort to a scatter plot with all y
coordinates set to 0.
Plot hyperplane Linear SVM python
What about leaving the support_
out, which is not defined for a LinearSVC
?
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm
np.random.seed(0)
X = np.r_[np.random.randn(20, 2) - [2, 2], np.random.randn(20, 2) + [2, 2]]
Y = [0] * 20 + [1] * 20
fig, ax = plt.subplots()
clf2 = svm.LinearSVC(C=1).fit(X, Y)
# get the separating hyperplane
w = clf2.coef_[0]
a = -w[0] / w[1]
xx = np.linspace(-5, 5)
yy = a * xx - (clf2.intercept_[0]) / w[1]
# create a mesh to plot in
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx2, yy2 = np.meshgrid(np.arange(x_min, x_max, .2),
np.arange(y_min, y_max, .2))
Z = clf2.predict(np.c_[xx2.ravel(), yy2.ravel()])
Z = Z.reshape(xx2.shape)
ax.contourf(xx2, yy2, Z, cmap=plt.cm.coolwarm, alpha=0.3)
ax.scatter(X[:, 0], X[:, 1], c=Y, cmap=plt.cm.coolwarm, s=25)
ax.plot(xx,yy)
ax.axis([x_min, x_max,y_min, y_max])
plt.show()
How to plot a hyper plane in 3D for the SVM results?
Here is a function to plot 3D SVM results in MATLAB.
function [] = svm_3d_matlab_vis(svmStruct,Xdata,group)
sv = svmStruct.SupportVectors;
alphaHat = svmStruct.Alpha;
bias = svmStruct.Bias;
kfun = svmStruct.KernelFunction;
kfunargs = svmStruct.KernelFunctionArgs;
sh = svmStruct.ScaleData.shift; % shift vector
scalef = svmStruct.ScaleData.scaleFactor; % scale vector
group = group(~any(isnan(Xdata),2));
Xdata =Xdata(~any(isnan(Xdata),2),:); % remove rows with NaN
% scale and shift data
Xdata1 = repmat(scalef,size(Xdata,1),1).*(Xdata+repmat(sh,size(Xdata,1),1));
k = 50;
cubeXMin = min(Xdata1(:,1));
cubeYMin = min(Xdata1(:,2));
cubeZMin = min(Xdata1(:,3));
cubeXMax = max(Xdata1(:,1));
cubeYMax = max(Xdata1(:,2));
cubeZMax = max(Xdata1(:,3));
stepx = (cubeXMax-cubeXMin)/(k-1);
stepy = (cubeYMax-cubeYMin)/(k-1);
stepz = (cubeZMax-cubeZMin)/(k-1);
[x, y, z] = meshgrid(cubeXMin:stepx:cubeXMax,cubeYMin:stepy:cubeYMax,cubeZMin:stepz:cubeZMax);
mm = size(x);
x = x(:);
y = y(:);
z = z(:);
f = (feval(kfun,sv,[x y z],kfunargs{:})'*alphaHat(:)) + bias;
t = strcmp(group, group{1});
% unscale and unshift data
Xdata1 =(Xdata1./repmat(scalef,size(Xdata,1),1)) - repmat(sh,size(Xdata,1),1);
x =(x./repmat(scalef(1),size(x,1),1)) - repmat(sh(1),size(x,1),1);
y =(y./repmat(scalef(2),size(y,1),1)) - repmat(sh(2),size(y,1),1);
z =(z./repmat(scalef(3),size(z,1),1)) - repmat(sh(3),size(z,1),1);
figure
plot3(Xdata1(t, 1), Xdata1(t, 2), Xdata1(t, 3), 'b.');
hold on
plot3(Xdata1(~t, 1), Xdata1(~t, 2), Xdata1(~t, 3), 'r.');
hold on
% load unscaled support vectors for plotting
sv = svmStruct.SupportVectorIndices;
sv = [Xdata1(sv, :)];
plot3(sv(:, 1), sv(:, 2), sv(:, 3), 'go');
legend(group{1},group{end},'support vectors')
x0 = reshape(x, mm);
y0 = reshape(y, mm);
z0 = reshape(z, mm);
v0 = reshape(f, mm);
[faces,verts,colors] = isosurface(x0, y0, z0, v0, 0, x0);
patch('Vertices', verts, 'Faces', faces, 'FaceColor','k','edgecolor', 'none', 'FaceAlpha', 0.5);
grid on
box on
view(3)
hold off
end
Example plot:
% load data
load fisheriris;
% train svm using three features for two species
svmStruct = svmtrain(meas(1:100,1:3),species(1:100),'showplot','false','kernel_function','rbf',...
'boxconstraint',1,'kktviolationlevel',0.05,'tolkkt',5e-3);
% run function described above
svm_3d_matlab_vis(svmStruct,meas(1:100,1:3),species(1:100))
Related Topics
Efficient Alternatives to Merge for Larger Data.Frames R
Reading Hdf Files into R and Converting Them to Geotiff Rasters
How to Reference the Local Environment Within a Function, in R
Solving for the Inverse of a Function in R
Run a Bash Script from an R Script
Check If R Is Running in Rstudio
Generate Numbers with Specific Correlation
Regular Analysis Over Irregular Time Series
Creating R Package, Warning: Package '---' Was Built Under R Version 3.1.2
Working with Dictionaries/Lists to Get List of Keys
Best Practices for Storing and Using Data Frames Too Large for Memory
Marking Specific Tiles in Geom_Tile()/Geom_Raster()
How to Change the Resolution of a Raster Layer in R
Include Data Examples in Developing R Packages