How to plot predicted values vs the true value?
The problem is that your values span a very wide range, from about 0 to 60,000.
I would suggest two options:
Either convert both axes to a log scale:
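g = plt.scatter(y_test1, y_pred_test_Forestreg)
# note: a log scale cannot display values of 0 or below
g.axes.set_xscale('log')
g.axes.set_yscale('log')
g.axes.set_xlabel('True Values')
g.axes.set_ylabel('Predictions')
# 'square' gives an equal aspect ratio and the same range on both axes
g.axes.axis('square')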
Or, even better, plot the difference between the true and predicted values (i.e. the prediction errors):
g = plt.plot(y_test1 - y_pred_test_Forestreg, marker='o', linestyle='')
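For illustration, here is a minimal, self-contained sketch of both options side by side. The data is synthetic, and the names `plot_both`, `y_true`, and `y_pred` are hypothetical stand-ins for `y_test1` and `y_pred_test_Forestreg`. The identity line and the zero line are additions meant to make over- and under-prediction easier to spot.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_both(y_true, y_pred):
    """Draw a log-log scatter and a prediction-error plot next to each other."""
    fig, (ax_log, ax_res) = plt.subplots(ncols=2, figsize=(10, 5))

    # Option 1: true vs. predicted on log-log axes, with the identity line
    ax_log.scatter(y_true, y_pred, s=10)
    low = min(y_true.min(), y_pred.min())
    high = max(y_true.max(), y_pred.max())
    ax_log.plot([low, high], [low, high], color="black", linewidth=1)
    ax_log.set_xscale("log")
    ax_log.set_yscale("log")
    ax_log.set_xlabel("True Values")
    ax_log.set_ylabel("Predictions")

    # Option 2: prediction errors per sample, with a zero reference line
    ax_res.plot(y_true - y_pred, marker="o", linestyle="")
    ax_res.axhline(0, color="black", linewidth=1)
    ax_res.set_xlabel("Sample index")
    ax_res.set_ylabel("True - predicted")

    fig.tight_layout()
    return fig, ax_log, ax_res


# Synthetic, strictly positive data (assumption: stands in for the real arrays)
rng = np.random.default_rng(0)
y_true = rng.uniform(1, 60_000, size=200)
y_pred = y_true * rng.normal(1.0, 0.1, size=200)

fig, ax_log, ax_res = plot_both(y_true, y_pred)
```

Call `plt.show()` afterwards to display the figure.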
How to plot a graph of actual vs predict values in python?
The problem you seem to have is that you mix y_test and y_pred into one "plot" (meaning here a single call to the scatter() function).
With the scatter() or plot() function (which you also mixed up), the first parameter holds the coordinates on the x-axis and the second parameter holds the coordinates on the y-axis.
So 1.) you need one scatter() call with only y_test and then one with only y_pred. To do this you 2.) either need 2D data or, as seems to be the case for you, can simply use indexes for the x-axis via range().
Here is some code with random data that might get you started:
import matplotlib.pyplot as plt
import numpy as np

def plotGraph(y_test, y_pred, regressorName):
    # use the sample index as the x-axis for both series
    plt.scatter(range(len(y_test)), y_test, color='blue', label='actual')
    plt.scatter(range(len(y_pred)), y_pred, color='red', label='predicted')
    plt.title(regressorName)
    plt.legend()
    plt.show()

# random data for demonstration
y_test = range(10)
y_pred = np.random.randint(0, 10, 10)
plotGraph(y_test, y_pred, "test")
This produces a scatter plot with the true values in blue and the predicted values in red, each plotted against its index.
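As a possible variation on the index-based plot, the gap between each actual and predicted value can be drawn as a thin vertical line, which makes the per-sample error visible. This is only a sketch: the function name `plot_actual_vs_predicted` and the demo data are hypothetical.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_actual_vs_predicted(y_test, y_pred, title):
    """Scatter actual and predicted values against the sample index."""
    y_test = np.asarray(y_test)
    y_pred = np.asarray(y_pred)
    x = np.arange(len(y_test))

    fig, ax = plt.subplots()
    # vertical segment between each actual/predicted pair
    ax.vlines(x, np.minimum(y_test, y_pred), np.maximum(y_test, y_pred),
              color="gray", linewidth=1)
    ax.scatter(x, y_test, color="blue", label="actual")
    ax.scatter(x, y_pred, color="red", label="predicted")
    ax.set_xlabel("Sample index")
    ax.set_title(title)
    ax.legend()
    return fig, ax


# random demo data, as in the answer above
demo_test = np.arange(10)
demo_pred = np.random.default_rng(0).integers(0, 10, size=10)
fig, ax = plot_actual_vs_predicted(demo_test, demo_pred, "test")
```

Call `plt.show()` afterwards to display the figure.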
How to plot the predicted value against all features of a dataset
- Each feature must be plotted separately.
- Remember that 'price' is the target, the dependent variable, and that lin_reg.predict(xtrain) is the predicted price from the training data.
import matplotlib.pyplot as plt

# lin_reg, xtrain and ytrain come from the question
# predicted price from xtrain
ypred_train = lin_reg.predict(xtrain)

# create the figure
fig, axes = plt.subplots(ncols=4, nrows=4, figsize=(20, 20))

# flatten the axes to make it easier to index
axes = axes.flatten()

# iterate through the column values, and use i to index the axes
for i, v in enumerate(xtrain.columns):
    # select the column to be plotted
    data = xtrain[v]
    # plot the actual price against the features
    axes[i].scatter(x=data, y=ytrain, s=35, ec='white', label='actual')
    # plot predicted prices against the features
    axes[i].scatter(x=data, y=ypred_train, c='pink', s=20, ec='white', alpha=0.5, label='predicted')
    # set the title and ylabel
    axes[i].set(title=f'Feature: {v}', ylabel='price')

# set a single legend
axes[12].legend(title='Price', bbox_to_anchor=(1.05, 1), loc='upper left')

# delete the last 3 unused axes
for v in range(13, 16):
    fig.delaxes(axes[v])
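The code above hard-codes a 4 x 4 grid for 13 features. As a rough sketch of how the same idea could adapt to any number of features, the grid size can be derived from the column count. The function `plot_per_feature` and the demo data are hypothetical, and the synthetic "predictions" are just the target plus noise, not the output of a fitted model.

```python
import math

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_per_feature(X, y, y_pred, ncols=4, target="price"):
    """Plot actual and predicted target values against every column of X."""
    n = X.shape[1]
    nrows = math.ceil(n / ncols)
    fig, axes = plt.subplots(nrows=nrows, ncols=ncols,
                             figsize=(5 * ncols, 5 * nrows), squeeze=False)
    axes = axes.flatten()

    for ax, col in zip(axes, X.columns):
        ax.scatter(X[col], y, s=35, edgecolors="white", label="actual")
        ax.scatter(X[col], y_pred, c="pink", s=20, edgecolors="white",
                   alpha=0.5, label="predicted")
        ax.set(title=f"Feature: {col}", ylabel=target)

    # remove whatever axes are left over in the last row
    for ax in axes[n:]:
        fig.delaxes(ax)

    axes[0].legend(title=target.capitalize())
    return fig


# synthetic stand-ins for xtrain, ytrain and ypred_train
rng = np.random.default_rng(0)
X_demo = pd.DataFrame(rng.normal(size=(50, 5)), columns=list("abcde"))
y_demo = X_demo.sum(axis=1)
y_pred_demo = y_demo + rng.normal(scale=0.3, size=50)

fig = plot_per_feature(X_demo, y_demo, y_pred_demo)
```

With five features and `ncols=4`, this creates a 2 x 4 grid and removes the three unused axes.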
- If you were to plot everything into a single plot, it would be overcrowded and useless.
- You can also plot all the data with seaborn.relplot by melting df1 from a wide to a long format.
  - However, it's more difficult to add the predicted values on top of a figure-level plot.
import seaborn as sns

dfm = df1.melt(id_vars='price', value_vars=df1.columns[:-1], var_name='Feature')

p = sns.relplot(kind='scatter', data=dfm, x='value', y='price', height=3,
                col='Feature', col_wrap=4, facet_kws={'sharex': False})
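One possible way around the difficulty of overlaying predictions on a figure-level plot is to reshape the data so that actual and predicted prices share a single column, and then let `hue` separate them. The sketch below is an assumption about how this could be done, not part of the original answer: `to_long_format`, `plot_long_format`, and the toy data are hypothetical names. seaborn is imported inside the plotting function so that the reshaping step can be tried without it.

```python
import pandas as pd


def to_long_format(df, predicted, target="price"):
    """Melt a wide frame so actual and predicted targets share one column."""
    features = [c for c in df.columns if c != target]
    wide = df.rename(columns={target: "actual"}).assign(predicted=list(predicted))
    # first melt: one row per (sample, feature)
    per_feature = wide.melt(id_vars=["actual", "predicted"], value_vars=features,
                            var_name="Feature", value_name="value")
    # second melt: one row per (sample, feature, actual-or-predicted)
    return per_feature.melt(id_vars=["Feature", "value"],
                            value_vars=["actual", "predicted"],
                            var_name="kind", value_name=target)


def plot_long_format(dfl, target="price"):
    """Draw one facet per feature, colored by actual vs. predicted."""
    import seaborn as sns  # imported here so reshaping works without seaborn
    return sns.relplot(kind="scatter", data=dfl, x="value", y=target,
                       hue="kind", col="Feature", col_wrap=4, height=3,
                       facet_kws={"sharex": False})


# toy data (assumption: stands in for df1 and the model's predictions)
example = pd.DataFrame({"rooms": [2, 3, 4],
                        "area": [50, 70, 90],
                        "price": [100, 150, 200]})
long_df = to_long_format(example, predicted=[110, 140, 210])
```

Passing `long_df` to `plot_long_format` should then draw one facet per feature with both series in it.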