Preprocessing in scikit learn - single sample - Depreciation warning
Just listen to what the warning is telling you:
Reshape your data either using X.reshape(-1, 1) if your data has a single feature/column
or X.reshape(1, -1) if it contains a single sample.
For your example (a single sample with more than one feature/column), type:
temp = temp.reshape(1,-1)
For one feature/column:
temp = temp.reshape(-1,1)
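To see what each reshape does, here is a minimal sketch using NumPy only. The values in temp are made up for illustration; only the resulting shapes matter.

```python
import numpy as np

# Hypothetical 1D data: three numbers with no explicit sample/feature axis
temp = np.array([3.0, 5.0, 7.0])

# Interpret the numbers as ONE sample with three features
one_sample = temp.reshape(1, -1)

# Interpret the numbers as THREE samples with one feature each
one_feature = temp.reshape(-1, 1)

print(temp.shape)         # (3,)
print(one_sample.shape)   # (1, 3)
print(one_feature.shape)  # (3, 1)
```

The -1 tells NumPy to infer that dimension from the length of the array, so the same call works for any number of values.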
warning message in scikit-learn
The input to clf.predict should be a 2D array. Thus, instead of writing
print(clf.predict([0,1]))
you need to write
print(clf.predict([[0,1]]))
Accuracy of preprocessing single sample
You should use StandardScaler, which is a wrapper over the scale function. This wrapper stores the mean and standard deviation learned from the training data and then uses this information to scale other data.
Example usage:
import numpy as np
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
# Learn the mean and standard deviation from the training data, then scale it
trainData = scaler.fit_transform(trainData)
# reshape is used here only because log is a single sample; for a 2D array of samples it's not needed
log = scaler.transform(np.reshape(log, (1, -1)))
fit_transform() is just a shortcut for first calling fit() and then transform().
fit() does not return any transformed data (it returns the fitted scaler itself). It just analyses the data to learn the mean and standard deviation. transform() uses the learnt mean and standard deviation to scale the data and returns the new data.
You should only call fit() or fit_transform() on the training data, never on anything else. For transforming the test or new data, always use transform().
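The fit/transform split can be checked with a small, self-contained sketch. The arrays train_data and new_sample are made-up stand-ins for the training data and the single sample; mean_ is the attribute where StandardScaler keeps the learned per-feature means.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical training data: 3 samples, 2 features
train_data = np.array([[1.0, 10.0],
                       [2.0, 20.0],
                       [3.0, 30.0]])
# Hypothetical single new sample, still 1D at this point
new_sample = np.array([2.0, 40.0])

scaler = StandardScaler()
# fit_transform learns mean/std from the training data and scales it
train_scaled = scaler.fit_transform(train_data)
print(scaler.mean_)  # per-feature means learned from train_data

# transform reuses the stored statistics; the sample must be 2D (1, n_features)
new_scaled = scaler.transform(new_sample.reshape(1, -1))
print(new_scaled.shape)  # (1, 2)
```

Because the new sample is scaled with the training statistics, its first feature (equal to the training mean) maps to 0, while the second lands well above 0.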
Sklearn train model with single sample raises a DeprecationWarning
If you read the warning message, you can see that passing one-dimensional arrays will soon not be supported. Instead, you have to ensure that your single sample looks like a list of samples in which there is just one. When dealing with NumPy arrays (which is recommended), you can use reshape(1, -1);
however, as you're using lists, the following will do:
clf = clf.fit([[130, 1]], [0])
Getting a weird error that says 'Reshape your data either using array.reshape(-1, 1)'
Any scikit-learn transformer expects an array of shape [sample size, n_features]. So there are two scenarios in which you will have to reshape your data:
- If you only have a single sample, you need to reshape it to [1, n_features] sized array
- If you have only a single feature, you need to reshape it to [sample size, 1] sized array
So you need to do what suits the problem. You are passing a 1D vector.
[1. 1. 1. ... 8. 1. 1.]
If this is a single sample, reshape it with reshape(1, -1) and you will be fine. That said, you might want to think about the following.
- If this is a single sample, there's no point in fitting a model with a single sample. You won't get any benefit.
- If this is a set of samples with a single feature, I don't really see a benefit in doing K-means on such a dataset.
Getting deprecation warning in Sklearn over 1d array, despite not having a 1D array
The warning is coming from the predict method. NumPy will interpret [1,1] as a 1D array. So this should avoid the warning:
clf.predict(np.array([[1,1]]))
Notice that:
In [14]: p1 = np.array([1,1])
In [15]: p1.shape
Out[15]: (2,)
In [16]: p2 = np.array([[1,1]])
In [17]: p2.shape
Out[17]: (1, 2)
Also, note that you can't use an array of shape (2,1), because predict would treat it as two samples with one feature each:
In [21]: p3 = np.array([[1],[1]])
In [22]: p3.shape
Out[22]: (2, 1)
In [23]: clf.predict(p3)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-23-e4070c037d78> in <module>()
----> 1 clf.predict(p3)
/home/juan/anaconda3/lib/python3.5/site-packages/sklearn/svm/base.py in predict(self, X)
566 Class labels for samples in X.
567 """
--> 568 y = super(BaseSVC, self).predict(X)
569 return self.classes_.take(np.asarray(y, dtype=np.intp))
570
/home/juan/anaconda3/lib/python3.5/site-packages/sklearn/svm/base.py in predict(self, X)
303 y_pred : array, shape (n_samples,)
304 """
--> 305 X = self._validate_for_predict(X)
306 predict = self._sparse_predict if self._sparse else self._dense_predict
307 return predict(X)
/home/juan/anaconda3/lib/python3.5/site-packages/sklearn/svm/base.py in _validate_for_predict(self, X)
472 raise ValueError("X.shape[1] = %d should be equal to %d, "
473 "the number of features at training time" %
--> 474 (n_features, self.shape_fit_[1]))
475 return X
476
ValueError: X.shape[1] = 1 should be equal to 2, the number of features at training time
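The behaviour above can be reproduced with a short sketch. It assumes an SVC classifier (the traceback points at sklearn/svm) trained on made-up two-feature data; X_train and y_train are hypothetical, and the exact wording of the error message depends on the scikit-learn version.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical training set: 4 samples, 2 features
X_train = np.array([[0, 0], [1, 1], [0, 1], [1, 0]])
y_train = np.array([0, 1, 0, 1])
clf = SVC().fit(X_train, y_train)

# Shape (1, 2): one sample with two features, which matches the training data
p2 = np.array([[1, 1]])
print(clf.predict(p2).shape)  # (1,)

# Shape (2, 1): two samples with one feature each, which does not match
p3 = np.array([[1], [1]])
try:
    clf.predict(p3)
except ValueError as exc:
    print("ValueError:", exc)
```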
Scikit-learn tutorial gives me a depreciation error, how to update?
Try the following:
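print("A 12-inch pizza should cost: $%.2f" % model.predict(np.array([12]).reshape(1, -1))[0])
I used reshape(1, -1) to pass a 2D array to the predict function. Note that the [0] index belongs on the result of predict, not on the reshaped input; indexing the input would turn it back into a 1D array.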
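For a version that runs on its own, here is a minimal sketch. It assumes model is a LinearRegression and uses made-up diameters and prices (not the tutorial's data), chosen so that the price is exactly 1.5 times the diameter.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training data: pizza diameters (inches) and prices (dollars)
diameters = np.array([[6], [8], [10]])   # shape (3, 1): 3 samples, 1 feature
prices = np.array([9.0, 12.0, 15.0])     # price = 1.5 * diameter

model = LinearRegression().fit(diameters, prices)

# One sample with one feature must have shape (1, 1)
query = np.array([12]).reshape(1, -1)
predicted_price = model.predict(query)[0]

print("A 12-inch pizza should cost: $%.2f" % predicted_price)
```

With a single feature, reshape(1, -1) and reshape(-1, 1) both give shape (1, 1) for one value, so either works here; reshape(1, -1) states the intent "one sample" more directly.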