KNN in R: 'train and class have different lengths'?
Without access to the data, it is hard to say for certain, but I suspect that train_labels needs to be a vector rather than a data frame. Try:
cl = train_labels[,1]
knn(train_points, test_points, cl, k = 5)
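Also double-check that the number of rows in train_points matches the length of cl, and that train_points and test_points have the same number of columns: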
dim(train_points)
dim(test_points)
length(cl)
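As a minimal sketch of those checks, the snippet below uses the built-in iris data as a stand-in, since the original data is not available. The object names follow the answer above, but the split and the seed are assumptions made purely for illustration.

```r
library(class)

set.seed(1)                                   # arbitrary seed, for reproducibility only
idx <- sample(nrow(iris), 100)                # assumed 100/50 train/test split

train_points <- iris[idx, 1:4]                # numeric predictors only
test_points  <- iris[-idx, 1:4]
train_labels <- iris[idx, 5, drop = FALSE]    # one-column data.frame, as suspected in the question

cl <- train_labels[, 1]                       # extract the column as a factor vector

# The three checks from the answer
dim(train_points)                             # rows must match length(cl)
dim(test_points)                              # columns must match train_points
length(cl)

pred <- knn(train_points, test_points, cl, k = 5)
```

If train_labels happens to be a tibble rather than a base data.frame, `train_labels[, 1]` stays a tibble, so `train_labels[[1]]` is the safer way to get a plain vector.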
knn train and class have different lengths
The knn function in R requires the training data to contain only the independent variables; the dependent variable is passed separately through the "cl" parameter. Changing the call as shown below fixes this particular error.
pred_knn <- knn(train=df[,c(1,2)], test=df_new, cl=df$golf, k=1)
However, note that running the line above will throw a different error. Because knn calculates the Euclidean distance between observations, all independent variables must be numeric. The pages below have helpful related information. I would recommend using a different classifier for this particular data set.
https://towardsdatascience.com/k-nearest-neighbors-algorithm-with-examples-in-r-simply-explained-knn-1f2c88da405c
https://discuss.analyticsvidhya.com/t/how-to-resolve-error-na-nan-inf-in-foreign-function-call-arg-6-in-knn/7280/4
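If you still want to try knn on categorical predictors, one common workaround (not part of the answer above, and not necessarily a good fit for this data) is to dummy-encode the factors first. The sketch below uses a made-up toy data frame: only the golf column name comes from the question, and the outlook and windy columns are assumptions.

```r
library(class)

# Hypothetical toy data; only `golf` is a name taken from the question
df <- data.frame(
  outlook = factor(c("sunny", "rain", "overcast", "sunny", "rain", "overcast")),
  windy   = factor(c("yes", "no", "no", "no", "yes", "yes")),
  golf    = factor(c("no", "yes", "yes", "yes", "no", "yes"))
)

# New observation, using the same factor levels as the training data
df_new <- data.frame(
  outlook = factor("sunny", levels = levels(df$outlook)),
  windy   = factor("no",    levels = levels(df$windy))
)

# Dummy-encode the factors into a numeric matrix; [, -1] drops the intercept column
x_train <- model.matrix(~ outlook + windy, data = df)[, -1]
x_new   <- model.matrix(~ outlook + windy, data = df_new)[, -1, drop = FALSE]

# Predictors are now numeric, and cl is a vector with one label per training row
pred_knn <- knn(train = x_train, test = x_new, cl = df$golf, k = 1)
```

Euclidean distance on dummy variables is a crude similarity measure, which is one reason a different classifier may still be the better choice here.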
Hope this helps.
'train' and 'class' have different lengths error in R
Your cl variable is not the same length as your train variable. MLValidY only has 74 observations, while TrainXNormDF has 224. cl should provide the true classification for every row in your training set.
Furthermore, cl is a data.frame instead of a vector.
Try the following:
NN <- knn(train = TrainXNormDF,
          test = ValidXNormDF,
          cl = MLTrainY$`MLTrain[, 9]`,
          k = 3)
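Both problems can be caught before calling knn. The helper below, check_knn_inputs, is a hypothetical function written for illustration, not part of the class package. The data frames are random stand-ins that borrow only the names and row counts (224 and 74) mentioned above.

```r
# Hypothetical helper: stops with a readable message if cl would make knn fail
check_knn_inputs <- function(train, cl) {
  if (is.data.frame(cl)) {
    stop("cl is a data.frame; pass a single column instead, e.g. cl[[1]]")
  }
  if (length(cl) != nrow(train)) {
    stop(sprintf("train has %d rows but cl has %d labels", nrow(train), length(cl)))
  }
  invisible(TRUE)
}

# Random stand-ins with the sizes mentioned in the answer (assumed structure)
set.seed(1)
TrainXNormDF <- data.frame(x1 = runif(224), x2 = runif(224))
MLTrainY <- data.frame(label = factor(sample(c("a", "b"), 224, replace = TRUE)))
MLValidY <- data.frame(label = factor(sample(c("a", "b"), 74, replace = TRUE)))

cl <- MLTrainY[[1]]                    # vector of training labels, one per training row
check_knn_inputs(TrainXNormDF, cl)     # passes silently

# Validation labels have the wrong length, so the check reports the mismatch
msg <- tryCatch(check_knn_inputs(TrainXNormDF, MLValidY[[1]]),
                error = function(e) conditionMessage(e))
```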
Why do we get error saying train and class have different lengths while using Knn function in R?
The class parameter should be provided as a vector, not as a dataframe. Referring to the Diagnosis variable in your wbcd_train_labels dataframe should work:
wbcd_test_pred <- knn(train = wbcd_train, test = wbcd_test,
                      cl = wbcd_train_labels$Diagnosis, ...)
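The reason the dataframe fails is that length() of a data frame counts its columns, not its rows, so a one-column label dataframe has length 1. A small sketch with a made-up four-row wbcd_train_labels (the values are assumptions) shows the difference between the extraction forms:

```r
# Hypothetical four-row stand-in for the label data frame
wbcd_train_labels <- data.frame(Diagnosis = factor(c("B", "M", "B", "M")))

as_df  <- wbcd_train_labels["Diagnosis"]     # single brackets: still a data.frame
as_vec <- wbcd_train_labels$Diagnosis        # $ returns the column as a factor vector
same   <- wbcd_train_labels[["Diagnosis"]]   # [[ ]] returns the same vector

length(as_df)    # number of columns
length(as_vec)   # number of labels, which is what knn compares with nrow(train)
```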