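I have been trying to use the knn function to generate my predictions, but when I run the code it throws an error:
Error in knn(data.frame(tr5_train), data.frame(tr5_test), cl = pred_train_labels, : 'train' and 'class' have different lengths
I have checked that all the datasets are data.frames, and I tried using the labels as a vector, without success.
Here is the code I used: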
test_tr5_no_target<- test_tr5[-2]
tr5_train<- test_tr5_no_target[1:74475, , drop = FALSE]
tr5_test<- test_tr5_no_target[74476:93094, , drop = FALSE]
pred_train_labels<- test_tr5[1:74475, 2]
pred_test_labels<- test_tr5[74476:93094, 2]
#install.packages("class")
library(class)
##ensure all data is a dataframe
as.data.frame(tr5_train)
as.data.frame(tr5_test)
as.data.frame(pred_train_labels)
pred1<- knn(data.frame(tr5_train), data.frame(tr5_test), cl = pred_train_labels, k = 5)
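Keep in mind that the label, column 2, is a numeric target feature. I have searched everywhere and cannot find what triggers this error. Is there anything I might be doing wrong?
Thanks for all the help, much appreciated! (Unfortunately, I can't share the data itself because of restrictions.)
-Jose C.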
To answer your question directly: you want your labels (here, pred_train_labels) as a vector, not as a data frame. We can recreate your error using the mtcars dataset.
library('class')
set.seed(1)
x <- mtcars
size <- floor(0.75 * nrow(x)) #use 75% of the rows for training
train_ind <- sample(seq_len(nrow(x)), size = size)
train <- x[train_ind, ]
test <- x[-train_ind, ]
label <- as.data.frame(x[1][train_ind, ]) #problem is here: label is a data frame
test <- knn(train,test,cl = label, k = 5)
test
Error in knn(train, test, cl = label, k = 5) :
'train' and 'class' have different lengths
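As far as I can tell, the message comes from knn comparing length(cl) with the number of rows in train, and length() of a data frame counts its columns, not its rows. Here is a minimal sketch of that difference using only base R; the names labels_vec and labels_df are illustrative and do not come from the question.

```r
# Illustrative values only; 'labels_vec' and 'labels_df' are hypothetical names.
labels_vec <- mtcars$mpg[1:24]            # plain numeric vector
labels_df  <- as.data.frame(labels_vec)   # one-column data frame

length(labels_vec)  # 24: number of elements
length(labels_df)   # 1: number of columns, not rows
nrow(labels_df)     # 24: number of rows

# A quick sanity check before calling knn(): labels should be an atomic vector.
is.atomic(labels_vec)  # TRUE
is.atomic(labels_df)   # FALSE
```

So a one-column data frame with 24 rows looks like a label vector of length 1, which cannot match a 24-row training set.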
By letting the label be a vector and then calling attributes on the new knn object, we get an output:
train_ind <- sample(seq_len(nrow(x)), size = size)
train <- x[train_ind, ]
test <- x[-train_ind, ]
label <- x[1][train_ind, ] #NOT a dataframe
test <- knn(train,test,cl = label, k = 5, prob = TRUE)
attributes(test)
$`levels`
[1] "10.4" "14.3" "14.7" "15" "15.2" "15.8" "16.4" "17.3"
[9] "17.8" "18.7" "19.2" "19.7" "21" "21.4" "22.8" "24.4"
[17] "26" "30.4" "32.4"
Exploring the examples in ??knn shows this as well.
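Applied to the code in the question, the fix might look like the sketch below. Since the real data can't be shared, test_tr5 here is a made-up stand-in with the target in column 2, and the row ranges are shortened to match. It also rests on a guess: test_tr5[1:74475, 2] would already be a vector for a base data.frame, so the error suggests test_tr5 may be a tibble or similar, where that expression stays a one-column table. Extracting with [[2]] returns a vector in both cases.

```r
library(class)

# Hypothetical stand-in for test_tr5 (the real data is not available);
# column 2 plays the role of the numeric target.
set.seed(42)
test_tr5 <- data.frame(f1 = rnorm(100),
                       target = sample(c(0, 1), 100, replace = TRUE),
                       f2 = rnorm(100))

train_rows <- 1:80     # stands in for 1:74475
test_rows  <- 81:100   # stands in for 74476:93094

features  <- test_tr5[-2]                            # drop the target column
tr5_train <- features[train_rows, , drop = FALSE]
tr5_test  <- features[test_rows, , drop = FALSE]

# [[2]] returns the column itself, so the labels are a plain vector
# whether test_tr5 is a data.frame or a tibble.
pred_train_labels <- test_tr5[[2]][train_rows]
pred_test_labels  <- test_tr5[[2]][test_rows]

# The same condition knn() checks internally.
stopifnot(length(pred_train_labels) == nrow(tr5_train))

# knn() converts cl to a factor, so numeric labels are treated as classes.
pred1 <- knn(tr5_train, tr5_test, cl = pred_train_labels, k = 5)
length(pred1)  # one prediction per test row
```

Note that knn from the class package is a classifier: each distinct numeric label becomes its own class, which is why the levels in the output above are mpg values printed as strings.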