Supervised classification: plotting K-NN accuracy for different sample sizes and values of k

Question (votes: 0, answers: 2)

I hope you understand that it is hard to reproduce something like this on a generic dataset.

Basically, what I am trying to do is run K-NN with seven different values of k on train and test sets of two different sizes.

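My main problem is that res should be a vector storing all the accuracy values for a given training-set size, but on each iteration it holds only one value. As a result I cannot plot the accuracy graphs, because they come out empty.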

Do you know how to fix this?

The data is freely available directly in R (the Sonar dataset from the mlbench package).

library(mlbench)  # provides the Sonar dataset
library(class)    # provides knn()

data("Sonar")

#Randomization of the sample
set.seed(123)

random <- sample(rep(1:dim(Sonar)[1]))

Sonar <- Sonar[random,]
head(Sonar)


for (i in c(50,100)){   #train/test set size
  sonar.train <- Sonar[1:i,-61]
  sonar.train.label <- Sonar[1:i,61]
  sonar.test <- Sonar[(1+i) :208,-61]
  sonar.test.label <- Sonar[(1+i) :208 ,61]
  res <- rep(NA,7)
  for (j in c(3,5,7,9,11,13,15)){     #values of k
    mod = knn(train= sonar.train, test = sonar.test, cl = sonar.train.label, k = j) #classification for test set
    err = sum(sonar.test.label==mod) #accuracy
    res[match(j,c(3,5,7,9,11,13,15))] = err/length(mod)  #put accuracy value in vector
    print(res)
    plot(x = c(3,5,7,9,11,13,15) ,y = res, type = "l" ,col = "blue", xlab = "Neighbours", ylab = "Accuracy") #plot the accuracy graphs for each of the two different train/test sets
    res <- rep(NA,7)
  }
  }
#output
> 
 0.6835443        NA        NA        NA        NA        NA        NA
        NA 0.6582278        NA        NA        NA        NA        NA
        NA        NA 0.6075949        NA        NA        NA        NA
        NA        NA        NA 0.6265823        NA        NA        NA
        NA        NA        NA        NA 0.5949367        NA        NA
        NA        NA        NA        NA        NA 0.5949367        NA
        NA        NA        NA        NA        NA        NA 0.5506329
 0.6759259        NA        NA        NA        NA        NA        NA
        NA 0.6111111        NA        NA        NA        NA        NA
        NA        NA 0.5648148        NA        NA        NA        NA
        NA        NA        NA 0.5833333        NA        NA        NA
        NA        NA        NA        NA 0.5925926        NA        NA
        NA        NA        NA        NA        NA 0.5740741        NA
        NA        NA        NA        NA        NA        NA 0.5740741

The accuracy plots come out empty, and the labels for k on the x axis differ.

Thanks for reading and for helping me!

r for-loop plot classification supervised-learning
2 Answers
1
vote

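Your inner loop is supposed to fill in the values of res, one per iteration. However, you are resetting res at the end of every iteration of that loop. That is why it does not retain any of the previous values.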

These two lines need to be outside the inner loop (but still inside the outer loop):

  plot(x = c(3,5,7,9,11,13,15) ,y = res, type = "l" ,col = "blue", xlab = "Neighbours", ylab = "Accuracy") #plot the accuracy graphs for each of the two different train/test sets
  res <- rep(NA,7)
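
To make the effect of the misplaced reset concrete, here is a minimal sketch in base R. It does not use the Sonar data: toy_acc is a hypothetical stand-in for the real accuracy computation, and seen_in_loop simply records how many non-NA values plot() would have received at each pass. With type = "l", a single value surrounded by NAs generally leaves no visible line, which would explain the empty plots.

```r
ks <- c(3, 5, 7, 9, 11, 13, 15)

# Hypothetical stand-in for the real accuracy computation (not from the question)
toy_acc <- function(k) 1 / k

# Pattern from the question: res is reset at the end of every inner pass
seen_in_loop <- integer(0)
res <- rep(NA_real_, length(ks))
for (idx in seq_along(ks)) {
  res[idx] <- toy_acc(ks[idx])
  # number of usable points at the moment plot() would be called
  seen_in_loop <- c(seen_in_loop, sum(!is.na(res)))
  res <- rep(NA_real_, length(ks))   # wipes the value that was just stored
}

# Suggested pattern: initialize once, fill inside the loop, use the vector afterwards
res_fixed <- rep(NA_real_, length(ks))
for (idx in seq_along(ks)) {
  res_fixed[idx] <- toy_acc(ks[idx])
}

print(seen_in_loop)           # one usable point per pass
print(sum(!is.na(res_fixed))) # all seven slots filled
```

In the first pattern there is never more than one point to draw, whereas the second leaves all seven accuracies in place for a single plot call after the loop.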

2
votes

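The plot call and the re-initialization of res should be outside the inner loop. Otherwise res is reset to a vector of NAs on every pass of the inner loop.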

The new for loop should look like this:

for (i in c(50,100)){   # training-set size; the remaining rows form the test set
  sonar.train <- Sonar[1:i,-61]
  sonar.train.label <- Sonar[1:i,61]
  sonar.test <- Sonar[(1+i):208,-61]
  sonar.test.label <- Sonar[(1+i):208,61]
  res <- rep(NA,7)   # one accuracy slot per value of k
  for (j in c(3,5,7,9,11,13,15)){     # values of k
    mod = knn(train = sonar.train, test = sonar.test, cl = sonar.train.label, k = j) # classify the test set
    err = sum(sonar.test.label==mod) # number of correct predictions (despite the name)
    res[match(j,c(3,5,7,9,11,13,15))] = err/length(mod)  # store the accuracy for this k
  }
  # plot accuracy against k once per training-set size, after res has been filled
  plot(x = c(3,5,7,9,11,13,15), y = res, type = "l", col = "blue", xlab = "Neighbours", ylab = "Accuracy", main = paste("i =", i))
  res <- rep(NA,7)
}

By the way, I added main = paste("i =", i) to the plot call so that you can tell which iteration of the outer loop each plot belongs to.
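
As an alternative sketch (not the code from the question), the index bookkeeping with match() can be avoided by letting sapply build the accuracy vector. knn_accuracy is a hypothetical helper name; it assumes the rows of the data frame are already shuffled and that label_col is the index of the class column (61 for Sonar). The Sonar part only runs if the mlbench package is installed, and it draws both training-set sizes in one figure with matplot instead of two separate plots.

```r
library(class)  # provides knn()

# Hypothetical helper: test-set accuracy for each k in ks.
# The first n_train rows are used for training, the rest for testing.
knn_accuracy <- function(data, label_col, n_train, ks) {
  train_idx <- seq_len(n_train)
  train_x <- data[train_idx, -label_col]
  test_x  <- data[-train_idx, -label_col]
  train_y <- data[train_idx, label_col]
  test_y  <- data[-train_idx, label_col]
  sapply(ks, function(k) {
    pred <- knn(train = train_x, test = test_x, cl = train_y, k = k)
    # share of test rows whose predicted class matches the true class
    mean(as.character(pred) == as.character(test_y))
  })
}

ks <- c(3, 5, 7, 9, 11, 13, 15)
sizes <- c(50, 100)

if (requireNamespace("mlbench", quietly = TRUE)) {
  data("Sonar", package = "mlbench")
  set.seed(123)
  Sonar <- Sonar[sample(nrow(Sonar)), ]   # shuffle the rows

  # 7 x 2 matrix: one row per k, one column per training-set size
  acc <- sapply(sizes, function(n) knn_accuracy(Sonar, 61, n, ks))

  matplot(ks, acc, type = "l", col = 1:2, lty = 1:2,
          xlab = "Neighbours", ylab = "Accuracy")
  legend("topright", legend = paste("train size =", sizes), col = 1:2, lty = 1:2)
}
```

Because each call returns the full vector of accuracies, there is nothing to reset between training-set sizes.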


Edit

Only after writing my answer did I realize that @Aziz had beaten me to it by a few seconds :D
