Samples not matching in y for knn

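I'm trying to write a knn script that takes its input more flexibly than the tutorials based on the iris dataset do, but I'm having trouble (I think) adding the matching second dimension to the numpy array in #6, and it fails when I get to #11, the fitting.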

File "G:\PROGRAMMERING\Anaconda\lib\site-packages\sklearn\utils\validation.py", line 212, in check_consistent_length
    " samples: %r" % [int(l) for l in lengths])
ValueError: Found input variables with inconsistent numbers of samples: [150, 1]

x is (150, 5) and y is (150, 1). Both have 150 samples, but they have a different number of fields. Is that the problem, and if so, how do I fix it?

```python
#1. Loading the Pandas libraries as pd
import pandas as pd
import numpy as np

#2. Read data from the file 'custom.csv' placed in your code directory
data = pd.read_csv("custom.csv")

#3. Preview the first 5 lines of the loaded data
print(data.head())
print(type(data))

#4. Test the shape of the data
print(data.shape)
df = pd.DataFrame(data)
print(df)

#5. Convert non-numericals to numericals
print(df.dtypes)
# Any object should be converted to numerical
df['species'] = pd.Categorical(df['species'])
df['species'] = df.species.cat.codes
print("outcome:")
print(df.dtypes)

#6. Convert df to numpy.ndarray
np = df.to_numpy()
print(type(np)) #this should state <class 'numpy.ndarray'>
print(data.shape)
print(np)
x = np.data
y = [df['species']]
print(y)

#K-nearest neighbor (find closest) - search for the K nearest observations in the dataset
#The model calculates the distance to all, and selects the K nearest ones.
#8. Import the class you plan to use
from sklearn.neighbors import (KNeighborsClassifier)
#9. Pick a value for K
k = 2
#10. Instantiate the "estimator" (make an instance of the model)
knn = KNeighborsClassifier(n_neighbors=k)
print(knn)
#11. fit the model with data/model training
knn.fit(x, y)
#12. Predict the response for a new observation
print(knn.predict([[3, 5, 4, 2]]))
```
python scikit-learn numpy-ndarray knn
1 Answer
This is how I fit a knn model with scikit-learn's KNeighborsClassifier:

    import numpy as np
    import pandas as pd
    from sklearn import datasets
    from sklearn.neighbors import KNeighborsClassifier

    df = datasets.load_iris()
    X = pd.DataFrame(df.data)
    y = df.target

    knn = KNeighborsClassifier(n_neighbors = 2)
    knn.fit(X,y)

    print(knn.predict([[6, 3, 5, 2]])) #prints output class [2]
    print(knn.predict([[3, 5, 4, 2]])) #prints output class [1]

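You don't need to convert the DataFrame to a numpy array; you can fit the model directly on the DataFrame. Also, when you convert the DataFrame to a numpy array you name the result np, which is the same name you used at the top in import numpy as np, so the array shadows the numpy module from that point on.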
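Applied to the script in the question, the fix could look roughly like the sketch below. Since custom.csv isn't available here, it assumes the file looks like the iris data: four numeric columns plus a string `species` column. The column names and the variable `y_wrong` are made up for illustration. The sketch shows that `[df['species']]` is a list holding a single Series, so it counts as 1 sample, and that dropping the label column from the features leaves four columns, matching the four values passed to `predict`.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

# Stand-in for custom.csv (assumption): four numeric columns
# plus a string 'species' column. Column names are hypothetical.
iris = load_iris()
df = pd.DataFrame(
    iris.data,
    columns=["sepal_length", "sepal_width", "petal_length", "petal_width"],
)
df["species"] = iris.target_names[iris.target]

# Encode the string labels as integer codes, as in step #5 of the question.
df["species"] = pd.Categorical(df["species"]).codes

# Wrapping the Series in a list gives a list with one element,
# which is where the "1" in [150, 1] comes from.
y_wrong = [df["species"]]
print(len(y_wrong))  # 1

# Features: every column except the label. Target: the label column, 1-D.
X = df.drop(columns="species").to_numpy()
y = df["species"].to_numpy()
print(X.shape, y.shape)  # (150, 4) (150,)

knn = KNeighborsClassifier(n_neighbors=2)
knn.fit(X, y)

# The new observation has four values, one per feature column.
prediction = knn.predict([[3, 5, 4, 2]])
print(prediction)
```

Note that the converted array is stored in `X` rather than `np`, so the numpy module name stays free if it is imported.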