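I'm learning Python and trying to implement KNN without using libraries.
These are the 3 main steps I want to take, but my code is full of errors.
The data I'm using has 4 features and two classes.
Please look at what I'm trying to do below and help me improve it. The main error I get is:
TypeError: only size-1 arrays can be converted to Python scalars
I plan to handle this in three stages:
Prepare the data and split it (for evaluation):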
random.shuffle(iris)
# This is not working for me and I don't know why:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
print(len(X_train))  # to recheck the success of the split
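Since the goal is to avoid libraries, the split can be done with the standard library alone. The following is a minimal sketch, not a drop-in replacement for scikit-learn's `train_test_split`; the helper name `split_train_test`, the `seed` parameter, and the toy data are illustrative assumptions.

```python
import random

def split_train_test(X, y, test_size=0.2, seed=None):
    """Shuffle the sample indices and split X and y consistently.

    Hypothetical helper (not from the original post).
    """
    if len(X) != len(y):
        raise ValueError("X and y must have the same length")
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)  # shuffle indices so X and y stay paired
    n_test = int(round(len(X) * test_size))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test

# Toy data: 10 samples with 4 features each and two classes (made up for illustration).
X = [[float(i), i + 0.5, i + 1.0, i + 1.5] for i in range(10)]
y = [0] * 5 + [1] * 5
X_train, X_test, y_train, y_test = split_train_test(X, y, test_size=0.2, seed=0)
print(len(X_train), len(X_test))  # 8 2
```

Shuffling the indices rather than the rows themselves keeps every feature row paired with its label.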
from math import sqrt
from collections import Counter
# new_measure = (X_new)
# X_new = [1, 2, 3, 4]
distance = []
for group in X_train:
    for features in X_train:
        Eu_dis = sqrt((X_new[0] - X_train[0])**2 + (X_new[1] - X_train[1])**2 + (X_new[2] - X_train[2])**2 + (X_new[3] - X_train[3])**2)
How do I proceed from here?
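The `TypeError` is consistent with `X_train[0]` being a whole row rather than a single number: assuming `X_train` is a 2-D NumPy array, `X_new[0] - X_train[0]` is itself an array, and `math.sqrt` accepts only scalars. A minimal sketch of the loop that computes one distance per training row, using plain lists and made-up values, could look like this:

```python
from math import sqrt

# Made-up values for illustration; each row holds 4 features, labels are kept separately.
X_train = [
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [6.2, 2.9, 4.3, 1.3],
]
X_new = [5.0, 3.4, 1.5, 0.2]

distances = []
for row in X_train:  # one pass over the training rows is enough
    # Index into the current row, not into X_train itself.
    squared = sum((X_new[i] - row[i]) ** 2 for i in range(len(X_new)))
    distances.append(sqrt(squared))

print(distances)
```

Each entry of `distances` lines up with the training row at the same position, so the nearest rows can be found by sorting the indices by distance.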
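Here are all the functions you need: 1. calculate the Euclidean distance between two vectors; 2. locate the most similar neighbors; 3. make a classification prediction with the neighbors.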
from math import sqrt

# Calculate the Euclidean distance between two vectors.
# The last element of each row is treated as the class label and skipped.
def euclidean_distance(row1, row2):
    distance = 0.0
    for i in range(len(row1) - 1):
        distance += (row1[i] - row2[i])**2
    return sqrt(distance)
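Note that the loop runs over `len(row1) - 1` entries, so the last element of `row1` is treated as the class label and left out. A small worked example with made-up rows follows; the function is repeated only so the snippet runs on its own.

```python
from math import sqrt

def euclidean_distance(row1, row2):
    distance = 0.0
    for i in range(len(row1) - 1):  # skip the last element (the label)
        distance += (row1[i] - row2[i]) ** 2
    return sqrt(distance)

# Made-up rows: 4 features followed by a class label.
row_a = [1.0, 2.0, 3.0, 4.0, 0]
row_b = [2.0, 4.0, 6.0, 8.0, 1]

# Squared differences: 1 + 4 + 9 + 16 = 30; the labels 0 and 1 are ignored.
print(euclidean_distance(row_a, row_b))  # sqrt(30), about 5.477

# A first row passed without a trailing label slot silently loses its last feature:
print(euclidean_distance([1.0, 2.0, 3.0, 4.0], row_b))  # sqrt(14), about 3.742
```

This matters when features and labels are stored separately, as with `X_train` and `y_train`: the rows need a label (or a placeholder) appended before they are passed in.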
# Locate the most similar neighbors
def get_neighbors(train, test_row, num_neighbors):
    distances = list()
    for train_row in train:
        dist = euclidean_distance(test_row, train_row)
        distances.append((train_row, dist))
    distances.sort(key=lambda tup: tup[1])
    neighbors = list()
    for i in range(num_neighbors):
        neighbors.append(distances[i][0])
    return neighbors
# Make a classification prediction with neighbors
def predict_classification(train, test_row, num_neighbors):
    neighbors = get_neighbors(train, test_row, num_neighbors)
    output_values = [row[-1] for row in neighbors]
    prediction = max(set(output_values), key=output_values.count)
    return prediction
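To tie the three functions together with separate feature and label lists such as `X_train` and `y_train`, each label can be appended to its feature row first. The sketch below repeats the functions in condensed form so it runs on its own; the data values, the choice of 3 neighbors, and the helper name `accuracy` are illustrative assumptions, and `None` is used as a placeholder label for test rows.

```python
from math import sqrt

def euclidean_distance(row1, row2):
    # The last element of each row is the label slot and is skipped.
    return sqrt(sum((row1[i] - row2[i]) ** 2 for i in range(len(row1) - 1)))

def get_neighbors(train, test_row, num_neighbors):
    # Sort training rows by distance to test_row and keep the closest ones.
    ranked = sorted(train, key=lambda row: euclidean_distance(test_row, row))
    return ranked[:num_neighbors]

def predict_classification(train, test_row, num_neighbors):
    # Majority vote over the neighbors' labels.
    labels = [row[-1] for row in get_neighbors(train, test_row, num_neighbors)]
    return max(set(labels), key=labels.count)

def accuracy(train, X_test, y_test, num_neighbors):
    # Hypothetical helper: fraction of test rows predicted correctly.
    correct = 0
    for features, label in zip(X_test, y_test):
        test_row = list(features) + [None]  # placeholder label slot
        if predict_classification(train, test_row, num_neighbors) == label:
            correct += 1
    return correct / len(y_test)

# Made-up data: 4 features per sample, two classes.
X_train = [[5.1, 3.5, 1.4, 0.2], [4.9, 3.0, 1.4, 0.2], [4.7, 3.2, 1.3, 0.2],
           [7.0, 3.2, 4.7, 1.4], [6.4, 3.2, 4.5, 1.5], [6.9, 3.1, 4.9, 1.5]]
y_train = [0, 0, 0, 1, 1, 1]
X_test = [[5.0, 3.6, 1.4, 0.2], [6.5, 2.8, 4.6, 1.5]]
y_test = [0, 1]

# Append each label to its feature row, as the functions expect.
train = [list(x) + [label] for x, label in zip(X_train, y_train)]
print(accuracy(train, X_test, y_test, num_neighbors=3))
```

An odd number of neighbors is a common choice for two classes because it avoids tied votes.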