How to develop a KNN algorithm from scratch using Python loops


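I am learning Python and trying to develop KNN without using libraries.

These are the three main steps I want to take, but my code is full of errors.

The data I am using has 4 features and two classes.

Please see below what I am trying to do and help me improve it. The main error I get is: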

TypeError: only size-1 arrays can be converted to Python scalars

I plan to work through this in three stages:

  1. Prepare the data and split it (for evaluation):

    random.shuffle(iris)
    # this is not working for me and I don't know why
    
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
    print(len(X_train))  # to recheck the success of the split
    
  2. Measure all the distances for KNN:
    from math import sqrt
    from collections import Counter
    #new_measure=(X_new)
    #X_new= [1,2,3,4]
    distance=[]
    for group in X_train:
        for features in X_train:
           Eu_dis= sqrt( (X_new [0]- X_train[0])**2 + (X_new [1]- X_train[1])**2+(X_new [2]- X_train[2])**2+(X_new [3]- X_train[3])**2)
    
  3. Identify the k nearest neighbors and determine the most likely class

How should I proceed from here?

python knn
1 Answer

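Here are all the functions you need:

  1. Calculate the Euclidean distance between two vectors.
  2. Find the most similar neighbors.
  3. Make a classification prediction with the neighbors.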

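from math import sqrt

# Calculate the Euclidean distance between two vectors.
# The last element of each row is treated as the class label, so it is skipped.
def euclidean_distance(row1, row2):
    distance = 0.0
    for i in range(len(row1)-1):
        distance += (row1[i] - row2[i])**2
    return sqrt(distance)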

# Locate the most similar neighbors
def get_neighbors(train, test_row, num_neighbors):
    distances = list()
    for train_row in train:
        dist = euclidean_distance(test_row, train_row)
        distances.append((train_row, dist))
    distances.sort(key=lambda tup: tup[1])
    neighbors = list()
    for i in range(num_neighbors):
        neighbors.append(distances[i][0])
    return neighbors

# Make a classification prediction with neighbors
def predict_classification(train, test_row, num_neighbors):
    neighbors = get_neighbors(train, test_row, num_neighbors)
    output_values = [row[-1] for row in neighbors]
    prediction = max(set(output_values), key=output_values.count)
    return prediction
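
The TypeError in the question most likely comes from `math.sqrt` receiving a NumPy array. `X_train[0]` is a whole row rather than a single number, so the expression under the root is an array, and `math.sqrt` accepts only scalars. Computing the distance one row at a time, as above, avoids this.

To tie the three stages together, here is a minimal end-to-end sketch without libraries. It assumes every row is a plain Python list with the class label as its last element. The helpers `split_rows` and `accuracy`, the `seed` parameter, and the sample values are made up for illustration and are not part of the original code. The three KNN functions are compact rewrites of the ones above.

```python
import random
from math import sqrt


def euclidean_distance(row1, row2):
    # Skip the last element, which is assumed to hold the class label.
    return sqrt(sum((a - b) ** 2 for a, b in zip(row1[:-1], row2[:-1])))


def get_neighbors(train, test_row, num_neighbors):
    # Sort training rows by distance to test_row and keep the closest ones.
    ranked = sorted(train, key=lambda train_row: euclidean_distance(test_row, train_row))
    return ranked[:num_neighbors]


def predict_classification(train, test_row, num_neighbors):
    # Majority vote over the labels of the nearest neighbors.
    labels = [row[-1] for row in get_neighbors(train, test_row, num_neighbors)]
    return max(set(labels), key=labels.count)


def split_rows(rows, test_size=0.2, seed=0):
    # Hypothetical helper: shuffle a copy of the rows, then cut off the test part.
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(train, test, num_neighbors):
    # Fraction of test rows whose predicted label matches the stored label.
    hits = sum(predict_classification(train, row, num_neighbors) == row[-1] for row in test)
    return hits / len(test)


# Made-up sample data: 4 features followed by a class label (0 or 1).
dataset = [
    [5.1, 3.5, 1.4, 0.2, 0],
    [4.9, 3.0, 1.4, 0.2, 0],
    [4.7, 3.2, 1.3, 0.2, 0],
    [5.0, 3.6, 1.4, 0.2, 0],
    [5.4, 3.9, 1.7, 0.4, 0],
    [7.0, 3.2, 4.7, 1.4, 1],
    [6.4, 3.2, 4.5, 1.5, 1],
    [6.9, 3.1, 4.9, 1.5, 1],
    [5.5, 2.3, 4.0, 1.3, 1],
    [6.5, 2.8, 4.6, 1.5, 1],
]

train, test = split_rows(dataset, test_size=0.2)
print(len(train), len(test))
print(accuracy(train, test, num_neighbors=3))

# Classify a new, unlabeled measurement; None is a placeholder for the label slot.
X_new = [5.0, 3.4, 1.5, 0.2, None]
print(predict_classification(train, X_new, num_neighbors=3))
```

Keeping features and label in one row means a single shuffle keeps them aligned, which is why this sketch splits rows instead of separate `X` and `y` lists. A new point needs a placeholder in the label position because the distance function ignores the last element of both rows.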