M nearest points to each centroid in K-Means clustering


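I implemented a function that finds the single data point nearest to each centroid computed by the K-means clustering algorithm. Is there an sklearn function that would let me find the M points closest to each centroid?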

python scikit-learn cluster-analysis k-means centroid
2 Answers

1 vote
We can use sklearn.neighbors.NearestNeighbors to fit our dataset, then query the fitted nearest-neighbors model with the K-means centroids to retrieve their neighbors. Like this:

    # Copyright 2024 Google LLC.
    # SPDX-License-Identifier: Apache-2.0
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.neighbors import NearestNeighbors

    # Random dense embeddings for 100 points with 10 dimensions.
    dataset = np.random.rand(100, 10)

    # Fit K-means with 3 clusters on our dataset.
    kme = KMeans(n_clusters=3)
    kme.fit(dataset)

    # We should have 3 vectors for 3 centroids.
    print(kme.cluster_centers_.shape)  # (3, 10)

    # Initialize NearestNeighbors with 5 neighbors and fit our dataset.
    # Note: KMeans itself uses Euclidean distance; drop metric='cosine'
    # (the default is Euclidean) to rank neighbors by that same distance.
    knn = NearestNeighbors(n_neighbors=5, metric='cosine')
    knn.fit(dataset)

    # Use the model to query the centroids' neighbors.
    distances, indices = knn.kneighbors(kme.cluster_centers_)
    for centroid, distance_from_centroid, index in zip(kme.cluster_centers_, distances, indices):
        print(centroid, distance_from_centroid, index)

The final loop prints 3 lines. Each one shows a centroid's vector followed by the 5 distances and 5 dataset indices of its nearest neighbors.
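If Euclidean distance is what you want (it is the distance K-means itself minimizes), a second model may not be needed. The sketch below relies on KMeans.transform, which returns the distance from every point to every centroid, and sorts each column to pick the M closest points. The variable names, the seed, and M = 5 are illustrative assumptions, not part of the original answer.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
dataset = rng.random((100, 10))  # 100 points, 10 dimensions (illustrative)

M = 5  # number of nearest points wanted per centroid (assumed value)
kme = KMeans(n_clusters=3, n_init=10, random_state=0).fit(dataset)

# dist_to_centroids[i, j] is the Euclidean distance from point i to centroid j.
dist_to_centroids = kme.transform(dataset)

# Row j holds the dataset indices of the M points closest to centroid j,
# ordered from nearest to farthest.
nearest_idx = np.argsort(dist_to_centroids, axis=0)[:M].T
nearest_dist = np.take_along_axis(dist_to_centroids, nearest_idx.T, axis=0).T

for j in range(kme.n_clusters):
    print(j, nearest_idx[j], nearest_dist[j])
```

This searches the whole dataset for each centroid, so a returned point is not guaranteed to carry that centroid's label when M is large relative to the cluster size.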


-2 votes
http://scikit-learn.org/stable/modules/neighbors.html

The class sklearn.neighbors.NearestNeighbors finds them for you:

http://scikit-learn.org/stable/modules/generated/sklearn.neighbors.NearestNeighbors.html#sklearn.neighbors.NearestNeighbors
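As a rough sketch of how that class might be applied when the M points should come only from each centroid's own cluster: the helper below fits one NearestNeighbors model per cluster on that cluster's members. The function name nearest_members, the random data, and m = 5 are hypothetical choices for illustration. It assumes Euclidean distance and that no cluster is empty.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import NearestNeighbors


def nearest_members(X, kmeans, m):
    """Hypothetical helper: for each cluster, return the dataset indices of
    up to m of its own members that lie closest to its centroid."""
    result = {}
    for label, center in enumerate(kmeans.cluster_centers_):
        member_idx = np.flatnonzero(kmeans.labels_ == label)
        k = min(m, len(member_idx))  # a cluster may hold fewer than m points
        nn = NearestNeighbors(n_neighbors=k).fit(X[member_idx])
        _, local = nn.kneighbors(center.reshape(1, -1))
        # kneighbors returns positions within X[member_idx]; map them back.
        result[label] = member_idx[local[0]]
    return result


rng = np.random.default_rng(0)
X = rng.random((100, 10))  # illustrative data
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

closest = nearest_members(X, km, m=5)
for label, idx in closest.items():
    print(label, idx)
```

Fitting per cluster costs one small index build per centroid, which is usually negligible next to the K-means fit itself.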
