How to print the data of each cluster in an agglomerative clustering algorithm in Python

Question

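I'm new to Python's machine learning tools. I wrote this agglomerative hierarchical clustering code, but I don't know whether there is a way to print the data of each plotted cluster. The input to the algorithm is 5 numbers (0,1,2,3,4). Besides plotting the clusters, I also need to print the values of each cluster separately, like this: cluster1 = [1,2,4], cluster2 = [0,3]

Update: I want to get the data that is plotted and colored by this line and the others like it: plt.scatter(points[y_hc==0,0], points[y_hc==0,1],s=100,c='cyan'). According to this code, the numbers (1,2,4) are in one cluster and share the same color, and (0,3) are in cluster2. So I need to print this data (the data of each cluster) in the terminal. The code as written only plots the data.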

import numpy as np
import matplotlib.pyplot as plt
import scipy.cluster.hierarchy as sch
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

# Generate 5 sample points with 2 features each.
dataset = make_blobs(n_samples=5, n_features=2, centers=4, cluster_std=1.6, random_state=50)
points = dataset[0]

# Draw the dendrogram for Ward linkage, then the raw points.
dendrogram = sch.dendrogram(sch.linkage(points, method='ward'))
plt.scatter(dataset[0][:,0], dataset[0][:,1])

# Ward linkage always uses Euclidean distance, so no distance argument is passed.
hc = AgglomerativeClustering(n_clusters=4, linkage='ward')
y_hc = hc.fit_predict(points)

# Plot each cluster in its own color.
plt.scatter(points[y_hc==0,0], points[y_hc==0,1], s=100, c='cyan')
plt.scatter(points[y_hc==1,0], points[y_hc==1,1], s=100, c='yellow')
plt.scatter(points[y_hc==2,0], points[y_hc==2,1], s=100, c='red')
plt.scatter(points[y_hc==3,0], points[y_hc==3,1], s=100, c='green')
plt.show()

Answer 1

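After some research, there doesn't seem to be a simple way to get cluster labels from scipy's dendrogram function.

Below are a couple of options/workarounds.

Option 1

Use scipy's linkage and fcluster functions to perform the clustering and get the labels: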

Z = sch.linkage(points, 'ward') # Note 'ward' is specified here to match the linkage used in sch.dendrogram.
labels = sch.fcluster(Z, t=10, criterion='distance') # t chosen to return two clusters.

# Cluster 1
np.where(labels == 1)

Output: (array([0, 3]),)

# Cluster 2
np.where(labels == 2)

Output: (array([1, 2, 4]),)
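To print every cluster in one pass instead of writing one np.where call per label, the labels can be grouped in a loop. The following is a minimal sketch, assuming the same `points` as in the question. Two things here are not from the original answer: `criterion='maxclust'` with `t=2` is a swapped-in alternative that asks fcluster for at most two clusters directly rather than tuning a distance threshold, and the `clusters` dictionary is just an illustrative name.

```python
import numpy as np
import scipy.cluster.hierarchy as sch
from sklearn.datasets import make_blobs

# Same toy data as in the question.
points, _ = make_blobs(n_samples=5, n_features=2, centers=4,
                       cluster_std=1.6, random_state=50)

Z = sch.linkage(points, method='ward')
# Assumption: request at most 2 flat clusters instead of a distance threshold.
labels = sch.fcluster(Z, t=2, criterion='maxclust')

# Map each cluster label to the indices of the samples assigned to it.
clusters = {int(label): np.where(labels == label)[0].tolist()
            for label in np.unique(labels)}

for label, indices in clusters.items():
    print(f"cluster{label} = {indices}")
    print(points[indices])  # coordinates of the samples in this cluster
```

Note that fcluster labels start at 1, so the keys of `clusters` are 1-based.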

Option 2

Modify your current use of sklearn to return two clusters:

hc = AgglomerativeClustering(n_clusters=2, linkage='ward') # Again, 'ward' is specified here to match the linkage in sch.dendrogram; Ward linkage always uses Euclidean distance.
y_hc = hc.fit_predict(points)

# Cluster 1
np.where(y_hc == 0)

Output: (array([0, 3]),)

# Cluster 2
np.where(y_hc == 1)

Output: (array([1, 2, 4]),)
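The same idea works for any number of clusters by looping over the labels that fit_predict returns, which prints in the terminal exactly the groups that the plt.scatter calls color. This is a rough sketch, assuming the question's data. The `cluster1`, `cluster2` naming simply mirrors the format the asker requested, and the distance argument is omitted on the assumption that Ward linkage defaults to Euclidean distance.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

# Same toy data as in the question.
points, _ = make_blobs(n_samples=5, n_features=2, centers=4,
                       cluster_std=1.6, random_state=50)

hc = AgglomerativeClustering(n_clusters=2, linkage='ward')
y_hc = hc.fit_predict(points)

# sklearn labels start at 0; add 1 only for display.
for label in np.unique(y_hc):
    indices = np.flatnonzero(y_hc == label)  # sample indices in this cluster
    print(f"cluster{label + 1} = {indices.tolist()}")
    print(points[indices])  # the rows that plt.scatter would draw for this label
```

Which group receives label 0 and which receives label 1 is decided by the library, so the numbering may not match the order in the question.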
