How to cluster with the DBSCAN method in sklearn


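I have a three-parameter dataset that I want to cluster. With KMeans from sklearn I can easily get a plot of the result, for example (val is my dataset, with shape (3000, 3)):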

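import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# assign each of the 3000 points to one of 4 clusters
y_pred = KMeans(n_clusters=4, random_state=0).fit_predict(val)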
fig = plt.figure()
ax1 = fig.add_subplot(1,1,1,projection='3d')
ax1.scatter(val[:, 0], val[:, 1], val[:, 2], c=y_pred)
plt.show()

With DBSCAN, however, I am just using this directly:

import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler
val = StandardScaler().fit_transform(val)
db = DBSCAN(eps=3, min_samples=4).fit(val)
labels = db.labels_
# boolean mask marking which points are core samples
core_samples = np.zeros_like(labels, dtype=bool)
core_samples[db.core_sample_indices_] = True

# Number of clusters in labels, ignoring noise if present.
n_clusters_ = len(set(labels)) - (1 if -1 in labels else 0)
n_noise_ = list(labels).count(-1)

So how can I get a plot of the DBSCAN result like the one I get for KMeans?

python scikit-learn cluster-analysis dbscan
1 Answer

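You can reuse the same plotting code from your KMeans model. All you need to do is re-assign val and y_pred so that the points labelled as noise (label -1) are ignored.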

# DBSCAN snippet from the question
import matplotlib.pyplot as plt
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler
val = StandardScaler().fit_transform(val)
db = DBSCAN(eps=3, min_samples=4).fit(val)
labels = db.labels_

# keep only the non-noise points: y_pred holds their labels, core their coordinates (used in place of val)
mask = labels != -1
y_pred, core = labels[mask], val[mask]

# plotting snippet from the question
fig = plt.figure()
ax1 = fig.add_subplot(1,1,1,projection='3d')
ax1.scatter(core[:, 0], core[:, 1], core[:, 2], c=y_pred)
plt.show()
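
If you would rather keep the noise points visible instead of dropping them, one option is to draw them in a separate scatter call. The following is only a minimal sketch: the data comes from make_blobs as a synthetic stand-in for val, the helper names cluster_and_split and plot_clusters are hypothetical, and eps=0.5 is an assumed value that will need tuning for real data.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler


def cluster_and_split(points, eps=0.5, min_samples=4):
    """Scale the points, run DBSCAN, and return (scaled, labels, noise_mask)."""
    scaled = StandardScaler().fit_transform(points)
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit(scaled).labels_
    # DBSCAN marks noise points with the label -1
    return scaled, labels, labels == -1


def plot_clusters(scaled, labels, noise_mask):
    """3D scatter: clustered points coloured by label, noise as gray crosses."""
    import matplotlib.pyplot as plt

    fig = plt.figure()
    ax = fig.add_subplot(1, 1, 1, projection="3d")
    ax.scatter(*scaled[~noise_mask].T, c=labels[~noise_mask])
    ax.scatter(*scaled[noise_mask].T, c="gray", marker="x", label="noise")
    ax.legend()
    plt.show()


# Synthetic stand-in for `val` (assumption: 3000 points, 3 features)
val, _ = make_blobs(n_samples=3000, n_features=3, centers=4, random_state=0)
scaled, labels, noise_mask = cluster_and_split(val)

# Same cluster/noise counts as in the question
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
n_noise = int(noise_mask.sum())
print(n_clusters, n_noise)
```

Calling plot_clusters(scaled, labels, noise_mask) then shows the figure. Splitting on a boolean mask keeps labels and coordinates aligned, which is the same idea as the re-assignment above.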