How to merge unsupervised hierarchical clustering results with the original data


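I performed an unsupervised hierarchical cluster analysis in R. My data are numeric, with 3 columns and roughly 120,000 rows. I managed to use cutree and identified 6 clusters. Now I need to bring these clusters back to the original data, i.e. add another column indicating the cluster group (1 of 6). How can I do that?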

# Ward's method
hc5 <- hclust(d, method = "ward.D2" )

# Cut tree into 6 groups
sub_grp <- cutree(hc5, k = 6)

# Number of members in each cluster
table(sub_grp)

I need this because my data are spatially linked, so I want to map these clusters back to their locations on a map. Thanks for your help.

r merge tree hierarchical-clustering

1 Answer


The variable sub_grp is just a vector of cluster assignments, so you can simply add it to the data frame:

data(iris)                         # Data frame available in base R.
str(iris)
d <- dist(iris[, -5])              # Column 5 is the species name so we drop it
hc5 <- hclust(d, method="ward.D2")
sub_grp <- cutree(hc5, k=3)
str(sub_grp)
iris$grp <- sub_grp
str(iris)
aggregate(iris[, 1:4], by=list(iris$grp), mean)
xtabs(~grp+Species, iris)

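The last two commands compute the means of the 4 numeric variables by group and cross-tabulate the cluster assignments against the known species. You do not actually need to add the cluster assignments to the data frame. R lets you combine variables from different objects as long as they have the same number of rows in the same order.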
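As a rough sketch of how this could look for data shaped like the question's (three numeric columns), the example below uses a small simulated data frame. The name `pts`, its columns `x`, `y`, `value`, and the idea that two of the columns are coordinates are all assumptions made for illustration; they are not taken from the asker's data. It shows the cluster vector being used both on its own and as an added column.

```r
# Sketch only: `pts` is a hypothetical stand-in for the asker's data
# (three numeric columns); x and y are assumed to be coordinates.
set.seed(1)
pts <- data.frame(x = runif(200), y = runif(200), value = rnorm(200))

hc  <- hclust(dist(pts), method = "ward.D2")
grp <- cutree(hc, k = 6)

# cutree() returns one label per row, in the original row order,
# so the vector lines up with the data frame by position.
stopifnot(length(grp) == nrow(pts))

# Use the vector directly, without adding a column to `pts`.
aggregate(pts, by = list(cluster = grp), FUN = mean)
table(grp)

# Or attach it when a single object is more convenient, e.g. for mapping.
pts$cluster <- grp
plot(pts$x, pts$y, col = pts$cluster, pch = 19)
```

Attaching the column is probably the more convenient route when the result has to be exported or joined to spatial data, since the cluster label then travels with each row.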
