Using the same colors for identical k-means clusters across two igraph graphs


I have two adjacency matrices from two different datasets, with the same variables:

Amat <- read.table(text = "
    Si  N1  N2  A1  A2  A3  A4  A5  Z1  Z2  Z3  Z5
Si 367   0   0  48   0 365   0   0   0   0   0   0
N1   0 368   0 368 275  58 365 360   0   0   0   0
N2   0   0 368   1 368   0   1   8   0   0   0   3
A1  37 272   0 103   0   8   0   0  18 351   0   2
A2   0 151 361   0   3   0   0   1   0   0 353  46
A3 269  49   0  39   0 116   0   0 345   9   0   0
A4   0 306   1   0   0   0 211  60   0   0   0   1
A5   0 215   8   0   0   0 145 248   0   0   0  10
Z1   0   0   0  17   0 351   0   0 355 240   1   3
Z2   0   0   0 350   0   9   0   0 201 213 211 175
Z3   0   0   0   0 362   0   0   0   1 195 103  11
Z5   0   0   3   2  45   0   1  10   3 177  12 348
", header = TRUE)

Amat2 <- read.table(text = "
    Si  N1  N2  A1  A2  A3  A4  A5  Z1  Z2  Z3  Z5
Si 639   0   0  89   0 637   0   0   0   0   0   0
N1   0 640   0 639 487 111 637 632   0   0   0   0
N2   0   0 640   1 639   0   1   8   0   0   0   7
A1  77 514   0 138   0  12   0   0  35 614   0   4
A2   0 281 632   0   3   0   0   1   1   1 609  70
A3 469  98   0  70   0 198   0   0 597  16   0   0
A4   0 529   1   0   0   0 406 110   0   0   0   1
A5   0 419   8   0   0   0 219 424   0   0   0  12
Z1   0   0   0  34   1 609   0   0 623 409   1   7
Z2   0   0   0 614   1  16   0   0 311 321 334 323
Z3   0   0   0   0 626   0   0   0   1 347 198  37
Z5   0   0   7   4  70   0   1  12   7 325  38 603
", header = TRUE)

I ran a clustering algorithm on both and managed to plot the igraph graphs with their edges and vertices:


library(igraph)
library(stats)
Amat <- as.matrix(Amat)
Amat2 <- as.matrix(Amat2)

#clustering dataset 1
Lmat <- diag(rowSums(Amat)) - Amat #laplacian matrix
D_half <- diag(sqrt(1/rowSums(Amat)))
normL <- D_half %*% Lmat %*% D_half #normalize laplacian
laplacian_eigen <- eigen(normL)
sorted_eigenvectors <- laplacian_eigen$vectors[, order(laplacian_eigen$values)]

# K-means clustering
kmeans_result <- kmeans(sorted_eigenvectors[, 1:8], centers = 8)


#clustering dataset 2
Lmat2 <- diag(rowSums(Amat2)) - Amat2 #laplacian matrix
D_half2 <- diag(sqrt(1/rowSums(Amat2)))
normL2 <- D_half2 %*% Lmat2 %*% D_half2 #normalize laplacian
laplacian_eigen2 <- eigen(normL2)
sorted_eigenvectors2 <- laplacian_eigen2$vectors[, order(laplacian_eigen2$values)]

kmeans_result2 <- kmeans(sorted_eigenvectors2[, 1:8], centers = 8)

############### plotting ##########################

# Create an igraph graph object from the adjacency matrix
graph1 <- graph_from_adjacency_matrix(Amat, mode = "undirected")
graph2 <- graph_from_adjacency_matrix(Amat2, mode = "undirected")

par(mfrow=c(1,2))
plot(graph1, vertex.color = kmeans_result$cluster, main = 'set1') #multi-plot
plot(graph2, vertex.color = kmeans_result2$cluster, main = 'set2')

The problem is that the two plots are hard to compare, because the colors differ between them.

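Where a cluster is the same in both sets, I would like it to have the same color in both plots, as already happens with Z1 and A3, which cluster together in both datasets and are green in both. So, for example, N1 and A4 should share a color, because they are clustered together in both sets.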

I tried reordering kmeans_result$cluster, but I could not get it right, and I do not want to scramble which cluster each variable is assigned to.

r cluster-analysis igraph k-means
1 Answer

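Set vertex.color to the same vector in both plots. Note that this colors both graphs by the clustering of dataset 1, so it only reflects set2 correctly if the two clusterings group the vertices identically: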

par(mfrow=c(1,2))
plot(graph1, vertex.color = kmeans_result$cluster, main = 'set1') #multi-plot
plot(graph2, vertex.color = kmeans_result$cluster, main = 'set2')
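
If the two clusterings are not guaranteed to be identical, one alternative is to color each cluster by its membership instead of by its label number, since kmeans numbers its clusters arbitrarily on each run. The following is a minimal sketch, not a definitive solution: shared_cluster_colors is a hypothetical helper (it is not part of igraph or stats), and it assumes that a cluster counts as "the same" only when it contains exactly the same vertices in both datasets. The toy vectors nodes, cl_a and cl_b are made up for illustration.

```r
# Hypothetical helper (an assumption, not an igraph or stats function):
# every distinct cluster, identified by its exact set of members,
# gets one color that is shared by both plots.
shared_cluster_colors <- function(cl1, cl2, node_names) {
  stopifnot(length(cl1) == length(node_names),
            length(cl2) == length(node_names))

  # Describe each vertex's cluster by its sorted member names, e.g. "A3+Z1".
  cluster_key <- function(cl) {
    members <- tapply(node_names, cl,
                      function(x) paste(sort(x), collapse = "+"))
    unname(members[as.character(cl)])
  }
  key1 <- cluster_key(cl1)
  key2 <- cluster_key(cl2)

  # One palette entry per distinct member set found in either clustering.
  all_keys <- unique(c(key1, key2))
  pal <- setNames(grDevices::rainbow(length(all_keys)), all_keys)

  list(col1 = unname(pal[key1]), col2 = unname(pal[key2]), palette = pal)
}

# Toy example: the same groups, but numbered differently in each run.
nodes <- c("Si", "N1", "A3", "A4", "Z1")
cl_a <- c(1, 2, 3, 2, 3)  # set 1: {Si}, {N1, A4}, {A3, Z1}
cl_b <- c(2, 3, 1, 3, 1)  # set 2: same groups, different label numbers
cols <- shared_cluster_colors(cl_a, cl_b, nodes)
print(identical(cols$col1, cols$col2))

# With the objects from the question (requires igraph, so left commented out):
# cols <- shared_cluster_colors(kmeans_result$cluster, kmeans_result2$cluster,
#                               rownames(Amat))
# par(mfrow = c(1, 2))
# plot(graph1, vertex.color = cols$col1, main = 'set1')
# plot(graph2, vertex.color = cols$col2, main = 'set2')
```

With this scheme, a cluster that differs by even one vertex between the datasets gets a different color, which makes the differences stand out. Because kmeans starts from random centers, calling set.seed() before each kmeans() call keeps the plots reproducible.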
