I have two adjacency matrices from two different datasets, both with the same variables:
Amat <- read.table(text = "
Si N1 N2 A1 A2 A3 A4 A5 Z1 Z2 Z3 Z5
Si 367 0 0 48 0 365 0 0 0 0 0 0
N1 0 368 0 368 275 58 365 360 0 0 0 0
N2 0 0 368 1 368 0 1 8 0 0 0 3
A1 37 272 0 103 0 8 0 0 18 351 0 2
A2 0 151 361 0 3 0 0 1 0 0 353 46
A3 269 49 0 39 0 116 0 0 345 9 0 0
A4 0 306 1 0 0 0 211 60 0 0 0 1
A5 0 215 8 0 0 0 145 248 0 0 0 10
Z1 0 0 0 17 0 351 0 0 355 240 1 3
Z2 0 0 0 350 0 9 0 0 201 213 211 175
Z3 0 0 0 0 362 0 0 0 1 195 103 11
Z5 0 0 3 2 45 0 1 10 3 177 12 348
", header = TRUE)
and
Amat2 <- read.table(text = "
Si N1 N2 A1 A2 A3 A4 A5 Z1 Z2 Z3 Z5
Si 639 0 0 89 0 637 0 0 0 0 0 0
N1 0 640 0 639 487 111 637 632 0 0 0 0
N2 0 0 640 1 639 0 1 8 0 0 0 7
A1 77 514 0 138 0 12 0 0 35 614 0 4
A2 0 281 632 0 3 0 0 1 1 1 609 70
A3 469 98 0 70 0 198 0 0 597 16 0 0
A4 0 529 1 0 0 0 406 110 0 0 0 1
A5 0 419 8 0 0 0 219 424 0 0 0 12
Z1 0 0 0 34 1 609 0 0 623 409 1 7
Z2 0 0 0 614 1 16 0 0 311 321 334 323
Z3 0 0 0 0 626 0 0 0 1 347 198 37
Z5 0 0 7 4 70 0 1 12 7 325 38 603
", header = TRUE)
I ran a clustering algorithm on both and successfully plotted the edges and vertices with igraph:
library(igraph)
library(stats)
Amat <- as.matrix(Amat)
Amat2 <- as.matrix(Amat2)
# Clustering, dataset 1
Lmat <- diag(rowSums(Amat)) - Amat # unnormalized Laplacian: D - A
D_half <- diag(sqrt(1/rowSums(Amat))) # D^(-1/2)
normL <- D_half %*% Lmat %*% D_half # normalized Laplacian: D^(-1/2) L D^(-1/2)
laplacian_eigen <- eigen(normL)
# Order eigenvectors by ascending eigenvalue
sorted_eigenvectors <- laplacian_eigen$vectors[, order(laplacian_eigen$values)]
# K-means clustering on the first 8 eigenvectors
kmeans_result <- kmeans(sorted_eigenvectors[, 1:8], centers = 8)
# Clustering, dataset 2 (same steps as for dataset 1)
Lmat2 <- diag(rowSums(Amat2)) - Amat2 # unnormalized Laplacian: D - A
D_half2 <- diag(sqrt(1/rowSums(Amat2))) # D^(-1/2)
normL2 <- D_half2 %*% Lmat2 %*% D_half2 # normalized Laplacian: D^(-1/2) L D^(-1/2)
laplacian_eigen2 <- eigen(normL2)
# Order eigenvectors by ascending eigenvalue
sorted_eigenvectors2 <- laplacian_eigen2$vectors[, order(laplacian_eigen2$values)]
kmeans_result2 <- kmeans(sorted_eigenvectors2[, 1:8], centers = 8)
############### plotting ##########################
# Create an igraph graph object from the adjacency matrix
graph1 <- graph_from_adjacency_matrix(Amat, mode = "undirected")
graph2 <- graph_from_adjacency_matrix(Amat2, mode = "undirected")
par(mfrow = c(1, 2)) # two plots side by side
plot(graph1, vertex.color = kmeans_result$cluster, main = 'set1')
plot(graph2, vertex.color = kmeans_result2$cluster, main = 'set2')
The problem is that the two plots are hard to compare, because the cluster colors differ between them:
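Where a cluster is the same in both sets, I would like it to have the same color in both plots, as happens with Z1 and A3, which cluster together in both datasets and are green in both. So, for example, N1 and A4 should share one color across the two plots, because they cluster together in both sets.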
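I tried reordering the kmeans cluster labels, but I could not get it right, and I do not want to scramble which cluster each variable is assigned to.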
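To make the goal concrete, here is a minimal sketch of one possible way to align the labels. It is not a verified solution. It assumes both label vectors list the variables in the same order and use integer labels starting at 1. `match_labels` is a hypothetical helper written in base R, not a function from igraph or stats. It greedily pairs each cluster in the second labelling with the cluster in the first that it overlaps most, and it changes only the label numbers, never which variables are grouped together.

```r
# Hypothetical helper (not part of igraph or stats): relabel `new` so that each
# of its clusters reuses the label of the cluster in `ref` it overlaps most.
# Assumes `ref` and `new` are integer label vectors over the same items,
# in the same order, with labels in 1..k.
match_labels <- function(ref, new) {
  k <- max(ref, new)
  # Contingency counts: rows = labels in `new`, columns = labels in `ref`
  overlap <- table(factor(new, levels = seq_len(k)),
                   factor(ref, levels = seq_len(k)))
  overlap <- matrix(as.numeric(overlap), nrow = k)
  mapping <- integer(k)
  for (i in seq_len(k)) {
    # Greedy step: take the largest remaining overlap ...
    idx <- which(overlap == max(overlap), arr.ind = TRUE)[1, ]
    mapping[idx[1]] <- idx[2]
    # ... and exclude that row and column from further matching
    overlap[idx[1], ] <- -1
    overlap[, idx[2]] <- -1
  }
  mapping[new]
}

# Toy check: the same partition with permuted label numbers
ref <- c(1, 1, 2, 2, 3, 3)
new <- c(3, 3, 1, 1, 2, 2)
aligned <- match_labels(ref, new)
print(aligned)
```

With the objects from the code above, this would be called as `aligned2 <- match_labels(kmeans_result$cluster, kmeans_result2$cluster)`, and `aligned2` would then be passed as `vertex.color` for `graph2`. Greedy matching is a simplification: it is not guaranteed to find the best overall pairing, and clusters that differ between the two sets still receive some leftover label.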