Identifying connected subnetworks constrained by an edge attribute

Question   Votes: 0   Answers: 4

I am working with network data and want to identify connected subnetworks. I loaded my data using

graph <- graph_from_data_frame(filtered_df, directed = FALSE)
and plotted my network

plot(graph) 

E(graph)$conflict_period
[1] "72_1"   "72_1"   "72_1"   "72_1"   "72_1"   "72_1"   "72_1"   "72_1"   "72_1"   "72_1"   "72_1"   "372_1"  "372_1"  "372_1" 
[15] "372_1"  "372_1"  "372_1"  "372_1"  "372_1"  "372_1"  "372_1"  "522_0"  "522_0"  "522_0"  "522_0"  "522_0"  "522_0"  "522_0" 
[29] "522_0"  "522_0"  "522_0"  "715_0"  "715_0"  "715_0"  "715_0"  "5390_0"

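So far, the subgroup information is stored only in the edges. A node belongs to every subgroup from which it receives an edge, so a node can belong to several subgroups. For example, Government of Mali belongs to subgroup "72_1" together with Civilians, and to subgroup "372_1" together with FIAA. I want to treat subgroups "72_1" and "372_1" as connected if at least two nodes belonging to "72_1" are linked by edges to at least two nodes belonging to "372_1". I tried approaches outside network analysis to identify this relationship, but failed, so I am asking for help here.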

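The desired output is a table listing the connected subgroups according to the criterion above. For this data it should be:

conflict_period connected
72_1 372_1, 522_0
372_1 72_1, 522_0
522_0 372_1, 72_1
715_0 NA
5390_0 NA

This is the data used: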

structure(list(side_a = c("Government of Mali", "Government of Mali", 
"Government of Mali", "Government of Mali", "Government of Mali", 
"Government of Mali", "Government of Mali", "Government of Mali", 
"Government of Mali", "Government of Mali", "Government of Mali", 
"Government of Mali", "Government of Mali", "Government of Mali", 
"Government of Mali", "Government of Mali", "Government of Mali", 
"Government of Mali", "Government of Mali", "Government of Mali", 
"Government of Mali", "FIAA", "FIAA", "FIAA", "FIAA", "FIAA", 
"FIAA", "FIAA", "FIAA", "FIAA", "FIAA", "MPGK", "MPGK", "MPGK", 
"MPGK", "ARLA, FIAA, FPLA"), side_b = c("Civilians", "Civilians", 
"Civilians", "Civilians", "Civilians", "Civilians", "Civilians", 
"Civilians", "Civilians", "Civilians", "Civilians", "FIAA", "FIAA", 
"FIAA", "FIAA", "FIAA", "FIAA", "FIAA", "FIAA", "FIAA", "FIAA", 
"Civilians", "Civilians", "Civilians", "Civilians", "Civilians", 
"Civilians", "Civilians", "Civilians", "Civilians", "Civilians", 
"Civilians", "Civilians", "Civilians", "Civilians", "MPA"), country = c("Mali", 
"Mali", "Mali", "Mali", "Mali", "Mali", "Mali", "Mali", "Mali", 
"Mali", "Mali", "Mali", "Mali", "Mali", "Mali", "Mali", "Mali", 
"Mali", "Mali", "Mali", "Mali", "Mali", "Mali", "Mali", "Mali", 
"Mali", "Mali", "Mali", "Mali", "Mali", "Mali", "Mali", "Mali", 
"Mali", "Mali", "Mali"), period_start = structure(c(765158400, 
765158400, 765158400, 765158400, 765158400, 765158400, 765158400, 
765158400, 765158400, 765158400, 765158400, 770256000, 770256000, 
770256000, 770256000, 770256000, 770256000, 770256000, 770256000, 
770256000, 770256000, 771552000, 771552000, 771552000, 771552000, 
771552000, 771552000, 771552000, 771552000, 771552000, 771552000, 
769910400, 769910400, 769910400, 769910400, 771206400), tzone = "UTC", class = c("POSIXct", 
"POSIXt")), period_end = structure(c(817862400, 817862400, 817862400, 
817862400, 817862400, 817862400, 817862400, 817862400, 817862400, 
817862400, 817862400, 819158400, 819158400, 819158400, 819158400, 
819158400, 819158400, 819158400, 819158400, 819158400, 819158400, 
816739200, 816739200, 816739200, 816739200, 816739200, 816739200, 
816739200, 816739200, 816739200, 816739200, 814492800, 814492800, 
814492800, 814492800, 802828800), tzone = "UTC", class = c("POSIXct", 
"POSIXt")), conflict_period = c("72_1", "72_1", "72_1", "72_1", 
"72_1", "72_1", "72_1", "72_1", "72_1", "72_1", "72_1", "372_1", 
"372_1", "372_1", "372_1", "372_1", "372_1", "372_1", "372_1", 
"372_1", "372_1", "522_0", "522_0", "522_0", "522_0", "522_0", 
"522_0", "522_0", "522_0", "522_0", "522_0", "715_0", "715_0", 
"715_0", "715_0", "5390_0")), row.names = c(NA, -36L), class = c("tbl_df", 
"tbl", "data.frame"))
r algorithm cluster-analysis igraph network-analysis
4 Answers
3 votes

Perhaps you should use decompose() to separate the subgraphs and then retrieve the "subgroup" information from the edges, e.g.

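library(igraph)
library(dplyr)
library(purrr)   # provides map()
# df is the data frame shown in the question
df %>%
    graph_from_data_frame(directed = FALSE) %>%
    decompose() %>%
    map(\(x) unique(E(x)$conflict_period))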

Output

[[1]]
[1] "72_1"  "372_1" "522_0" "715_0"

[[2]]
[1] "5390_0"

2 votes

Based on your latest update, here is an option that produces your desired output

library(igraph)
library(dplyr)

# simplify the dataframe and generate a graph
g <- df %>%
    select(matches("^side|conflict")) %>%
    distinct() %>%
    graph_from_data_frame(directed = FALSE)

# repeat pruning the graph and retaining the subgraph(s) such that each node has the degree >= 2
gh <- g
repeat {
    d <- degree(gh)
    if (min(d) >= 2) break
    gh <- induced_subgraph(gh, V(gh)[d >= 2])
}

# retrieve the conflict periods (edge attribute) of each remaining cluster
clt <- lapply(decompose(gh), \(x) E(x)$conflict_period)

# for each conflict_period, list the other periods of any cluster that contains it
out <- sapply(
    E(g)$conflict_period,
    \(p) {
        lapply(clt, \(q) {
            if (p %in% q) {
                setdiff(q, p)
            } else {
                NA
            }
        })
    }
)

# produce the desired output
res <- within(
    data.frame(conflict_period = names(out)),
    connected <- out
)

which gives

> res
  conflict_period    connected
1            72_1 372_1, 522_0
2           372_1  72_1, 522_0
3           522_0  372_1, 72_1
4           715_0           NA
5          5390_0           NA
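
The pruning loop above repeatedly drops vertices of degree below 2, which leaves what graph theory calls the 2-core. As a minimal sketch, assuming the edge list has already been reduced to one row per pair, the same subgraph could probably be obtained with igraph's coreness(). The `edges` data frame below is a hand-written stand-in for that reduced list, and the names `core`, `clusters` and `connected` are illustrative only, not part of the answer.

```r
library(igraph)

# Assumption: hand-written stand-in for the de-duplicated edge list
edges <- data.frame(
  side_a = c("Government of Mali", "Government of Mali", "FIAA",
             "MPGK", "ARLA, FIAA, FPLA"),
  side_b = c("Civilians", "FIAA", "Civilians", "Civilians", "MPA"),
  conflict_period = c("72_1", "372_1", "522_0", "715_0", "5390_0")
)
g <- graph_from_data_frame(edges, directed = FALSE)

# Keep only vertices with coreness >= 2, i.e. the 2-core of the graph
core <- induced_subgraph(g, V(g)[coreness(g) >= 2])

# Conflict periods found in each connected piece of the 2-core
clusters <- lapply(decompose(core), function(x) E(x)$conflict_period)

# For every period, collect the other periods sharing a cluster with it
periods <- unique(edges$conflict_period)
connected <- vapply(periods, function(p) {
  hit <- Filter(function(q) p %in% q, clusters)
  if (length(hit) == 0) return(NA_character_)
  paste(setdiff(unique(unlist(hit)), p), collapse = ", ")
}, character(1))

res <- data.frame(conflict_period = periods, connected = unname(connected))
res
```

Storing `connected` as a plain character column instead of a list column makes the table easier to export, at the cost of having to split the string again if the individual periods are needed later.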

0 votes

Use igraph::groups() to find the connected clusters/subnetworks:

filtered_df <- structure(list(side_a = c(
  "Government of Mali", "Government of Mali",
  "Government of Mali", "Government of Mali", "Government of Mali",
  "Government of Mali", "Government of Mali", "Government of Mali",
  "Government of Mali", "Government of Mali", "Government of Mali",
  "Government of Mali", "Government of Mali", "Government of Mali",
  "Government of Mali", "Government of Mali", "Government of Mali",
  "Government of Mali", "Government of Mali", "Government of Mali",
  "Government of Mali", "FIAA", "FIAA", "FIAA", "FIAA", "FIAA",
  "FIAA", "FIAA", "FIAA", "FIAA", "FIAA", "MPGK", "MPGK", "MPGK",
  "MPGK", "ARLA, FIAA, FPLA"
), side_b = c(
  "Civilians", "Civilians",
  "Civilians", "Civilians", "Civilians", "Civilians", "Civilians",
  "Civilians", "Civilians", "Civilians", "Civilians", "FIAA", "FIAA",
  "FIAA", "FIAA", "FIAA", "FIAA", "FIAA", "FIAA", "FIAA", "FIAA",
  "Civilians", "Civilians", "Civilians", "Civilians", "Civilians",
  "Civilians", "Civilians", "Civilians", "Civilians", "Civilians",
  "Civilians", "Civilians", "Civilians", "Civilians", "MPA"
), country = c(
  "Mali",
  "Mali", "Mali", "Mali", "Mali", "Mali", "Mali", "Mali", "Mali",
  "Mali", "Mali", "Mali", "Mali", "Mali", "Mali", "Mali", "Mali",
  "Mali", "Mali", "Mali", "Mali", "Mali", "Mali", "Mali", "Mali",
  "Mali", "Mali", "Mali", "Mali", "Mali", "Mali", "Mali", "Mali",
  "Mali", "Mali", "Mali"
), period_start = structure(c(
  765158400,
  765158400, 765158400, 765158400, 765158400, 765158400, 765158400,
  765158400, 765158400, 765158400, 765158400, 770256000, 770256000,
  770256000, 770256000, 770256000, 770256000, 770256000, 770256000,
  770256000, 770256000, 771552000, 771552000, 771552000, 771552000,
  771552000, 771552000, 771552000, 771552000, 771552000, 771552000,
  769910400, 769910400, 769910400, 769910400, 771206400
), tzone = "UTC", class = c(
  "POSIXct",
  "POSIXt"
)), period_end = structure(c(
  817862400, 817862400, 817862400,
  817862400, 817862400, 817862400, 817862400, 817862400, 817862400,
  817862400, 817862400, 819158400, 819158400, 819158400, 819158400,
  819158400, 819158400, 819158400, 819158400, 819158400, 819158400,
  816739200, 816739200, 816739200, 816739200, 816739200, 816739200,
  816739200, 816739200, 816739200, 816739200, 814492800, 814492800,
  814492800, 814492800, 802828800
), tzone = "UTC", class = c(
  "POSIXct",
  "POSIXt"
)), conflict_period = c(
  "72_1", "72_1", "72_1", "72_1",
  "72_1", "72_1", "72_1", "72_1", "72_1", "72_1", "72_1", "372_1",
  "372_1", "372_1", "372_1", "372_1", "372_1", "372_1", "372_1",
  "372_1", "372_1", "522_0", "522_0", "522_0", "522_0", "522_0",
  "522_0", "522_0", "522_0", "522_0", "522_0", "715_0", "715_0",
  "715_0", "715_0", "5390_0"
)), row.names = c(NA, -36L), class = c(
  "tbl_df",
  "tbl", "data.frame"
))

graph <- igraph::graph_from_data_frame(filtered_df, directed = FALSE)

plot(graph)


graph |>
  igraph::cluster_optimal() |>
  igraph::groups()
#> $`1`
#> [1] "Government of Mali" "FIAA"               "MPGK"              
#> [4] "Civilians"         
#> 
#> $`2`
#> [1] "ARLA, FIAA, FPLA" "MPA"

Created on 2024-02-29 with reprex v2.1.0
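
Note that cluster_optimal() is a community-detection routine (it maximizes modularity) and needs an igraph build with GLPK support. If only plain connected components are wanted, components() may be enough, since groups() also accepts its result. A small sketch, using a hand-written one-row-per-pair stand-in for the data (the `edges` and `comp` names are assumptions, not from the answer):

```r
library(igraph)

# Assumption: hand-written stand-in with one row per distinct pair
edges <- data.frame(
  side_a = c("Government of Mali", "Government of Mali", "FIAA",
             "MPGK", "ARLA, FIAA, FPLA"),
  side_b = c("Civilians", "FIAA", "Civilians", "Civilians", "MPA"),
  conflict_period = c("72_1", "372_1", "522_0", "715_0", "5390_0")
)
g <- graph_from_data_frame(edges, directed = FALSE)

# Connected components, with no community detection involved
comp <- components(g)

# Vertex names grouped by component
groups(comp)
```

On this data both approaches should split the vertices into the same two groups, but on denser graphs modularity-based communities and connected components can differ.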


0 votes

Cliques

# Sanitize input.
se <- E(graph)[which(count_multiple(graph) == 1)]   # Edges that occur only once.
se$conflict_period                                  # Show the conflicts to be deleted.
# [1] "5390_0"
g2 <- delete_edges(graph, se)                       # Keep only repeated edges.
g3 <- delete_vertices(g2, which(degree(g2) == 0))   # Remove isolates.
g4 <- simplify(g3, edge.attr.comb = "first")        # Collapse repeated edges.
E(g4)$label <- E(g4)$conflict_period                # Show edge labels in plot().
plot(g4)

Observe that any 3-cycle a, b, c gives rise to:

conflict_period  | connected
a                | b, c
b                | a, c
c                | a, b

In this small example, we can read the solution directly off the plot.

# Show cliques.
cq  <- max_cliques(g4, min = 2, max = 3)    # Find all maximal cliques of size 2 or 3.
tri <- list()                               # Collect the conflict periods of each clique.
for (v in cq) tri[[length(tri) + 1]] <- E(induced_subgraph(g4, v))$conflict_period
tri
## [[1]]
## [1] "715_0"
## 
## [[2]]
## [1] "372_1" "72_1"  "522_0"
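
The a, b, c pattern above can be turned into the requested table mechanically. The following is only a sketch: `tri` is re-typed by hand from the printed output so that the block runs on its own, and `rows` and `res` are illustrative names. A clique that carries a single conflict period has no partners and gets NA.

```r
# Assumption: tri re-typed from the output shown above
tri <- list("715_0", c("372_1", "72_1", "522_0"))

# One small data frame per clique: each period paired with the others
rows <- lapply(tri, function(p) {
  data.frame(
    conflict_period = p,
    connected = vapply(p, function(x) {
      others <- setdiff(p, x)
      if (length(others) == 0) NA_character_ else paste(others, collapse = ", ")
    }, character(1)),
    row.names = NULL
  )
})

res <- do.call(rbind, rows)
res
```

Periods removed during sanitizing (here "5390_0") do not appear in `tri`, so they would have to be appended with NA separately if the full table is needed.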