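I want to plot a network with R. The problem is that my data frame (which holds the relational data) also contains the children of children. It looks like this: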
parent <- c("A","A","A","B","B","E")
child <- c("B","C","D","C","D","D")
df <- data.frame(parent,child)
I want to remove the children of children from df so that I can plot my network with igraph. So basically, I want my data to look like df_net:
parent <- c("A","B","B","E")
child <- c("B","C","D","D")
df_net <- data.frame(parent,child)
library(igraph)
net <- graph_from_data_frame(df_net, directed = TRUE)
plot(net)
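What is the best way to (automatically) remove the unnecessary rows of df? (I have several data frames with up to 100 rows each, so removing rows by hand is not an option.)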
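My first idea was to use a while loop to find the parents at each level of the hierarchy. I thought we could then filter the rows of df. But I don't think I am on the right track. Any ideas are appreciated!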
library(dplyr)  # provides %>% and filter()

`%notin%` <- Negate(`%in%`)
i <- nrow(df)
y <- list()  # parents found at each level
z <- list()  # rows still left after each level
j <- 1
while (i > 0) {
  # nodes that occur only as parent, never as child
  v <- unique(df$parent[!(df$parent %in% df$child)])
  df <- df %>% filter(parent %notin% v)
  print(v)
  y[[j]] <- v
  z[[j]] <- df
  i <- nrow(df)
  j <- j + 1
}
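How about using igraph to find the paths between each pair of individuals and dropping a connection whenever there is more than one path?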
library(igraph)
parent <- c("A","A","A","B","B","E")
child <- c("B","C","D","C","D","D")
df <- data.frame(parent,child)
net <- graph_from_data_frame(df, directed = TRUE)
# keep an edge only if it is the sole simple path from its parent to its child
df <- df[apply(df, 1, function(x) length(all_simple_paths(net, x[1], x[2]))) == 1, ]
df
  parent child
1      A     B
4      B     C
5      B     D
6      E     D
I worry that this might be slow on very large graphs, so if anyone has a data.table solution, that might be better.
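To address the speed concern without enumerating every simple path, here is a minimal sketch in base R (not data.table). It assumes the graph has no cycles and that the node count is small enough for a dense n x n matrix. The helper name remove_shortcuts is made up for illustration. The idea: an edge parent -> child is redundant when some other direct child of the parent can also reach that child.

```r
# Hypothetical helper: drop edges that are implied by a longer path.
# Assumes an acyclic graph; in a cycle, edges would be flagged as redundant.
remove_shortcuts <- function(edges) {
  nodes <- unique(c(as.character(edges$parent), as.character(edges$child)))
  n <- length(nodes)
  from <- match(as.character(edges$parent), nodes)
  to <- match(as.character(edges$child), nodes)

  # adj[i, j] is 1 when there is a direct edge i -> j
  adj <- matrix(0, n, n)
  adj[cbind(from, to)] <- 1

  # reach[i, j] becomes 1 when j can be reached from i in one or more steps
  reach <- adj
  repeat {
    longer <- ((reach + reach %*% adj) > 0) * 1
    if (all(longer == reach)) break
    reach <- longer
  }

  # i -> j is a shortcut when a direct child of i can itself reach j
  shortcut <- (adj %*% reach) > 0
  edges[!shortcut[cbind(from, to)], , drop = FALSE]
}
```

With the example data from the question, remove_shortcuts(df) keeps rows 1, 4, 5 and 6, the same rows as the all_simple_paths approach.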
Here is a dplyr solution:
library(dplyr)
# get data
parent <- c("A","A","A","B","B","E")
child <- c("B","C","D","C","D","D")
df <- data.frame(parent,child, stringsAsFactors = FALSE)
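# remove rows that are not directly related:
# the self-join pairs each parent with its grandchildren,
# and anti_join drops every edge that matches such a pair
new_df <- anti_join(df,
                    left_join(df, df, by = c("child" = "parent")) %>%
                      select(parent, child = child.y) %>%
                      na.omit(),
                    by = c("parent", "child"))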
new_df
  parent child
1      A     B
2      B     C
3      B     D
4      E     D
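For anyone without dplyr, the same self-join idea can be sketched in base R with merge(). The column names mid and grandchild and the helper key() are made up for illustration, and the sketch assumes node names never contain a tab character.

```r
df <- data.frame(parent = c("A","A","A","B","B","E"),
                 child  = c("B","C","D","C","D","D"),
                 stringsAsFactors = FALSE)

# view each edge twice: as parent -> mid and as mid -> grandchild
first  <- data.frame(parent = df$parent, mid = df$child,
                     stringsAsFactors = FALSE)
second <- data.frame(mid = df$parent, grandchild = df$child,
                     stringsAsFactors = FALSE)

# every two-step path parent -> mid -> grandchild
two_step <- merge(first, second, by = "mid")

# hypothetical helper: one string per edge so edges can be compared with %in%
key <- function(a, b) paste(a, b, sep = "\t")

# an edge is a shortcut when it connects a parent directly to a grandchild
is_shortcut <- key(df$parent, df$child) %in%
  key(two_step$parent, two_step$grandchild)
new_df <- df[!is_shortcut, ]
```

Note that a single self-join only sees two-step paths. In a chain A -> B -> C -> D with an extra edge A -> D, that edge would be kept, so deeper hierarchies need either repeated joins or a reachability-based approach.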