How do I remove children of children from a data frame in order to plot a network in R?

Question · Votes: 1 · Answers: 2

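I want to plot a network with R. The problem is that my data frame (which holds the relational data) also contains the children of children, i.e. edges from a node straight to its grandchildren. It looks like this: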

parent <- c("A","A","A","B","B","E")
child <-  c("B","C","D","C","D","D")
df <- data.frame(parent,child)

I want to remove the children of children from df so that I can plot my network with igraph. Basically, I want my data to look like df_net:

library(igraph)

parent <- c("A","B","B","E")
child <-  c("B","C","D","D")
df_net <- data.frame(parent,child)
net <- graph_from_data_frame(df_net,directed = TRUE)
plot(net)

Figure of plot(net)

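What is the best way to (automatically) remove the unnecessary rows of df? (I have several data frames, each with up to 100 rows, so deleting rows by hand is not an option.)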

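My first idea was to use a while loop to find the parents at each level of the hierarchy, and I thought I could then filter the rows of df. But I don't think I am on the right track. Any ideas are appreciated!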

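library(dplyr)  # provides %>% and filter()

`%notin%` <- Negate(`%in%`)
i <- nrow(df)
y <- list()  # root nodes found at each level
z <- list()  # rows remaining after each level is stripped
j <- 1
while (i > 0) {
  v <- unique(df$parent[!(df$parent %in% df$child)]) # find mismatch (only in parent, not in child)
  if (length(v) == 0) break # no root left (e.g. a cycle), so stop instead of looping forever
  df <- df %>% filter(parent %notin% v)
  print(v)
  y[[j]] <- v
  z[[j]] <- df
  i <- nrow(df)
  j <- j + 1
}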
r igraph
2 Answers

Votes: 1

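How about using igraph to find the paths between each parent and child, and dropping the connection if there is more than one path between them?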

library(igraph)
parent <- c("A","A","A","B","B","E")
child <-  c("B","C","D","C","D","D")
df <- data.frame(parent,child)
net <- graph_from_data_frame(df,directed = T)
df <- df[apply(df,1,function(x){length(all_simple_paths(net,x[1],x[2]))}) == 1,]
df
  parent child
1      A     B
4      B     C
5      B     D
6      E     D

I worry that this could be slow on very large graphs, so if anyone has a data.table solution, that might be better.
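
Regarding the speed concern above: enumerating every simple path can become expensive, so one possible alternative is to test reachability only. The sketch below is not a data.table solution and is not part of the original answer. It assumes the graph has no cycles, and `is_shortcut` is a hypothetical helper name introduced here. An edge parent -> child is treated as redundant when the child can also be reached through one of the parent's other children, using igraph's `subcomponent()`.

```r
library(igraph)

parent <- c("A","A","A","B","B","E")
child  <- c("B","C","D","C","D","D")
df <- data.frame(parent, child, stringsAsFactors = FALSE)
net <- graph_from_data_frame(df, directed = TRUE)

# Hypothetical helper (assumes an acyclic graph):
# TRUE if `to` is reachable from `from` via another direct child of `from`
is_shortcut <- function(from, to) {
  other_children <- setdiff(df$child[df$parent == from], to)
  any(vapply(other_children, function(w) {
    # all vertices reachable from w along outgoing edges (w itself included)
    to %in% as_ids(subcomponent(net, w, mode = "out"))
  }, logical(1)))
}

# Keep only the edges that are not shortcuts
keep <- !mapply(is_shortcut, df$parent, df$child, USE.NAMES = FALSE)
df_net <- df[keep, ]
df_net
```

For the example data this keeps the same four rows as the answer above (A-B, B-C, B-D, E-D). Whether it is actually faster on a given graph would need to be measured.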


Votes: 0

Here is a dplyr solution:

library(dplyr)

# get data
parent <- c("A","A","A","B","B","E")
child <-  c("B","C","D","C","D","D")
df <- data.frame(parent,child, stringsAsFactors = FALSE)

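# remove rows that are not directly related:
# the self-join lists every parent -> grandchild pair, and anti_join drops those pairs from df
new_df <- anti_join(
  df,
  left_join(df, df, by = c("child" = "parent")) %>%
    select(parent, child = child.y) %>%
    na.omit(),
  by = c("parent", "child")
)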

new_df
  parent child
1      A     B
2      B     C
3      B     D
4      E     D
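
One thing to keep in mind with the join above: a single self-join only finds parent -> grandchild pairs, so a shortcut that skips more than one level would stay in the data. The following sketch illustrates this with made-up data (the `chain` data frame is hypothetical and not from the question): a chain A -> B -> C -> D plus a three-step shortcut A -> D.

```r
library(dplyr)

# Hypothetical data: chain A -> B -> C -> D plus the shortcut A -> D
chain <- data.frame(parent = c("A","B","C","A"),
                    child  = c("B","C","D","D"),
                    stringsAsFactors = FALSE)

# Grandparent -> grandchild pairs found by one self-join (here A-C and B-D)
two_step <- inner_join(chain, chain, by = c("child" = "parent")) %>%
  select(parent, child = child.y)

# A -> D is three steps long, so it is not among the two-step pairs
# and anti_join leaves it in place
remaining <- anti_join(chain, two_step, by = c("parent", "child"))
remaining
```

If the data can contain such deeper shortcuts, a path-based or reachability-based check may be the safer choice.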