I have the following data frame in R:
friendships <- structure(list(person = c("person3", "person10", "person2", "person6",
"person4", "person6", "person10", "person5", "person3", "person9",
"person9", "person9", "person3", "person8", "person10", "person7",
"person10"),
friend = c("person9", "person4", "person1", "person7",
"person10", "person7", "person9", "person10", "person7", "person5",
"person7", "person5", "person6", "person9", "person2", "person5",
"person8")),
row.names = c(1L, 3L, 4L, 5L, 7L, 8L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L),
class = "data.frame")
For person10, I want to find the names and the number of neighbors within 2 levels.
In igraph, I can do this as follows:
library(igraph)
g <- graph_from_data_frame(friendships, directed = FALSE)
#names of neighbors at degree=2
> ego(g, 2, "person10")
[[1]]
+ 9/10 vertices, named, from 6632700:
[1] person10 person2 person4 person5 person9 person8 person1 person7 person3
#number of neighbors at degree=2
> length(unlist(ego(g, 2, "person10")))
[1] 9
My question: now I want to do this using only SQL code and R functions/loops (i.e., without igraph).
Here is my attempt:
library(sqldf)
#first locate neighbors of person10
friends <- sqldf("SELECT friend FROM friendships WHERE person = 'person10'")
# next, initialize the list of friends with degree 2
friends_degree2 <- friends
From here, I tried to write a loop:
while (TRUE) {
new_friends <- sqldf(paste0("SELECT friend FROM friendships WHERE person IN ('", paste(friends$friend, collapse="','"), "')"))
new_friends <- unique(new_friends)
if (all(new_friends$friend %in% friends_degree2$friend)) {
break
}
friends_degree2 <- unique(rbind(friends_degree2, new_friends))
friends <- new_friends
}
From here, we can check the answers:
> print(nrow(friends_degree2))
[1] 8
>
> print(friends_degree2)
friend
1 person4
2 person9
3 person2
4 person8
11 person1
21 person10
31 person5
41 person7
However, person3 is missing from this list.
Can someone tell me how to fix this?
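The problem here is the direction of the edges. In your data frame, edges go from person to friend, but not from friend back to person. In igraph, this is handled by passing directed = FALSE.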
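person3 has 3 edges in the graph, but they are all "out" edges: person3 never appears in the second vector of the list (the friend column). To make your graph undirected, you have to include a reverse edge for every pair. This can be done as follows: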
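# rbind() matches data frame columns by name, so rename the swapped columns first
friendships <- rbind(friendships, setNames(friendships[, c(2, 1)], names(friendships)))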
The code above simply appends a set of reverse edges, one from each friend back to the corresponding person.
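Note that the loop in the question keeps expanding until no new names turn up, so once the edges are symmetric it would return everyone reachable from person10 rather than only two levels. Below is a minimal sketch of a version that stops after a fixed number of steps. The helper name neighbors_within and its steps argument are made up for illustration, the reverse edges are added with a SQL UNION instead of rbind(), and the start person is kept in the result because ego() includes it too.

```r
library(sqldf)

# Same edges as the data frame in the question (row names omitted).
friendships <- data.frame(
  person = c("person3", "person10", "person2", "person6", "person4", "person6",
             "person10", "person5", "person3", "person9", "person9", "person9",
             "person3", "person8", "person10", "person7", "person10"),
  friend = c("person9", "person4", "person1", "person7", "person10", "person7",
             "person9", "person10", "person7", "person5", "person7", "person5",
             "person6", "person9", "person2", "person5", "person8"),
  stringsAsFactors = FALSE
)

# Add the reverse of every edge in SQL; UNION also drops duplicate rows.
edges <- sqldf("SELECT person, friend FROM friendships
                UNION
                SELECT friend AS person, person AS friend FROM friendships")

# Hypothetical helper: everyone within `steps` hops of `start`, start included.
# Names are pasted straight into the query, so they must not contain quotes.
neighbors_within <- function(edges, start, steps = 2) {
  reached  <- start   # everyone found so far
  frontier <- start   # names that were new in the previous step
  for (i in seq_len(steps)) {
    in_list <- paste0("'", frontier, "'", collapse = ", ")
    found <- sqldf(paste0(
      "SELECT DISTINCT friend FROM edges WHERE person IN (", in_list, ")"
    ))$friend
    frontier <- setdiff(found, reached)  # keep only names not seen before
    if (length(frontier) == 0) break     # nothing new, stop early
    reached <- c(reached, frontier)
  }
  reached
}

result <- neighbors_within(edges, "person10", steps = 2)
print(sort(result))
print(length(result))  # 9, the same count as ego(g, 2, "person10")
```

With steps = 2 the result contains person3 and leaves out person6, which is three hops away from person10.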