How do I get unique combined groups in R?

Question

I am grouping some data, for example entity data. I have already found groups based on some entity attributes, as shown below:

df <- data.frame(uniq_index.x = c(1426, 1426, 1426, 1426, 7796, 7796, 7796, 7796, 
                                  7159, 7159, 7159, 7159, 7857, 7857, 7857, 7857,
                                  7158, 7158, 7158, 7158, 5440, 9861, 1641, 8685,
                                  1644, 7525, 6030, 5672), 
                 uniq_index.y = c(7796, 7159, 7857, 7158, 1426, 7159, 7857, 7158,
                                  1426, 7796, 7857, 7158, 1426, 7796, 7159, 7158,
                                  1426, 7796, 7159, 7857, 9861, 5440, 8685, 1641,
                                  7525, 1644, 5673, 6030)
                 )

# grouping (the pipe and group_* functions come from dplyr)
library(dplyr)

a <- df %>% 
  group_by(uniq_index.x) %>% 
  group_split()

From the data above, 1426, 7796, 7159, 7857, and 7158 should be in the same group; 5672, 5673, and 6030 should be in another group. I can use group_by and group_split to do the grouping.

However, because this produces duplicate groups, I use the following code to get the unique groups:

# initialize an empty data frame
b <- data.frame(V1 = character())
# loop through a (which is obtained from group_split)
for (i in 1:length(a)) {
  x <- a[[i]][,1]
  y <- a[[i]][,2]
  x <- x %>% 
    mutate(uniq_index = uniq_index.x) %>% 
    select(uniq_index)
  y <- y %>% 
    mutate(uniq_index = uniq_index.y) %>%
    select(uniq_index)
  z <- unique(x) %>% 
    rbind(y) %>% 
    arrange(uniq_index)
  b <- b %>% 
    rbind(paste(z))
}
  
# unique groups
b <- b %>% 
  unique() %>% 
  mutate(
    uniq_agency_id = 100000 + 1:nrow(unique(b))
  )

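Then I noticed this problem:

As in the sample data, (6030, 5672) and (5673, 6030) end up as two separate groups. These two groups should be merged into one larger group.

I am struggling to come up with logic that yields the combined unique groups.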

Answer 1

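Solutions to this problem can be found all over this site. Here is one approach using igraph, which treats each row as an edge between two IDs and returns the connected components: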

igraph::components(igraph::graph_from_data_frame(df,FALSE))$membership
1426 7796 7159 7857 7158 5440 9861 1641 8685 1644 7525 6030 5672 5673 
   1    1    1    1    1    2    2    3    3    4    4    5    5    5
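
To turn that membership vector into the group IDs the question asks for, the following is a minimal sketch rather than a definitive implementation. It assumes the same df as in the question and reuses the question's 100000 + n numbering for uniq_agency_id; the names g, memb, and groups are introduced here only for illustration.

```r
library(igraph)

df <- data.frame(
  uniq_index.x = c(1426, 1426, 1426, 1426, 7796, 7796, 7796, 7796,
                   7159, 7159, 7159, 7159, 7857, 7857, 7857, 7857,
                   7158, 7158, 7158, 7158, 5440, 9861, 1641, 8685,
                   1644, 7525, 6030, 5672),
  uniq_index.y = c(7796, 7159, 7857, 7158, 1426, 7159, 7857, 7158,
                   1426, 7796, 7857, 7158, 1426, 7796, 7159, 7158,
                   1426, 7796, 7159, 7857, 9861, 5440, 8685, 1641,
                   7525, 1644, 5673, 6030)
)

# Each row is treated as an undirected edge between two IDs
g <- graph_from_data_frame(df, directed = FALSE)

# Named vector: names are the IDs (as character), values are component numbers
memb <- components(g)$membership

# One row per ID with its group ID (numbering scheme borrowed from the question)
groups <- data.frame(
  uniq_index     = as.numeric(names(memb)),
  uniq_agency_id = 100000 + as.integer(memb)
)

# Attach the group ID to every original row via uniq_index.x
df$uniq_agency_id <- groups$uniq_agency_id[match(df$uniq_index.x, groups$uniq_index)]

# List the members of each combined group
split(groups$uniq_index, groups$uniq_agency_id)
```

Because the components are computed on the whole graph, 5672, 5673, and 6030 land in the same group even though no single row contains all three.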
