Identify only non-duplicated rows

Question  Votes: 2  Answers: 1

I have a dataset with many duplicated rows, and I want to isolate only the non-duplicated values. My df looks like this:

df <- data.frame("group" = c("A", "A", "A","A","A","B","B","B"), 
                    "id" = c("id1", "id2", "id3", "id1", "id2","id1","id2","id1"), 
                    "Val" = c(10,10,10,10,10,12,12,12))

I want to extract only the rows that have no duplicates, i.e. my final dataset should look like this:

final <- data.frame("group" = c("A","B"), 
                 "id" = c("id3","id2"), 
                 "Val" = c(10,12))

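Note that I am not interested in finding the unique values, but the non-duplicated ones. I know how to find unique values; for example, df %>% distinct() would do the job. It is isolating the rows that occur only once that I am struggling with.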

r unique rows data-manipulation
1 Answer
4
votes

Here is one option.

library(dplyr)
df %>%
   group_by(group) %>% 
   filter(!(duplicated(id)|duplicated(id, fromLast = TRUE)))
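To see why the filter needs both calls, here is a small sketch on the id values of group A alone (the names ids, fwd and bwd are introduced only for illustration). duplicated() flags just the second and later occurrences of a value, so the fromLast = TRUE pass is what catches the first occurrences.

```r
# id values of group A, copied from the question's df
ids <- c("id1", "id2", "id3", "id1", "id2")

fwd <- duplicated(ids)                   # flags later occurrences: positions 4 and 5
bwd <- duplicated(ids, fromLast = TRUE)  # flags earlier occurrences: positions 1 and 2

# keep only the values flagged in neither direction
ids[!(fwd | bwd)]
# "id3"
```

Using duplicated(ids) on its own would keep the first "id1" and "id2", which is the distinct() behaviour the question wants to avoid.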

Or using dplyr alone

df %>% 
     group_by_all %>%
     filter(n() ==1)

Or using base R

df[!(duplicated(df[1:2])|duplicated(df[1:2], fromLast = TRUE)),]
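As a quick check, the base R approach can be run against the sample data from the question and compared with the expected result. This is only a sketch: the names key, res and expected are introduced here for illustration, the key columns are selected by name rather than by position, and stringsAsFactors = FALSE is set explicitly so the comparison behaves the same on older R versions.

```r
# sample data from the question
df <- data.frame(group = c("A", "A", "A", "A", "A", "B", "B", "B"),
                 id    = c("id1", "id2", "id3", "id1", "id2", "id1", "id2", "id1"),
                 Val   = c(10, 10, 10, 10, 10, 12, 12, 12),
                 stringsAsFactors = FALSE)

# expected result from the question
expected <- data.frame(group = c("A", "B"),
                       id    = c("id3", "id2"),
                       Val   = c(10, 12),
                       stringsAsFactors = FALSE)

# columns that define a duplicate
key <- df[c("group", "id")]

# drop every row whose key occurs more than once, in either direction
res <- df[!(duplicated(key) | duplicated(key, fromLast = TRUE)), ]

# subsetting keeps the original row names (3 and 7), so reset them before comparing
rownames(res) <- NULL

all.equal(res, expected)
```

Selecting the key columns by name makes the check independent of column order; df[1:2] in the answer does the same thing for this particular df.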