How to apply a conditional filter based on group size?


I want to filter conditionally based on group size.

Suppose I have a data frame that looks like this:

data1 <- data.frame(
  ID = c(1, 1, 1, 3, 3, 5, 6),
  town = c("Town A", "Town A", "Town B", "Town A", "Town C", "Town B", "Town A"),
  place = c("A", "B", "A", "B", "C", "A", "B"),
  place1 = c("A", "c", "A", "B", "C", "A", "D"),
  test = c("G", "B", "A", "B", "C", "A", "B"),
  test1 = c("G", "B", "A", "B", "d", "A", "B")
)

I want to keep one town per ID. Rows should first be filtered on place == place1, and if the group still has more than one row after that, filtered on test == test1.

I tried something like this:

data1 %>%
  group_by(ID) %>%
  filter(if (n() >= 2) place == place1 else test == test1) %>%
  filter(n() == 1) %>%
  ungroup()

But the if/else does not work: groups 1 and 3 are lost.
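
To see why those groups disappear, it can help to count the rows left in each group after the first filter. The sketch below re-creates the question's data and reuses the attempted filter; the names `remaining` and `rows_left` are introduced here purely for illustration.

```r
library(dplyr)

# Sample data from the question
data1 <- data.frame(
  ID = c(1, 1, 1, 3, 3, 5, 6),
  town = c("Town A", "Town A", "Town B", "Town A", "Town C", "Town B", "Town A"),
  place = c("A", "B", "A", "B", "C", "A", "B"),
  place1 = c("A", "c", "A", "B", "C", "A", "D"),
  test = c("G", "B", "A", "B", "C", "A", "B"),
  test1 = c("G", "B", "A", "B", "d", "A", "B")
)

# Count the rows each group still has after the first filter of the attempt
remaining <- data1 %>%
  group_by(ID) %>%
  filter(if (n() >= 2) place == place1 else test == test1) %>%
  summarise(rows_left = n())

remaining
# rows_left is 2, 2, 1, 1 for IDs 1, 3, 5, 6
```

The `if` is evaluated once per group against the original group size, so the test == test1 condition is never applied to groups that started with two or more rows. Groups 1 and 3 still hold two rows each after the first filter, and the following filter(n() == 1) then drops them entirely.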

r dplyr filter conditional-statements
1 Answer

Sort the data by the conditions (in descending order, so that TRUE comes before FALSE), then slice one row per group:

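library(dplyr)  # the .by argument of slice() needs dplyr >= 1.1.0

data1 |>
  # Within each ID, rows satisfying place == place1 come first;
  # ties are broken by test == test1
  arrange(ID, desc(place == place1), desc(test == test1)) |>
  # Keep the best-ranked row of each ID
  slice(1, .by = ID)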
#   ID   town place place1 test test1
# 1  1 Town A     A      A    G     G
# 2  3 Town A     B      B    B     B
# 3  5 Town B     A      A    A     A
# 4  6 Town A     B      D    B     B
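
For comparison, the step-by-step logic described in the question could also be written with grouped filters. This is only a sketch under stated assumptions, not the answer's method: it assumes single-row groups should always be kept, and it uses slice(1) as a final tie-breaker when several rows pass both conditions. The name `result` is introduced here for illustration.

```r
library(dplyr)

# Sample data from the question
data1 <- data.frame(
  ID = c(1, 1, 1, 3, 3, 5, 6),
  town = c("Town A", "Town A", "Town B", "Town A", "Town C", "Town B", "Town A"),
  place = c("A", "B", "A", "B", "C", "A", "B"),
  place1 = c("A", "c", "A", "B", "C", "A", "D"),
  test = c("G", "B", "A", "B", "C", "A", "B"),
  test1 = c("G", "B", "A", "B", "d", "A", "B")
)

result <- data1 %>%
  group_by(ID) %>%
  # Step 1: in groups with more than one row, keep rows where place == place1
  filter(n() == 1 | place == place1) %>%
  # Step 2: n() is re-evaluated here, so groups that still have more than
  # one row are filtered again on test == test1
  filter(n() == 1 | test == test1) %>%
  # Assumed tie-breaker: take the first remaining row of each group
  slice(1) %>%
  ungroup()

result$town
# "Town A" "Town A" "Town B" "Town A"
```

On the sample data this keeps the same four rows as the arrange-and-slice approach. The two are not equivalent in general: with the grouped filters, a multi-row group in which no row satisfies a condition is dropped altogether, whereas sorting and slicing always returns one row per ID.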