I have data like this:
g1 g2 var
1 a Yes
1 a No
1 a No
1 b Yes
1 b Yes
1 b Yes
2 a No
2 a No
2 a No
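I want to change every value in var to Yes if, within each combination of g1 and g2, var contains at least one Yes. I tried combining group_by and mutate with replace and ifelse, but without success. Any help is appreciated.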
We can use if/else instead of ifelse. Group by 'g1' and 'g2'; if "Yes" is %in% 'var', return "Yes", otherwise return 'var' unchanged.
library(dplyr)
df1 %>%
group_by(g1, g2) %>%
mutate(var = if("Yes" %in% var) "Yes" else var)
# A tibble: 9 x 3
# Groups: g1, g2 [3]
# g1 g2 var
# <int> <chr> <chr>
#1 1 a Yes
#2 1 a Yes
#3 1 a Yes
#4 1 b Yes
#5 1 b Yes
#6 1 b Yes
#7 2 a No
#8 2 a No
#9 2 a No
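To see why a plain if/else is enough here: "Yes" %in% var collapses each group to a single TRUE or FALSE, so if picks one whole branch. Below is a minimal base R sketch of that behaviour; fill_yes is a hypothetical helper name used only for illustration, not part of the answer above.

```r
# Hypothetical helper mirroring the expression used inside mutate()
fill_yes <- function(v) if ("Yes" %in% v) "Yes" else v

# Group containing a "Yes": a single "Yes", which mutate() recycles over the group
with_yes <- fill_yes(c("Yes", "No", "No"))

# Group without a "Yes": the original vector comes back untouched
without_yes <- fill_yes(c("No", "No", "No"))
```

Because the else branch returns var itself, any values other than Yes/No in a group without a Yes would be left as they are.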
Or with case_when:
df1 %>%
group_by(g1, g2) %>%
mutate(var = case_when("Yes" %in% var ~ "Yes", TRUE ~ var))
# data
df1 <- structure(list(g1 = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L), g2 = c("a",
"a", "a", "b", "b", "b", "a", "a", "a"), var = c("Yes", "No",
"No", "Yes", "Yes", "Yes", "No", "No", "No")), class = "data.frame",
row.names = c(NA, -9L))
You can also do:
df %>%
group_by(g1, g2) %>%
mutate(var = ifelse(any(var == "Yes"), "Yes", "No"))
g1 g2 var
<int> <chr> <chr>
1 1 a Yes
2 1 a Yes
3 1 a Yes
4 1 b Yes
5 1 b Yes
6 1 b Yes
7 2 a No
8 2 a No
9 2 a No
Here, if any value in "var" (per "g1" and "g2") equals Yes, it returns Yes; otherwise it returns No.
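One thing to keep in mind with this version: the else branch is the literal "No", so a group without a Yes is overwritten with No whatever it held before. A small base R sketch, assuming a hypothetical third label "Maybe" that is not in the question's data:

```r
# Hypothetical group with a third label and no "Yes"
v <- c("Maybe", "No", "No")

# ifelse() with a length-1 test yields a single value,
# so the whole group would be overwritten with it
flattened <- ifelse(any(v == "Yes"), "Yes", "No")

# if/else hands back the original vector when there is no "Yes"
kept <- if ("Yes" %in% v) "Yes" else v
```

For data that only ever contains Yes and No, as in the question, both forms give the same result.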
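This takes one more line of code than the two solutions above: it creates a new column with ifelse or if_else, then drops the old column and renames the new one: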
library(tidyverse)
df %>%
group_by(g1, g2) %>%
mutate(var2 = if_else("Yes" %in% var, "Yes", "No")) %>%
select(-var, var = var2)
Result:
g1 g2 var
<dbl> <chr> <chr>
1 1 a Yes
2 1 a Yes
3 1 a Yes
4 1 b Yes
5 1 b Yes
6 1 b Yes
7 2 a No
8 2 a No
9 2 a No
A way that uses neither case_when nor if_else, just for fun:
df1 %>%
group_by(g1, g2) %>%
arrange(g1, g2, var) %>%
mutate(var=last(var))
# once sorted alphabetically, the last value in each group is "Yes" whenever the group contains one, so every row in the group takes that value
g1 g2 var
<int> <chr> <chr>
1 1 a Yes
2 1 a Yes
3 1 a Yes
4 1 b Yes
5 1 b Yes
6 1 b Yes
7 2 a No
8 2 a No
9 2 a No
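This trick relies on "Yes" sorting after every other value in var. A minimal base R sketch of that dependency; pick_last is a hypothetical helper, and "Zero" is a made-up label that does not appear in the question:

```r
# What arrange() followed by last() picks within one group
pick_last <- function(v) tail(sort(v), 1)

# With only Yes/No, "Yes" sorts last and wins
binary <- pick_last(c("No", "Yes", "No"))

# A hypothetical label that sorts after "Yes" would win instead
third <- pick_last(c("Yes", "Zero", "No"))
```

Note also that arrange reorders the rows, so the output is sorted by g1, g2, and var rather than kept in the original order.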