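In the example below, I am trying to create an "output" column where, for each ID, output == 1 for every row whose A value is greater than or equal to the A value of the row where B == "apple" (and 0 otherwise).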
df1 <- data.frame(ID = c("a", "a", "a", "b", "b", "b", "b", "c", "c", "c"),
                  A = c(2, 1, 8, 4, 3, 12, 9, 142, 13, 8),
                  B = c("apple", "orange", "kiwi", "orange", "apple", "kiwi", "pear", "kiwi", "apple", "orange"),
                  output = c(1, 0, 1, 1, 1, 1, 1, 1, 1, 0))
df1
   ID   A      B output
1   a   2  apple      1
2   a   1 orange      0
3   a   8   kiwi      1
4   b   4 orange      1
5   b   3  apple      1
6   b  12   kiwi      1
7   b   9   pear      1
8   c 142   kiwi      1
9   c  13  apple      1
10  c   8 orange      0
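The best I could come up with in base R is
df1$A >= df1$A[df1$B == "apple" & df1$ID == "a"]
but that only compares against the apple row of ID "a", and I can't work out the logic to follow from there...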
Ideally I am looking for a tidyverse solution, but a base-R solution would be fine too.
Many thanks in advance!
Try split (the |> pipe, the \(x) shorthand, and the formula in split all need R 4.1 or later):
> split(df1, ~ID) |>
+ lapply(\(x) transform(x, out1=+with(x, A >= A[B == 'apple']))) |>
+ do.call(what='rbind')
     ID   A      B output out1
a.1   a   2  apple      1    1
a.2   a   1 orange      0    0
a.3   a   8   kiwi      1    1
b.4   b   4 orange      1    1
b.5   b   3  apple      1    1
b.6   b  12   kiwi      1    1
b.7   b   9   pear      1    1
c.8   c 142   kiwi      1    1
c.9   c  13  apple      1    1
c.10  c   8 orange      0    0
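Since the question asked for a tidyverse version, here is a minimal sketch of the same per-ID comparison with dplyr. It assumes dplyr is installed and that each ID has one "apple" row; the [1] is my own addition so that only the first apple is used if an ID has several, and an ID with no apple would get NA rather than an error. The column name out1 just mirrors the base-R output above.

```r
library(dplyr)

# Same data as in the question, without the hand-written output column
df1 <- data.frame(
  ID = c("a", "a", "a", "b", "b", "b", "b", "c", "c", "c"),
  A  = c(2, 1, 8, 4, 3, 12, 9, 142, 13, 8),
  B  = c("apple", "orange", "kiwi", "orange", "apple",
         "kiwi", "pear", "kiwi", "apple", "orange")
)

res <- df1 |>
  group_by(ID) |>
  # Within each ID, compare every A with the A of the (first) apple row;
  # as.integer() turns TRUE/FALSE into 1/0
  mutate(out1 = as.integer(A >= A[B == "apple"][1])) |>
  ungroup()

res
```

group_by() plays the role that split() and do.call('rbind') play in the base-R version: the expression inside mutate() is evaluated once per ID, so A[B == "apple"] refers to that group's apple value only.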