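I want to count the number of consecutive occurrences of a value in each row of an R data frame, starting from the first column and only when the first column takes that value, and save the result in a new variable. Suppose this is my data and I am interested in the value 1: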
df <- data.frame(
  A = c(1, 0, 1, 0, 1),
  B = c(0, 0, 1, 1, 1),
  C = c(1, 1, 0, 1, 1),
  D = c(0, 1, 0, 0, 1),
  E = c(1, 0, 1, 0, 0)
)
I would like to create the following output:
df <- data.frame(
  A = c(1, 0, 1, 0, 1),
  B = c(0, 0, 1, 1, 1),
  C = c(1, 1, 0, 1, 1),
  D = c(0, 1, 0, 0, 1),
  E = c(1, 0, 1, 0, 0),
  count = c(1, 0, 2, 0, 4)
)
I tried something like this, but I am not sure it is a sensible approach:
df$count <- apply(df[, sapply(df, is.numeric)], 1, function(x) {
  r <- rle(x == 1)
  max(r$lengths[r$values])
})
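It also does not yet take into account that I am only interested in runs that start in the first column. Any help is much appreciated!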
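library(tidyverse)

df <- tibble(
  A = c(1, 0, 1, 0, 1),
  B = c(0, 0, 1, 1, 1),
  C = c(1, 1, 0, 1, 1),
  D = c(0, 1, 0, 0, 1),
  E = c(1, 0, 1, 0, 0)
)

# consecutive_id() gives each run of equal values its own id, so the
# entries with id 1 form the first run; count them only if the row starts with 1
df |>
  rowwise() |>
  mutate(count = if_else(A == 1,
                         sum(consecutive_id(c_across(everything())) == 1),
                         0))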
#> # A tibble: 5 × 6
#> # Rowwise:
#>       A     B     C     D     E count
#>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1     1     0     1     0     1     1
#> 2     0     0     1     1     0     0
#> 3     1     1     0     0     1     2
#> 4     0     1     1     0     0     0
#> 5     1     1     1     1     0     4
Created on 2023-03-22 with reprex v2.0.2
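The `consecutive_id()` call assigns one id per run of identical neighbouring values, so the entries with id 1 are exactly the first run. As a rough illustration of that idea in base R, here is a minimal sketch; `run_id()` is a hypothetical helper (not a dplyr function), and it assumes a vector without missing values:

```r
# Hypothetical helper: mimics the idea behind consecutive_id() for one vector.
# A new id starts wherever a value differs from its left-hand neighbour.
run_id <- function(x) cumsum(c(TRUE, x[-1] != x[-length(x)]))

row3 <- c(1, 1, 0, 0, 1)    # third row of the example data
ids <- run_id(row3)         # run ids: 1 1 2 2 3
first_run <- sum(ids == 1)  # length of the first run: 2

# Keep the length only when the row starts with the value of interest.
count <- if (row3[1] == 1) first_run else 0
```

For the third row this gives a count of 2, which matches the desired output in the question.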
An alternative solution using match() instead of consecutive_id():
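# position of the first value that is not 1, minus one;
# note that this gives NA for a row that contains only 1s
df |>
  rowwise() |>
  mutate(count = match(FALSE, c_across(everything()) == 1) - 1)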
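One edge case to keep in mind with the `match()` approach: when a row consists only of 1s there is no `FALSE` to find, so `match()` returns `NA`. A possible way around this, sketched in base R with a hypothetical helper `leading_run()` and assuming no missing values, is to supply the `nomatch` argument:

```r
df <- data.frame(
  A = c(1, 0, 1, 0, 1),
  B = c(0, 0, 1, 1, 1),
  C = c(1, 1, 0, 1, 1),
  D = c(0, 1, 0, 0, 1),
  E = c(1, 0, 1, 0, 0)
)

# Hypothetical helper: number of leading elements equal to `value`.
# If every element matches, nomatch makes the result equal to length(x).
leading_run <- function(x, value = 1) {
  match(FALSE, x == value, nomatch = length(x) + 1L) - 1L
}

counts <- unname(apply(df, 1, leading_run))  # one count per row: 1 0 2 0 4
all_ones <- leading_run(c(1, 1, 1, 1, 1))    # 5 rather than NA
```

Because a row that starts with 0 has its first `FALSE` at position 1, the helper returns 0 for it without a separate check on the first column.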
Using rle:
# length of the first run (not the longest run), kept only when the row starts with 1
cbind(df, count = apply(df, 1, function(x)
  ifelse(x[1] == 1, rle(x)$lengths[1], 0)))
  A B C D E count
1 1 0 1 0 1     1
2 0 0 1 1 0     0
3 1 1 0 0 1     2
4 0 1 1 0 0     0
5 1 1 1 1 0     4
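When working with `rle()`, it is worth distinguishing the first run from the longest run: `lengths[1]` is the run that starts in the first column, while `max(lengths)` can pick up a longer run later in the row. A small sketch on a made-up row (not part of the example data) shows the difference; `first_run_length()` is a hypothetical helper name:

```r
x <- c(1, 0, 0, 0, 1)      # made-up row: starts with a single 1
r <- rle(x)                # lengths: 1 3 1, values: 1 0 1

longest <- max(r$lengths)  # 3, the run of zeros in the middle
first <- r$lengths[1]      # 1, the run starting in the first column

# Hypothetical helper: length of the first run if it has the value of interest.
first_run_length <- function(x, value = 1) {
  if (x[1] == value) rle(x)$lengths[1] else 0L
}

res_start_one <- first_run_length(x)             # 1
res_start_zero <- first_run_length(c(0, 1, 1))   # 0
```

On the example data both quantities happen to coincide for every row that starts with 1, so the difference only shows up on rows like the one above.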
Another approach using run-length encoding (rle) and "tidy" notation:
df |>
  rowwise() |>
  mutate(first_runlength = unlist(rle(c_across(A:E)))[1],
         first_runlength = ifelse(A, first_runlength, 0)
  )