Count consecutive occurrences and assign the count to each value


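I want to count the consecutive occurrences of each value and assign that count to every row of the run in a new column. Here is an example of the input and the desired output: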

dataset <- data.frame(input = c("a","b","b","a","a","c","a","a","a","a","b","c"))
dataset$count <- c(1,2,2,2,2,1,4,4,4,4,1,1)

dataset  
   input   count
     a       1
     b       2
     b       2
     a       2
     a       2
     c       1
     a       4
     a       4
     a       4
     a       4
     b       1
     c       1

With rle(dataset$input) I can get the length of each run, but I want the result in the format shown above.

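My question is similar to "R: count consecutive occurrences of values in a single column", but there the output is a running sequence within each run, whereas I want the run's total count assigned to every value in it.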

r count find-occurrences
1 answer

You can repeat the lengths component returned by rle, each element lengths times:

with(rle(dataset$input), rep(lengths, lengths))
#[1] 1 2 2 2 2 1 4 4 4 4 1 1
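
To see why this works, it may help to look at what rle() returns for the example data. The following is a minimal sketch in base R; the helper name run_counts is hypothetical and not part of the original answer.

```r
# Example input from the question, as a plain character vector.
input <- c("a","b","b","a","a","c","a","a","a","a","b","c")

# rle() collapses the vector into runs: one length and one value per run.
runs <- rle(input)
runs$lengths  # 1 2 2 1 4 1 1
runs$values   # "a" "b" "a" "c" "a" "b" "c"

# Hypothetical helper: repeat each run length once per element of that run,
# so the result has the same length as the input.
run_counts <- function(x) {
  r <- rle(x)
  rep(r$lengths, r$lengths)
}

run_counts(input)  # 1 2 2 2 2 1 4 4 4 4 1 1
```

Because the run lengths sum to the length of the input, the result can be assigned directly as a new column.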

With dplyr, we can use lag to start a new group whenever the value changes:

library(dplyr)
dataset %>%
  # gr increases by 1 each time input differs from the previous row
  group_by(gr = cumsum(input != lag(input, default = first(input)))) %>%
  mutate(count = n())
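
The grouping idea behind the dplyr version can also be sketched in base R, which makes the intermediate group ids visible. This is an illustration under the assumption of a non-empty input vector; the variable names changed, gr and count are chosen here for the example, and the ids start at 1 rather than 0 as in the dplyr call.

```r
input <- c("a","b","b","a","a","c","a","a","a","a","b","c")

# TRUE wherever an element differs from the one before it;
# the first element always starts a new run.
changed <- c(TRUE, input[-1] != input[-length(input)])

# The cumulative sum of the change flags gives one id per run.
gr <- cumsum(changed)  # 1 2 2 3 3 4 5 5 5 5 6 7

# ave() applies length() within each run id and recycles the result
# back to the original positions.
count <- ave(seq_along(input), gr, FUN = length)
count  # 1 2 2 2 2 1 4 4 4 4 1 1
```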

And with data.table, using rleid to build the run ids:

library(data.table)
setDT(dataset)[, count := .N, rleid(input)]

Data

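Make sure the input column is character rather than factor. rle() does not accept factors, and before R 4.0.0 data.frame() converted strings to factors by default.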

dataset <- data.frame(input = c("a","b","b","a","a","c","a","a","a","a","b","c"),
           stringsAsFactors = FALSE)
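
If the column has already been read in as a factor, converting it with as.character before calling rle() is one way to proceed. A minimal sketch, where stringsAsFactors = TRUE is set deliberately to reproduce the situation:

```r
# Build the data with a factor column on purpose.
dataset <- data.frame(input = c("a","b","b","a","a","c","a","a","a","a","b","c"),
                      stringsAsFactors = TRUE)
is.factor(dataset$input)  # TRUE

# Convert to character so that rle() accepts the column.
dataset$input <- as.character(dataset$input)

# Same approach as in the answer: repeat each run length over its run.
dataset$count <- with(rle(dataset$input), rep(lengths, lengths))
dataset$count  # 1 2 2 2 2 1 4 4 4 4 1 1
```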