Consecutive values and new factor levels in R

Question (votes: 1, answers: 2)

I have the following sample data:

id <- c("a","b","a","b","a","a","a","a","b","b","c")
SOG <- c(4,4,0,0,0,0,0,0,0,0,9)
data <- data.frame(id,SOG)

I want a new column that holds a cumulative count for the rows where SOG == 0. I am using the following code:

tmp <- rle(SOG)                                    #run length encoding: 
tmp$values <- tmp$values == 0                      #turn values into logicals 
tmp$values[tmp$values] <- cumsum(tmp$values[tmp$values]) #cumulative sum of TRUE values 
inverse.rle(tmp)                                   #inverse the run length encoding 

I then create the "Stops" column:

data$Stops <- inverse.rle(tmp)

I get:

[1] 0 0 1 1 1 1 1 1 1 1 0

But I would like to get this instead:

[1] 0 0 1 2 3 3 3 3 4 4 0 

That is, whenever the level of the factor "id" differs from the previous row, I want to move on to the next "Stop" (i + 1).
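For context, a quick look at what rle() returns for this sample shows why every zero row ends up as 1: the eight zeros form a single run, so cumsum() only ever sees one TRUE. This is a small base R sketch that uses only the SOG vector from the question; the names runs, is_zero_run and n_zero_runs are illustrative.

```r
SOG <- c(4,4,0,0,0,0,0,0,0,0,9)

runs <- rle(SOG)
runs$lengths                      # 2 8 1: one run of eight zeros
runs$values                       # 4 0 9

is_zero_run <- runs$values == 0   # FALSE TRUE FALSE
n_zero_runs <- sum(is_zero_run)   # 1, so cumsum() never gets past 1
```

Because rle() looks at SOG alone, changes in id inside the zero run are invisible to it.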

r r-factor
2 answers
1 vote

We can try:

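library(data.table)
# data1 is the extended sample defined under "data" at the end of this answer.
# Group by runs of equal SOG; in each zero run, flag the first row and every id change.
setDT(data1)[, v1 := if(all(!SOG)) c(TRUE, id[-1] != id[-.N]) else
     rep(FALSE, .N), .(grp = rleid(SOG))][, cumsum(v1) * (!SOG)]
#[1] 0 0 1 2 3 3 3 3 4 4 0 0 0 0 5 5 0 6 6 0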

Using the original data:

setDT(data)[, v1 := if(all(!SOG)) c(TRUE, id[-1]!= id[-.N]) 
       else rep(FALSE, .N), .(grp = rleid(SOG))][,cumsum(v1)*(!SOG)]
#[1] 0 0 1 2 3 3 3 3 4 4 0
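The same idea can be spelled out step by step in base R, which may make the logic easier to follow. This is only a sketch under the assumption that a new stop begins either at the first row of a zero run or where id changes inside a zero run; the helper names zero, run_start, id_change and new_stop are illustrative and not part of the answer above.

```r
id  <- c("a","b","a","b","a","a","a","a","b","b","c")
SOG <- c(4,4,0,0,0,0,0,0,0,0,9)
n   <- length(SOG)

zero      <- SOG == 0
# TRUE on the first row of each zero run (no previous row, or previous row non-zero)
run_start <- zero & c(TRUE, !zero[-n])
# TRUE where id differs from the previous row
id_change <- c(FALSE, id[-1] != id[-n])
# A new stop starts on a zero row that opens a run or changes id
new_stop  <- zero & (run_start | id_change)

# Running count of stops, zeroed out on non-zero SOG rows
Stops <- cumsum(new_stop) * zero
Stops
# [1] 0 0 1 2 3 3 3 3 4 4 0
```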

data

id <- c("a","b","a","b","a","a","a","a","b","b","c","a","a","a","a","a","a","a","a", "a")
SOG <- c(4,4,0,0,0,0,0,0,0,0,9,1,5,3,0,0,4,0,0,1)
data1 <- data.frame(id, SOG, stringsAsFactors=FALSE)

4 votes

Take a look at dplyr:

library(dplyr)
data %>%
  mutate(
    Stops = ifelse(
      SOG > 0,
      0,
      cumsum(SOG == 0 & lag(id) != id)
    )
  )
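Note that this version only advances the counter when id changes. On the extended sample from the other answer (data1), the later zero runs start with the same id as the row before them, so they would keep the value 4 rather than becoming 5 and 6. If a fresh run of zeros should also count as a new stop, one possible variant is sketched below; it assumes that interpretation, and the columns zero and new_stop are illustrative helpers.

```r
library(dplyr)

id <- c("a","b","a","b","a","a","a","a","b","b","c","a","a","a","a","a","a","a","a","a")
SOG <- c(4,4,0,0,0,0,0,0,0,0,9,1,5,3,0,0,4,0,0,1)
data1 <- data.frame(id, SOG, stringsAsFactors = FALSE)

res <- data1 %>%
  mutate(
    zero = SOG == 0,
    # new stop: a zero row that opens a zero run or whose id differs from the previous row
    new_stop = zero & (!lag(zero, default = FALSE) | id != lag(id, default = id[1])),
    Stops = cumsum(new_stop) * zero
  )

res$Stops
# [1] 0 0 1 2 3 3 3 3 4 4 0 0 0 0 5 5 0 6 6 0
```

Supplying default values to lag() also avoids an NA in the first row when the data happens to start with SOG == 0.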