How to reset a cumulative degree-hours value in a column based on a condition

Question (votes: 0, answers: 2)

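I calculated the `index_new` column based on specific conditions. However, I am having trouble resetting the calculation to zero when the variable `dry_hours` exceeds 5. By contrast, I have been able to get the desired reset in Excel. These are simply cumulative degree-hour calculations.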

Here is a glimpse of my data:

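As you can see, when the dry time exceeds five hours at rows 36-37, the calculation in the `index_new` column is not reset to zero, whereas the Excel-calculated `index` column is reset to zero. The two columns should match.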

Here is my code:

library(dplyr)

base_temperature <- 44

df <- df %>%
  # consecutive dry hours (0 while wet)
  mutate(dry_hours = ifelse(lwd == 0, sequence(rle(lwd == 0)$lengths), 0)) %>%
  # rows that add nothing to the running total
  mutate(zero_index = lwd == 0 | dry_hours > 5 | temp < 44 | temp > 86) %>%
  group_by(event) %>%
  # cumulative degree-hours within each event
  mutate(index_new = cumsum(ifelse(zero_index, 0, temp - base_temperature))) %>%
  select(-zero_index) %>%
  relocate(index, .before = index_new)

Here is a reproducible example:


df <- structure(list(event = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 
 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
 2, 2, 2, 2, 2, 2), lwd = c(1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 
 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 
 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 
 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
 1, 1, 1, 1, 1, 1), temp = c(40, 41, 42, 43, 44, 45, 46, 47, 48, 
 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 
 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 
 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 40, 41, 42, 43, 44, 45, 
 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 
 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 
 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90), dry_hours = c(0, 
 0, 0, 1, 2, 3, 4, 5, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 
 3, 4, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 0, 0, 0, 0, 0, 0, 
 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 0, 0, 0, 0, 
 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), index = c(0, 
 0, 0, 0, 0, 0, 0, 0, 0, 5, 11, 18, 26, 35, 45, 56, 68, 81, 95, 
 110, 110, 110, 110, 110, 130, 151, 173, 196, 220, 245, 245, 245, 
 245, 245, 245, 0, 0, 33, 67, 102, 138, 175, 213, 252, 292, 333, 
 375, 375, 375, 375, 375, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 11, 18, 
 26, 35, 45, 56, 68, 81, 95, 110, 110, 110, 110, 110, 130, 151, 
 173, 196, 220, 245, 271, 298, 326, 355, 385, 416, 448, 481, 515, 
 550, 586, 623, 661, 700, 740, 781, 823, 823, 823, 823, 823), 
 index_new = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 11, 18, 26, 35, 
 45, 56, 68, 81, 95, 110, 110, 110, 110, 110, 130, 151, 173, 
 196, 220, 245, 245, 245, 245, 245, 245, 245, 245, 278, 312, 
 347, 383, 420, 458, 497, 537, 578, 620, 620, 620, 620, 620, 
 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 11, 18, 26, 35, 45, 56, 68, 
 81, 95, 110, 110, 110, 110, 110, 130, 151, 173, 196, 220, 
 245, 271, 298, 326, 355, 385, 416, 448, 481, 515, 550, 586, 
 623, 661, 700, 740, 781, 823, 823, 823, 823, 823)), 
 class = c("grouped_df", "tbl_df", "tbl", "data.frame"), row.names = c(NA, -102L),
 groups = structure(list(event = c(1, 2), .rows = structure(list(1:51, 52:102),
 ptype = integer(0), class = c("vctrs_list_of", 
 "vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
 ), row.names = c(NA, -2L), .drop = TRUE))

Here is the Excel formula I used to calculate the `index` column, which seems to work correctly:

=IF(D3>5,0,IF(B3<1,0,IF(C3<44,0,IF(C3>86,0,C3-44)))+E2)
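For comparison, the recurrence in this formula can be sketched in base R. This is only a minimal sketch, not code from the question: the function name `excel_degree_hours` and its argument names are hypothetical, and it assumes that columns B, C, D, and E hold `lwd`, `temp`, `dry_hours`, and `index`, and that the running total starts from 0.

```r
# Hypothetical helper mirroring the Excel formula row by row.
# Assumed mapping: B = lwd, C = temp, D = dry_hours, E = running index.
excel_degree_hours <- function(lwd, temp, dry_hours,
                               base = 44, upper = 86, max_dry = 5) {
  out <- numeric(length(temp))
  prev <- 0  # assumed starting value of the running total (the first E2)
  for (i in seq_along(temp)) {
    if (dry_hours[i] > max_dry) {
      out[i] <- 0  # outer IF: too many dry hours, so the total is reset
    } else {
      # inner IFs: a wet hour inside the temperature window adds temp - base
      inc <- if (lwd[i] < 1 || temp[i] < base || temp[i] > upper) 0 else temp[i] - base
      out[i] <- prev + inc
    }
    prev <- out[i]
  }
  out
}

# Toy data: two wet hours, six dry hours, then one wet hour
lwd  <- c(1, 1, 0, 0, 0, 0, 0, 0, 1)
temp <- 50:58
dry_hours <- ifelse(lwd == 0, sequence(rle(lwd == 0)$lengths), 0)

res <- excel_degree_hours(lwd, temp, dry_hours)
print(res)
# 6 13 13 13 13 13 13 0 14
```

The total stays flat during the dry spell, drops to 0 at the sixth dry hour, and starts accumulating again from 0 afterwards. The function would have to be applied separately to each `event` to reproduce a per-event total.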
r dataframe dplyr data-manipulation data-munging
2 Answers
0 votes

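If you want the calculation to be grouped by `event` and to reset whenever `dry_hours` exceeds 5, you need to add a running count of the rows where `dry_hours` exceeds 5 to the grouping. Change `group_by(event)` to `group_by(event, cumsum(dry_hours > 5))`: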

df %>%
  mutate(dry_hours = ifelse(lwd == 0, sequence(rle(lwd == 0)$lengths), 0)) %>%
  mutate(zero_index = lwd == 0 | dry_hours > 5 | temp < 44 | temp > 86) %>%
  group_by(event, cumsum(dry_hours > 5)) %>%
  mutate(index_new = cumsum(ifelse(zero_index, 0,  temp - base_temperature))) %>%
  select(-zero_index) %>%
  relocate(index, .before = index_new) |>
  ungroup() |>
  filter(index != index_new) ## keep only rows that do not match
# A tibble: 0 × 7
# ℹ 7 variables: event <dbl>, lwd <dbl>, temp <dbl>, dry_hours <dbl>, index <dbl>, index_new <dbl>,
#   cumsum(dry_hours > 5) <int>

## all rows match!
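To see why the extra grouping term produces a reset, it may help to look at a toy example in base R. This is an illustrative sketch only: the vectors below are made up, and `ave()` stands in for the `group_by()` plus `mutate()` step.

```r
# Made-up data: one wet hour, a seven-hour dry spell, then two wet hours
dry_hours <- c(0, 1, 2, 3, 4, 5, 6, 7, 0, 0)
increment <- c(5, 0, 0, 0, 0, 0, 0, 0, 3, 4)  # degree-hours added per row

# Every row with dry_hours > 5 bumps the counter, which starts a new group
reset_id <- cumsum(dry_hours > 5)
print(reset_id)
# 0 0 0 0 0 0 1 2 2 2

# A cumulative sum within each group restarts at every bump
running <- ave(increment, reset_id, FUN = cumsum)
print(running)
# 5 5 5 5 5 5 0 0 3 7
```

The reset rows show 0 only because their own increment is 0, which in the pipeline above is guaranteed by the `dry_hours > 5` term in `zero_index`.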

0 votes

`dry_hours` starts counting whenever `lwd == 0`, right? So if `lwd == 0`, `zero_index` is TRUE for any value of `dry_hours`. Maybe you need to review this line:

  mutate(zero_index = lwd == 0 | dry_hours > 5 | temp < 44 | temp > 86)
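A quick check of this point on made-up data, as a sketch in base R (the vectors are illustrative and not taken from the question's data frame):

```r
# Made-up data: a seven-hour dry spell between two wet hours,
# with temperature always inside the 44-86 window
lwd  <- c(1, 0, 0, 0, 0, 0, 0, 0, 1)
temp <- rep(60, length(lwd))
dry_hours <- ifelse(lwd == 0, sequence(rle(lwd == 0)$lengths), 0)

with_term    <- lwd == 0 | dry_hours > 5 | temp < 44 | temp > 86
without_term <- lwd == 0 | temp < 44 | temp > 86

# dry_hours can only be positive on rows where lwd == 0,
# so dropping the dry_hours term changes nothing
same <- identical(with_term, without_term)
print(same)
# TRUE
```

In other words, the `dry_hours > 5` condition in `zero_index` only keeps the increment at 0 on those rows. It does not clear the total that `cumsum()` has already accumulated, so the reset has to come from somewhere else.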