Using the ifelse function in R to repeat the previous value


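I have a data frame that contains the number of clicks for each condition. I want to compute the cumulative sum of these clicks per condition as the data come in. I am currently using the ifelse() function to do this. However, for the "no" part of the test, I want to repeat the value created in the previous "yes" part until the next "yes" occurs. At the moment I am using NA as a placeholder.

When the test in ifelse is "no", how can I repeat the value created for the previous "yes" until the next "yes"?

I made a minimal example: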

library(dplyr)

clicked <- round(runif(n = 20), 0)
condition <- sample(c("Intervention", "Control"), size = 20, replace = TRUE)
df <- data.frame(clicked, condition)

# Per-condition running totals; rows of the other condition get NA placeholders
df %>% select(clicked, condition) %>% group_by(condition) %>%
  mutate(successes.intervention = ifelse(condition == "Intervention", cumsum(clicked), NA),
         N.intervention = ifelse(condition == "Intervention", 1:n(), NA),
         successes.control = ifelse(condition == "Control", cumsum(clicked), NA),
         N.control = ifelse(condition == "Control", 1:n(), NA))

I would like the output to look like this:

  clicked condition    successes.intervention N.intervention successes.control N.control
     <dbl> <chr>                         <dbl>          <int>             <dbl>     <int>
 1       0 Control                           0              0                 0         1
 2       1 Control                           0              0                 1         2
 3       0 Control                           0              0                 1         3
 4       1 Intervention                      1              1                 1         3
 5       0 Control                           1              1                 1         4
 6       0 Intervention                      1              2                 1         4
 7       0 Intervention                      1              3                 1         4
 8       0 Control                           1              3                 1         5
 9       0 Intervention                      1              4                 1         5
10       1 Intervention                      2              5                 1         5 
Tags: r, if-statement, cumsum

1 Answer

How about this?

library(dplyr)
df %>%
  group_by(condition) %>% 
  mutate(
    data.frame(
      lapply(setNames(unique(df$condition), paste0("successes.", unique(df$condition))),
             function(z) cumsum(condition == z & clicked > 0))
    ),
    across(starts_with("successes"), ~ row_number() - 1L, .names = "N{sub('successes','',.col)}")
  ) %>%
  ungroup()
# # A tibble: 20 × 6
#    clicked condition    successes.intervention successes.control N.intervention N.control
#      <dbl> <chr>                         <int>             <int>          <int>     <int>
#  1       1 Intervention                      1                 0              0         0
#  2       1 Intervention                      2                 0              1         1
#  3       0 Intervention                      2                 0              2         2
#  4       1 Intervention                      3                 0              3         3
#  5       1 Intervention                      4                 0              4         4
#  6       1 Control                           0                 1              0         0
#  7       1 Intervention                      5                 0              5         5
#  8       0 Intervention                      5                 0              6         6
#  9       1 Intervention                      6                 0              7         7
# 10       1 Intervention                      7                 0              8         8
# 11       0 Control                           0                 1              1         1
# 12       1 Control                           0                 2              2         2
# 13       1 Control                           0                 3              3         3
# 14       0 Control                           0                 3              4         4
# 15       0 Intervention                      7                 0              9         9
# 16       1 Control                           0                 4              5         5
# 17       1 Intervention                      8                 0             10        10
# 18       0 Control                           0                 4              6         6
# 19       0 Control                           0                 4              7         7
# 20       1 Control                           0                 5              8         8

Walkthrough:

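  • lapply(..) iterates over the string literals (determined dynamically) and produces a list; when that list is converted to a data.frame, mutate adds the columns dynamically.
  • Inside cumsum(..), we check that condition is the one we want to summarize, and then take the cumulative sum of the number of rows where clicked is positive.
  • across iterates over all selected columns and returns the row number (within the group) minus 1; it can optionally rename the columns according to the .names "glue" string. For this, I selected the already-created successes.* columns, since they are always split across the different condition levels.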
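The cumsum(..) idea above can also be tried without any grouping. The following is only a minimal base-R sketch, not the code from this answer: it drops group_by() and assumes the goal is the table shown in the question. The data frame name df10 and the helper vectors is_int and is_ctl are made up for illustration, and the ten rows are copied from the question's expected output.

```r
# First ten rows of the expected output shown in the question
# (df10 is a hypothetical name used only for this sketch).
df10 <- data.frame(
  clicked   = c(0, 1, 0, 1, 0, 0, 0, 0, 0, 1),
  condition = c("Control", "Control", "Control", "Intervention", "Control",
                "Intervention", "Intervention", "Control", "Intervention",
                "Intervention")
)

# TRUE/FALSE is summed as 1/0, so a row belonging to the other condition
# adds nothing and the previous running total is simply repeated.
is_int <- df10$condition == "Intervention"
is_ctl <- df10$condition == "Control"

df10$successes.intervention <- cumsum(is_int & df10$clicked > 0)
df10$N.intervention         <- cumsum(is_int)
df10$successes.control      <- cumsum(is_ctl & df10$clicked > 0)
df10$N.control              <- cumsum(is_ctl)

print(df10)
```

Because the cumulative sums run over the whole data frame rather than within each group, the "no" rows repeat the last "yes" value and no NA placeholder is needed. On these ten rows the four new columns match the values in the question's expected table.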

Data, starting with set.seed(42) for reproducibility:

set.seed(42)
df <- data.frame(clicked = round(runif(n = 20),0),
                 condition = sample(c("Intervention", "Control"), size = 20, replace = T))
head(df)
#   clicked    condition
# 1       1 Intervention
# 2       1 Intervention
# 3       0 Intervention
# 4       1 Intervention
# 5       1 Intervention
# 6       1      Control
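If you would rather keep the ifelse(..., NA) placeholders from the question, the NA values can be filled forward afterwards. Below is a rough base-R sketch under that assumption; fill_forward is a hypothetical helper written for this example, not a function from dplyr or from the answer above, and x is a made-up vector standing in for one of the placeholder columns.

```r
# Hypothetical helper: carry the last non-NA value forward.
# Leading NAs (before the first non-NA value) stay NA.
fill_forward <- function(x) {
  # Position of the most recent non-NA element at each index (0 if none yet)
  last_seen <- cummax(ifelse(is.na(x), 0L, seq_along(x)))
  # Prepend NA so that position 0 maps to NA
  c(NA, x)[last_seen + 1]
}

# Made-up column with NA placeholders, as produced by ifelse(test, value, NA)
x <- c(NA, 1, NA, NA, 2, NA)

filled <- fill_forward(x)
filled[is.na(filled)] <- 0  # the question shows 0 before the first "yes"
print(filled)
# [1] 0 1 1 1 2 2
```

Applied to an ungrouped column, this repeats each "yes" value until the next "yes" appears. tidyr::fill() covers the same use case for data frame columns if you prefer a packaged solution.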