Aggregating the columns created by pivot_wider (computing differences)


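In my dataset, individuals sometimes have 2 time points, and I want to compute the difference between those 2 time points. The problem is that I have many columns I want to apply this to. Each individual has 1 or 2 rows. Here is an example of what I did with 2 columns: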

library(tidyr)
library(dplyr)  # mutate() and select() come from dplyr, not tidyr
data = data.frame(id = c(1,1,2,2,3,4,5,5,6),
                  time = c("M0","M3","M0","M3","M0","M0","M0","M3","M0"),
                  bio1 = c(4.2, 4.8, 4, NA, 3.8, 4.4, 5, 6, 6.1),
                  bio2 = c(12, 14, 10, 11, NA, 18, 19, 12, 15))
data 

data_wide <- data %>% 
  pivot_wider(names_from="time",
              values_from=c("bio1","bio2")) %>%
  mutate(diff_bio1 = bio1_M3 - bio1_M0,
         diff_bio2 = bio2_M3 - bio2_M0) %>%
  select(id, diff_bio1, diff_bio2)
data_wide              

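I know that pivot_wider lets us apply an aggregation function to each column it creates, but I haven't found a way to apply a function that computes the difference directly at creation time. Is there a way to do this?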
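
For context, the aggregation mentioned above is presumably the `values_fn` argument of `pivot_wider()`. The sketch below uses a small hypothetical data frame (`dup`, not part of the original data) to show what it does: it summarizes several values that land in the same cell, so it works within one id/time combination rather than across the M0 and M3 columns.

```r
library(tidyr)

# Hypothetical data: id 1 has two M0 measurements
dup <- data.frame(id   = c(1, 1, 1, 2),
                  time = c("M0", "M0", "M3", "M0"),
                  bio1 = c(4.0, 4.4, 4.8, 5.0))

# values_fn collapses the duplicated M0 values of id 1 into their mean;
# it never sees M0 and M3 together, so it cannot compute M3 - M0
wide <- pivot_wider(dup,
                    names_from  = time,
                    values_from = bio1,
                    values_fn   = mean)
wide
```

This may be why no difference function seems to fit at creation time: each new column is filled independently of the others.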

r pivot tidyr
1 Answer

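At least with tidyverse functions, repeating an operation across pairs of columns in wide data is usually trickier than a solution that doesn't reshape the data wider at all. Here, complete() adds the missing id/time rows as NA so that every id has both time points, and diff() then gives M3 - M0 per id: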

library(tidyverse)

df = tibble(id = c(1,1,2,2,3,4,5,5,6),
                  time = c("M0","M3","M0","M3","M0","M0","M0","M3","M0"),
                  bio1 = c(4.2, 4.8, 4, NA, 3.8, 4.4, 5, 6, 6.1),
                  bio2 = c(12, 14, 10, 11, NA, 18, 19, 12, 15))

df |> 
  complete(id, time) |> 
  arrange(id, time) |> 
  summarize(across(starts_with("bio"), diff),
            .by = id)
#> # A tibble: 6 × 3
#>      id   bio1  bio2
#>   <dbl>  <dbl> <dbl>
#> 1     1  0.600     2
#> 2     2 NA         1
#> 3     3 NA        NA
#> 4     4 NA        NA
#> 5     5  1        -7
#> 6     6 NA        NA
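
The reason the `complete()` step matters can be seen with plain `diff()`. This is a minimal base R sketch using the bio1 value of id 3 from the example. The variable names `one_row` and `two_rows` are only illustrative.

```r
# id 3 before complete(): only the M0 value exists
one_row <- c(3.8)

# id 3 after complete(): M0 plus an NA placeholder for the missing M3
two_rows <- c(3.8, NA)

# With a single value there is nothing to subtract, so diff() returns
# a zero-length vector instead of one value for the group
length(diff(one_row))

# With the NA placeholder, diff() returns exactly one value (NA)
diff(two_rows)
```

Since `summarize()` is meant to return one row per group, padding every id to two rows keeps the ids that have a single time point in the result, with NA as their difference. The `arrange(id, time)` step makes sure M0 comes before M3, so the sign of the difference is M3 - M0.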

Created on 2023-12-06 with reprex v2.0.2
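
If the `diff_bio1` / `diff_bio2` column names from the question are wanted, one possible variant is to go long first, so that a single `M3 - M0` expression covers every bio column. This is only a sketch, not the approach from the answer above. It assumes all measurement columns start with "bio", and the intermediate names `marker`, `value` and `diff` are illustrative choices.

```r
library(dplyr)
library(tidyr)

data <- data.frame(id = c(1, 1, 2, 2, 3, 4, 5, 5, 6),
                   time = c("M0", "M3", "M0", "M3", "M0", "M0", "M0", "M3", "M0"),
                   bio1 = c(4.2, 4.8, 4, NA, 3.8, 4.4, 5, 6, 6.1),
                   bio2 = c(12, 14, 10, 11, NA, 18, 19, 12, 15))

diffs <- data |>
  # one row per id / time / marker
  pivot_longer(starts_with("bio"), names_to = "marker", values_to = "value") |>
  # one column per time point; ids without M3 get NA there
  pivot_wider(names_from = time, values_from = value) |>
  # a single expression handles every marker
  mutate(diff = M3 - M0) |>
  select(id, marker, diff) |>
  # back to one row per id, with diff_bio1, diff_bio2, ...
  pivot_wider(names_from = marker, values_from = diff, names_prefix = "diff_")

diffs
```

Adding more bio columns to the data would not require any change to this pipeline, which is the main point of reshaping long before computing the difference.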
