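In my dataset, individuals sometimes have 2 time points, and I want to compute the difference between those 2 time points. The problem is that I have many columns I want to apply this to. Each individual has 1 or 2 rows in the dataset; here is an example of what I did with 2 columns: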
library(tidyr)
library(dplyr)  # needed for %>%, mutate() and select()
data = data.frame(id = c(1,1,2,2,3,4,5,5,6),
                  time = c("M0","M3","M0","M3","M0","M0","M0","M3","M0"),
                  bio1 = c(4.2, 4.8, 4, NA, 3.8, 4.4, 5, 6, 6.1),
                  bio2 = c(12, 14, 10, 11, NA, 18, 19, 12, 15))
data
data_wide <- data %>%
  pivot_wider(names_from="time",
              values_from=c("bio1","bio2")) %>%
  mutate(diff_bio1 = bio1_M3 - bio1_M0,
         diff_bio2 = bio2_M3 - bio2_M0) %>%
  select(id, diff_bio1, diff_bio2)
data_wide
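I know that pivot_wider lets us apply an aggregation function to each of the columns it creates, but I haven't found a way to apply a function that computes the difference directly at creation time. Is there a way to do this?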
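At least with Tidyverse functions, repeating an operation across pairs of columns tends to be trickier than a solution that doesn't reshape the data wider in the first place: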
library(tidyverse)
df = tibble(id = c(1,1,2,2,3,4,5,5,6),
            time = c("M0","M3","M0","M3","M0","M0","M0","M3","M0"),
            bio1 = c(4.2, 4.8, 4, NA, 3.8, 4.4, 5, 6, 6.1),
            bio2 = c(12, 14, 10, 11, NA, 18, 19, 12, 15))
df |>
  complete(id, time) |>
  arrange(id, time) |>
  summarize(across(starts_with("bio"), diff),
            .by = id)
#> # A tibble: 6 × 3
#>      id   bio1  bio2
#>   <dbl>  <dbl> <dbl>
#> 1     1  0.600     2
#> 2     2 NA         1
#> 3     3 NA        NA
#> 4     4 NA        NA
#> 5     5  1        -7
#> 6     6 NA        NA
Created on 2023-12-06 with reprex v2.0.2
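As a small variation on the approach above, here is a sketch that names the result columns `diff_bio1` and `diff_bio2`, as in the question, using the `.names` argument of `across()`. It assumes dplyr >= 1.1.0 (for `.by`); the object name `res` is only for illustration. `complete()` pads each id to exactly two rows (filling the missing M3 values with `NA`), and `arrange()` makes sure `diff()` returns M3 minus M0:

```r
library(dplyr)
library(tidyr)

df <- tibble(id   = c(1, 1, 2, 2, 3, 4, 5, 5, 6),
             time = c("M0", "M3", "M0", "M3", "M0", "M0", "M0", "M3", "M0"),
             bio1 = c(4.2, 4.8, 4, NA, 3.8, 4.4, 5, 6, 6.1),
             bio2 = c(12, 14, 10, 11, NA, 18, 19, 12, 15))

res <- df |>
  complete(id, time) |>   # every id gets both an M0 and an M3 row
  arrange(id, time) |>    # M0 sorts before M3 within each id
  summarize(across(starts_with("bio"), diff, .names = "diff_{.col}"),
            .by = id)     # one row per id: M3 value minus M0 value
res
```

Because the columns are picked with `starts_with("bio")`, the same call should scale to any number of `bio*` columns without listing them one by one.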