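I have a data frame with the following variables: "covid alert date", "vax dose 1" through "vax dose 5" (the vaccine brands), and "date dose 1" through "date dose 5" (the vaccination dates). The dataset includes every COVID vaccine each person has ever received, but I only want to keep the vaccine brands and dates from before the "covid alert date". Is there a way to look at each row's "covid alert date" and keep only the vaccine brands and dates before that date? Anything after that date can be changed to NA.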
sample_data <- tibble(`covid alert date` = as.Date(c('2022-01-01','2022-02-01','2022-03-01','2022-04-01','2022-05-01','2022-06-01','2022-07-01','2022-08-01','2022-09-01','2022-10-01')),
`vax dose 1` = c("Astrazeneca", "Moderna", "Pfizer", "Astrazeneca", "Moderna", "Pfizer", "Astrazeneca", "Moderna", "Pfizer", "Astrazeneca"),
`date dose 1` = as.Date(c('2021-01-01','2021-02-01','2021-03-01','2023-04-01','2023-05-01','2021-06-01','2021-07-01','2021-08-01','2023-09-01','2023-10-01')),
`vax dose 2` = c("Astrazeneca", "Moderna", "Pfizer", "Astrazeneca", "Moderna", "Pfizer", NA, NA, NA, NA),
`date dose 2` = as.Date(c('2021-04-01','2021-03-01','2021-05-01','2023-06-01','2023-07-01', '2021-08-01', NA, NA, NA, NA)),
`vax dose 3` = c("Pfizer", "Moderna", "Pfizer", "Moderna", NA, NA, NA, NA, NA, NA),
`date dose 3` = as.Date(c('2022-04-01','2021-12-01','2021-12-01','2023-12-01', NA, NA, NA, NA, NA, NA)),
`vax dose 4` = c("Pfizer", "Moderna", NA, NA, NA, NA, NA, NA, NA, NA),
`date dose 4` = as.Date(c('2022-12-01','2022-12-01', NA, NA, NA, NA, NA, NA, NA, NA)),
`vax dose 5` = c("Moderna", NA, NA, NA, NA, NA, NA, NA, NA, NA),
`date dose 5` = as.Date(c('2023-11-01', NA, NA, NA, NA, NA, NA, NA, NA, NA)))
I'm very new to R, so any help is much appreciated. I've included some sample data above. Thanks in advance.
If I understand correctly, one option is to pivot the data longer, filter on your condition, and then pivot back to the format you want, e.g.
library(tidyverse)
sample_data <- tibble(`covid alert date` = as.Date(c('2022-01-01','2022-02-01','2022-03-01','2022-04-01','2022-05-01','2022-06-01','2022-07-01','2022-08-01','2022-09-01','2022-10-01')),
`vax dose 1` = c("Astrazeneca", "Moderna", "Pfizer", "Astrazeneca", "Moderna", "Pfizer", "Astrazeneca", "Moderna", "Pfizer", "Astrazeneca"),
`date dose 1` = as.Date(c('2021-01-01','2021-02-01','2021-03-01','2023-04-01','2023-05-01','2021-06-01','2021-07-01','2021-08-01','2023-09-01','2023-10-01')),
`vax dose 2` = c("Astrazeneca", "Moderna", "Pfizer", "Astrazeneca", "Moderna", "Pfizer", NA, NA, NA, NA),
`date dose 2` = as.Date(c('2021-04-01','2021-03-01','2021-05-01','2023-06-01','2023-07-01', '2021-08-01', NA, NA, NA, NA)),
`vax dose 3` = c("Pfizer", "Moderna", "Pfizer", "Moderna", NA, NA, NA, NA, NA, NA),
`date dose 3` = as.Date(c('2022-04-01','2021-12-01','2021-12-01','2023-12-01', NA, NA, NA, NA, NA, NA)),
`vax dose 4` = c("Pfizer", "Moderna", NA, NA, NA, NA, NA, NA, NA, NA),
`date dose 4` = as.Date(c('2022-12-01','2022-12-01', NA, NA, NA, NA, NA, NA, NA, NA)),
`vax dose 5` = c("Moderna", NA, NA, NA, NA, NA, NA, NA, NA, NA),
`date dose 5` = as.Date(c('2023-11-01', NA, NA, NA, NA, NA, NA, NA, NA, NA)))
sample_data %>%
  pivot_longer(contains("vax")) %>%
  filter(if_any(starts_with("date"), ~.x < `covid alert date`)) %>%
  pivot_wider(names_from = name, values_from = value) %>%
  relocate(names(sample_data))
#> # A tibble: 6 × 11
#> `covid alert date` `vax dose 1` `date dose 1` `vax dose 2` `date dose 2`
#> <date> <chr> <date> <chr> <date>
#> 1 2022-01-01 Astrazeneca 2021-01-01 Astrazeneca 2021-04-01
#> 2 2022-02-01 Moderna 2021-02-01 Moderna 2021-03-01
#> 3 2022-03-01 Pfizer 2021-03-01 Pfizer 2021-05-01
#> 4 2022-06-01 Pfizer 2021-06-01 Pfizer 2021-08-01
#> 5 2022-07-01 Astrazeneca 2021-07-01 <NA> NA
#> 6 2022-08-01 Moderna 2021-08-01 <NA> NA
#> # ℹ 6 more variables: `vax dose 3` <chr>, `date dose 3` <date>,
#> # `vax dose 4` <chr>, `date dose 4` <date>, `vax dose 5` <chr>,
#> # `date dose 5` <date>
Created on 2024-03-12 with reprex v2.1.0
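One thing to be aware of: as far as I can tell, the pipeline above keeps a person's whole row whenever at least one dose date falls before the alert date, and drops the row otherwise (hence 6 rows out of 10). It does not blank out individual later doses. If the goal is instead to keep every row and set only the doses on or after the alert date to NA, here is a minimal sketch of one way to do it. The helper name `mask_after_alert` and the small `demo` tibble are made up for illustration, and treating a dose given on the alert date itself as "not before" is an assumption.

```r
library(tibble)

# Two rows taken from the sample data (rows 1 and 4), doses 1 to 3 only.
demo <- tibble(
  `covid alert date` = as.Date(c("2022-01-01", "2022-04-01")),
  `vax dose 1`  = c("Astrazeneca", "Astrazeneca"),
  `date dose 1` = as.Date(c("2021-01-01", "2023-04-01")),
  `vax dose 2`  = c("Astrazeneca", "Astrazeneca"),
  `date dose 2` = as.Date(c("2021-04-01", "2023-06-01")),
  `vax dose 3`  = c("Pfizer", "Moderna"),
  `date dose 3` = as.Date(c("2022-04-01", "2023-12-01"))
)

# Hypothetical helper (not part of the answer above): for each dose, set the
# brand and the date to NA when the dose date is on or after the alert date.
mask_after_alert <- function(data, alert_col = "covid alert date") {
  date_cols <- grep("^date dose ", names(data), value = TRUE)
  for (date_col in date_cols) {
    vax_col <- sub("^date", "vax", date_col)  # "date dose 1" -> "vax dose 1"
    # which() skips NA comparisons, so doses that are already missing stay as they are
    too_late <- which(data[[date_col]] >= data[[alert_col]])
    data[[vax_col]][too_late]  <- NA
    data[[date_col]][too_late] <- NA
  }
  data
}

result <- mask_after_alert(demo)
result
```

The same function should work on the full `sample_data` as long as the columns follow the "vax dose N" / "date dose N" naming pattern. Change `>=` to `>` if a dose given on the alert date itself should be kept.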