Removing data based on multiple date columns in R


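I have a data frame with the following variables: `covid alert date`, `vax dose 1` to `vax dose 5` (the vaccine brands), and `date dose 1` to `date dose 5` (the vaccination dates). The dataset includes every COVID vaccine each person has ever received, but I only want to keep the vaccine brands and dates that fall before the `covid alert date`. Is there a way to look at each row's `covid alert date` and keep only the vaccine brands and dates before it? Anything after that date can be changed to NA.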

sample_data <- tibble(`covid alert date` = as.Date(c('2022-01-01','2022-02-01','2022-03-01','2022-04-01','2022-05-01','2022-06-01','2022-07-01','2022-08-01','2022-09-01','2022-10-01')),             
                     `vax dose 1` = c("Astrazeneca", "Moderna", "Pfizer", "Astrazeneca", "Moderna", "Pfizer", "Astrazeneca", "Moderna", "Pfizer", "Astrazeneca"), 
                     `date dose 1` = as.Date(c('2021-01-01','2021-02-01','2021-03-01','2023-04-01','2023-05-01','2021-06-01','2021-07-01','2021-08-01','2023-09-01','2023-10-01')),
                     `vax dose 2` = c("Astrazeneca", "Moderna", "Pfizer", "Astrazeneca", "Moderna", "Pfizer", NA, NA, NA, NA), 
                     `date dose 2` = as.Date(c('2021-04-01','2021-03-01','2021-05-01','2023-06-01','2023-07-01', '2021-08-01', NA, NA, NA, NA)),
                     `vax dose 3` = c("Pfizer", "Moderna", "Pfizer", "Moderna", NA, NA, NA, NA, NA, NA), 
                     `date dose 3` = as.Date(c('2022-04-01','2021-12-01','2021-12-01','2023-12-01', NA, NA, NA, NA, NA, NA)),
                     `vax dose 4` = c("Pfizer", "Moderna", NA, NA, NA, NA, NA, NA, NA, NA), 
                     `date dose 4` = as.Date(c('2022-12-01','2022-12-01', NA, NA, NA, NA, NA, NA, NA, NA)),
                     `vax dose 5` = c("Moderna", NA, NA, NA, NA, NA, NA, NA, NA, NA), 
                     `date dose 5` = as.Date(c('2023-11-01', NA, NA, NA, NA, NA, NA, NA, NA, NA)))

I am very new to R, so any help is much appreciated. I have included some sample data above. Thanks in advance.

r data-manipulation
1 Answer

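If I understand correctly, one option is to pivot the data to long format, filter on your condition, and then pivot the data back to the format you want. Note that this filter keeps or drops whole rows: a row is kept when at least one dose date is before the alert date, and later doses in a kept row are not set to NA. For example: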

library(tidyverse)

sample_data <- tibble(`covid alert date` = as.Date(c('2022-01-01','2022-02-01','2022-03-01','2022-04-01','2022-05-01','2022-06-01','2022-07-01','2022-08-01','2022-09-01','2022-10-01')),             
                      `vax dose 1` = c("Astrazeneca", "Moderna", "Pfizer", "Astrazeneca", "Moderna", "Pfizer", "Astrazeneca", "Moderna", "Pfizer", "Astrazeneca"), 
                      `date dose 1` = as.Date(c('2021-01-01','2021-02-01','2021-03-01','2023-04-01','2023-05-01','2021-06-01','2021-07-01','2021-08-01','2023-09-01','2023-10-01')),
                      `vax dose 2` = c("Astrazeneca", "Moderna", "Pfizer", "Astrazeneca", "Moderna", "Pfizer", NA, NA, NA, NA), 
                      `date dose 2` = as.Date(c('2021-04-01','2021-03-01','2021-05-01','2023-06-01','2023-07-01', '2021-08-01', NA, NA, NA, NA)),
                      `vax dose 3` = c("Pfizer", "Moderna", "Pfizer", "Moderna", NA, NA, NA, NA, NA, NA), 
                      `date dose 3` = as.Date(c('2022-04-01','2021-12-01','2021-12-01','2023-12-01', NA, NA, NA, NA, NA, NA)),
                      `vax dose 4` = c("Pfizer", "Moderna", NA, NA, NA, NA, NA, NA, NA, NA), 
                      `date dose 4` = as.Date(c('2022-12-01','2022-12-01', NA, NA, NA, NA, NA, NA, NA, NA)),
                      `vax dose 5` = c("Moderna", NA, NA, NA, NA, NA, NA, NA, NA, NA), 
                      `date dose 5` = as.Date(c('2023-11-01', NA, NA, NA, NA, NA, NA, NA, NA, NA)))

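sample_data %>%
  # stack the five brand columns into name/value pairs
  pivot_longer(contains("vax")) %>%
  # keep rows where at least one dose date precedes the alert date
  filter(if_any(starts_with("date"), ~ .x < `covid alert date`)) %>%
  # spread the brands back into their original columns
  pivot_wider(names_from = name, values_from = value) %>%
  # restore the original column order
  relocate(names(sample_data))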
#> # A tibble: 6 × 11
#>   `covid alert date` `vax dose 1` `date dose 1` `vax dose 2` `date dose 2`
#>   <date>             <chr>        <date>        <chr>        <date>       
#> 1 2022-01-01         Astrazeneca  2021-01-01    Astrazeneca  2021-04-01   
#> 2 2022-02-01         Moderna      2021-02-01    Moderna      2021-03-01   
#> 3 2022-03-01         Pfizer       2021-03-01    Pfizer       2021-05-01   
#> 4 2022-06-01         Pfizer       2021-06-01    Pfizer       2021-08-01   
#> 5 2022-07-01         Astrazeneca  2021-07-01    <NA>         NA           
#> 6 2022-08-01         Moderna      2021-08-01    <NA>         NA           
#> # ℹ 6 more variables: `vax dose 3` <chr>, `date dose 3` <date>,
#> #   `vax dose 4` <chr>, `date dose 4` <date>, `vax dose 5` <chr>,
#> #   `date dose 5` <date>

Created on 2024-03-12 with reprex v2.1.0
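The question also asks for doses after the alert date to be changed to NA while keeping every row. A minimal base-R sketch of that is below. It is not part of the answer above: `toy` (rows 1, 4 and 7 of the sample data, trimmed to three doses), `mask_late_doses`, `alert_col` and `n_doses` are hypothetical names introduced for illustration. It assumes that a dose given on the alert date itself does not count as "before" and is therefore blanked too.

```r
# Hypothetical trimmed copy of the sample data (rows 1, 4 and 7, doses 1 to 3).
toy <- data.frame(
  `covid alert date` = as.Date(c("2022-01-01", "2022-04-01", "2022-07-01")),
  `vax dose 1`  = c("Astrazeneca", "Astrazeneca", "Astrazeneca"),
  `date dose 1` = as.Date(c("2021-01-01", "2023-04-01", "2021-07-01")),
  `vax dose 2`  = c("Astrazeneca", "Astrazeneca", NA),
  `date dose 2` = as.Date(c("2021-04-01", "2023-06-01", NA)),
  `vax dose 3`  = c("Pfizer", "Moderna", NA),
  `date dose 3` = as.Date(c("2022-04-01", "2023-12-01", NA)),
  check.names = FALSE,        # keep the spaces in the column names
  stringsAsFactors = FALSE
)

# Hypothetical helper: blank out brand and date for every dose that was
# given on or after the alert date. No rows are dropped.
mask_late_doses <- function(df, alert_col = "covid alert date", n_doses = 3) {
  for (i in seq_len(n_doses)) {
    vax_col  <- paste("vax dose", i)
    date_col <- paste("date dose", i)
    # TRUE where the dose is on/after the alert date; missing dates stay FALSE
    late <- !is.na(df[[date_col]]) & df[[date_col]] >= df[[alert_col]]
    df[[vax_col]][late]  <- NA
    df[[date_col]][late] <- NA
  }
  df
}

result <- mask_late_doses(toy)
print(result)
```

For the full sample data, the call would presumably be `mask_late_doses(sample_data, n_doses = 5)`, since `[[` and `[[<-` work the same way on a tibble. If a dose on the alert date should be kept, change `>=` to `>`.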
