Handling NA in if_else in R

Question (votes: 2, answers: 1)

I have the following dataset, in which three columns contain dates.

library(dplyr)

set.seed(45)

df1 <- data.frame(
  hire_date = sample(seq(as.Date('1999/01/01'), as.Date('2000/01/01'), by = "week"), 10),
  t1 = sample(seq(as.Date('2000/01/01'), as.Date('2001/01/01'), by = "week"), 10),
  t2 = sample(seq(as.Date('2000/01/01'), as.Date('2001/01/01'), by = "day"), 10)
)

#this value is actually unknown
df1[10,2] <- NA

    hire_date         t1         t2
1  1999-08-20 2000-05-13 2000-02-17   
2  1999-04-23 2000-11-11 2000-04-27   
3  1999-03-26 2000-04-15 2000-08-01   
4  1999-05-07 2000-06-03 2000-08-29   
5  1999-04-30 2000-05-27 2000-11-19   
6  1999-04-09 2000-12-30 2000-01-26   
7  1999-03-12 2000-12-23 2000-12-07  
8  1999-06-25 2000-02-12 2000-09-26  
9  1999-02-26 2000-05-06 2000-08-23 
10 1999-01-01       <NA> 2000-03-18 

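I want to run an if/else so that df1$com is 1 when the difference between hire_date and either t1 or t2 falls within [395, 500] days.

The following if_else statement almost gets me there, but the NA messes it up. Any ideas?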

df1$com <- if_else((df1$t1 - df1$hire_date) >= 395 &
               (df1$t1 - df1$hire_date) <= 500, 1,
       if_else((df1$t2 - df1$hire_date) >= 395 &
                (df1$t2 - df1$hire_date) <= 500, 1, 0))
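
To make the failure concrete, here is a minimal sketch using the values shown in row 10 of the table above, typed in by hand rather than taken from the sampled data frame. The names hire, cond_t1, cond_t2 and res are illustrative only. A comparison against NA is itself NA, and if_else returns NA for an NA condition, so the nested t2 check never contributes to the result:

```r
library(dplyr)

# Row 10 of the printed table, entered manually (illustrative values)
hire <- as.Date("1999-01-01")
t1   <- as.Date(NA)
t2   <- as.Date("2000-03-18")

# Comparing an NA difference yields NA, and NA & NA stays NA
cond_t1 <- (t1 - hire) >= 395 & (t1 - hire) <= 500
# t2 is 442 days after hire, so this condition is TRUE
cond_t2 <- (t2 - hire) >= 395 & (t2 - hire) <= 500

cond_t1
#> [1] NA
cond_t2
#> [1] TRUE

# The outer condition is NA, so if_else returns NA without using the t2 branch
res <- if_else(cond_t1, 1, if_else(cond_t2, 1, 0))
res
#> [1] NA
```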
Tags: r, if-statement, dplyr
1 answer (2 votes)

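You can use dplyr::case_when instead of nested if_else statements. It lets you control explicitly how NA is treated: case_when uses the first condition that matches, so put the NA rule first. dplyr::between will also tidy up your date comparisons.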

df1 %>%
  mutate(com = case_when(
    is.na(t1) | is.na(t2) ~ 999, # or however you want to treat NA cases
    between(t1 - hire_date, 395, 500) ~ 1,
    between(t2 - hire_date, 395, 500) ~ 1,
    TRUE ~ 0 # neither range is between 395 and 500
  ))

#>     hire_date         t1         t2 com
#> 1  1999-08-20 2000-05-13 2000-02-17   0
#> 2  1999-04-23 2000-11-11 2000-04-27   0
#> 3  1999-03-26 2000-04-15 2000-08-01   1
#> 4  1999-05-07 2000-06-03 2000-08-29   1
#> 5  1999-04-30 2000-05-27 2000-11-19   0
#> 6  1999-04-09 2000-12-30 2000-01-26   0
#> 7  1999-03-12 2000-12-23 2000-12-07   0
#> 8  1999-06-25 2000-02-12 2000-09-26   1
#> 9  1999-02-26 2000-05-06 2000-08-23   1
#> 10 1999-01-01       <NA> 2000-03-18 999
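
If you would rather treat an unknown date as "not in range" and still let the other date qualify, one possible variation is sketched below. It is an assumption about what the asker might want, not part of the answer above. The helper in_range is hypothetical, and the single-row data frame reproduces row 10 of the table by hand:

```r
library(dplyr)

# Row 10 of the example data, entered manually (illustrative values)
row10 <- data.frame(
  hire_date = as.Date("1999-01-01"),
  t1        = as.Date(NA),
  t2        = as.Date("2000-03-18")
)

# Hypothetical helper: TRUE only when the gap is known and within [lo, hi] days
in_range <- function(later, earlier, lo = 395, hi = 500) {
  gap <- as.numeric(later - earlier, units = "days")
  !is.na(gap) & gap >= lo & gap <= hi
}

# An NA gap counts as FALSE, so t2 can still set com to 1
out <- row10 %>%
  mutate(com = if_else(in_range(t1, hire_date) | in_range(t2, hire_date), 1, 0))

out$com
#> [1] 1
```

Here row 10 gets 1 instead of 999, because t2 is 442 days after hire_date. Which behaviour is right depends on whether a missing t1 should disqualify the row or simply be ignored.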