Fill interior NAs with zeros

Question (votes: 0, answers: 5)

My data frame has a date column and two numeric columns that contain some NAs, like this:

df
#          Date  a  b
# 1  1990-02-01 NA NA
# 2  1990-03-01 NA NA
# 3  1990-04-01 NA  3
# 4  1990-05-01  1  4
# 5  1990-06-01  2  5
# 6  1990-07-01  3 NA
# 7  1990-08-01  4  7
# 8  1990-09-01  5 NA
# 9  1990-10-01  6  9
# 10 1990-11-01  7 NA
# 11 1990-12-01  8 NA
# 12 1991-01-01  9 NA
# 13 1991-02-01 10 13
# 14 1991-03-01 11 14
# 15 1991-04-01 12 15
# 16 1991-05-01 13 NA

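I want to keep the NAs that come before the start of each time series, and replace every NA after that point with zero. The final result should look like this: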

finaldf
#          Date  a  b
# 1  1990-02-01 NA NA
# 2  1990-03-01 NA NA
# 3  1990-04-01 NA  3
# 4  1990-05-01  1  4
# 5  1990-06-01  2  5
# 6  1990-07-01  3  0
# 7  1990-08-01  4  7
# 8  1990-09-01  5  0
# 9  1990-10-01  6  9
# 10 1990-11-01  7  0
# 11 1990-12-01  8  0
# 12 1991-01-01  9  0
# 13 1991-02-01 10 13
# 14 1991-03-01 11 14
# 15 1991-04-01 12 15
# 16 1991-05-01 13  0

Is there a replace/fill function in some handy package that does this? If not, how would you solve it yourself?

Data

df <- data.frame(Date=seq(lubridate::ymd('1990-02-01'), lubridate::ymd('1991-05-01'), by='1 month'), 
                 a=c(rep(NA, 3), 1:13), 
                 b=c(NA, NA, 3, 4, 5, NA, 7, NA, 9, NA, NA, NA, 13, 14, 15, NA))

finaldf <- data.frame(Date=seq(lubridate::ymd('1990-02-01'), lubridate::ymd('1991-05-01'), by='1 month'), 
                      a=c(rep(NA, 3), 1:13), 
                      b=c(NA, NA, 3, 4, 5, 0, 7, 0, 9, 0, 0, 0, 13, 14, 15, 0))
r replace na fill
5 Answers
2 votes

You could consider something like this:

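library(lubridate)

df <- data.frame(Date = seq(ymd('1990-02-01'), ymd('1991-05-01'), by = '1 month'),
                 a = c(rep(NA, 3), 1:13),
                 b = c(NA, NA, 3, 4, 5, NA, 7, NA, 9, NA, NA, NA, 13, 14, 15, NA))

# Zero out the NAs in b that fall after its first observation (1990-04-01)
df$b <- ifelse(is.na(df$b) & (df$Date > "1990-04-01"), 0, df$b)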

df
         Date  a  b
1  1990-02-01 NA NA
2  1990-03-01 NA NA
3  1990-04-01 NA  3
4  1990-05-01  1  4
5  1990-06-01  2  5
6  1990-07-01  3  0
7  1990-08-01  4  7
8  1990-09-01  5  0
9  1990-10-01  6  9
10 1990-11-01  7  0
11 1990-12-01  8  0
12 1991-01-01  9  0
13 1991-02-01 10 13
14 1991-03-01 11 14
15 1991-04-01 12 15
16 1991-05-01 13  0
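
The cutoff date above is hard-coded for column b. As a minimal sketch, it could instead be derived from the data, assuming the series starts at its first non-NA value. The names first_obs and b_filled are introduced here for illustration only:

```r
# Rebuild the example column so the snippet is self-contained
Date <- seq(as.Date("1990-02-01"), as.Date("1991-05-01"), by = "1 month")
b <- c(NA, NA, 3, 4, 5, NA, 7, NA, 9, NA, NA, NA, 13, 14, 15, NA)

# first_obs (hypothetical name): date of the first non-NA value in b
first_obs <- Date[which(!is.na(b))[1]]

# Replace only the NAs that come after the series has started
b_filled <- ifelse(is.na(b) & Date > first_obs, 0, b)
```

This assumes the column has at least one non-NA value; otherwise first_obs is NA and nothing sensible is replaced.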

2 votes

We can do this by applying across() to columns a and b, combined with an ifelse statement:

library(dplyr)

df %>% 
  mutate(across(c(a, b), ~ifelse(Date > Date[4] & is.na(.), 0, .)))
 # mutate(across(c(a, b), ~ifelse(Date > Date[which(a == 1)] & is.na(.), 0, .))) # more general; which() skips the NA comparisons in a
 Date           a     b
   <date>     <int> <dbl>
 1 1990-02-01    NA    NA
 2 1990-03-01    NA    NA
 3 1990-04-01    NA     3
 4 1990-05-01     1     4
 5 1990-06-01     2     5
 6 1990-07-01     3     0
 7 1990-08-01     4     7
 8 1990-09-01     5     0
 9 1990-10-01     6     9
10 1990-11-01     7     0
11 1990-12-01     8     0
12 1991-01-01     9     0
13 1991-02-01    10    13
14 1991-03-01    11    14
15 1991-04-01    12    15
16 1991-05-01    13     0

2 votes

This is much like TarJae's answer, but slightly more dynamic, because it does not rely on a fixed row or date:

library(dplyr)

df %>% 
  mutate(across(c(a, b), ~ifelse(cumsum(!is.na(.)) > 0 & is.na(.), 0, .)))

This returns

         Date  a  b
1  1990-02-01 NA NA
2  1990-03-01 NA NA
3  1990-04-01 NA  3
4  1990-05-01  1  4
5  1990-06-01  2  5
6  1990-07-01  3  0
7  1990-08-01  4  7
8  1990-09-01  5  0
9  1990-10-01  6  9
10 1990-11-01  7  0
11 1990-12-01  8  0
12 1991-01-01  9  0
13 1991-02-01 10 13
14 1991-03-01 11 14
15 1991-04-01 12 15
16 1991-05-01 13  0
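
To see why the cumsum condition works, here is a small sketch on a toy vector (x, seen, started and filled are illustrative names, not part of the answer). The running count of non-NA values stays at zero until the first real observation, so leading NAs are never touched:

```r
x <- c(NA, NA, 3, NA, 5, NA)

# Running count of non-NA values seen so far
seen <- cumsum(!is.na(x))
seen
# [1] 0 0 1 1 2 2

# An NA is replaced only once at least one real value has been seen
started <- seen > 0
filled <- ifelse(started & is.na(x), 0, x)
filled
# [1] NA NA  3  0  5  0
```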

2 votes

We can use replace on the range between which.min and which.max; no packages needed.

u <- which.min(df$b):which.max(df$b)
df$b[u] <- replace(df$b[u], is.na(df$b[u]), 0)
df
#          Date  a  b
# 1  1990-02-01 NA NA
# 2  1990-03-01 NA NA
# 3  1990-04-01 NA  3
# 4  1990-05-01  1  4
# 5  1990-06-01  2  5
# 6  1990-07-01  3  0
# 7  1990-08-01  4  7
# 8  1990-09-01  5  0
# 9  1990-10-01  6  9
# 10 1990-11-01  7  0
# 11 1990-12-01  8  0
# 12 1991-01-01  9  0
# 13 1991-02-01 10 13
# 14 1991-03-01 11 14
# 15 1991-04-01 12 15
# 16 1991-05-01 13 NA
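
Two things are worth noting about the result above. First, which.min and which.max locate the first and last observations only because b happens to be increasing. Second, the trailing NA in row 16 lies outside that range and stays NA, whereas the question's finaldf has 0 there. A minimal sketch that does not depend on the ordering of the values and also fills trailing NAs, assuming at least one non-NA value exists (obs and from_first are illustrative names):

```r
b <- c(NA, NA, 3, 4, 5, NA, 7, NA, 9, NA, NA, NA, 13, 14, 15, NA)

# Positions of the observed (non-NA) values
obs <- which(!is.na(b))

# Everything from the first observation onward, including trailing NAs
from_first <- seq(obs[1], length(b))
b[from_first] <- replace(b[from_first], is.na(b[from_first]), 0)
```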

Data:

df <- structure(list(Date = structure(c(7336, 7364, 7395, 7425, 7456, 
7486, 7517, 7548, 7578, 7609, 7639, 7670, 7701, 7729, 7760, 7790
), class = "Date"), a = c(NA, NA, NA, 1L, 2L, 3L, 4L, 5L, 6L, 
7L, 8L, 9L, 10L, 11L, 12L, 13L), b = c(NA, NA, 3, 4, 5, NA, 7, 
NA, 9, NA, NA, NA, 13, 14, 15, NA)), class = "data.frame", row.names = c(NA, 
-16L))

0 votes

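zoo::na.fill takes a second argument of 3 elements, which are used to fill leading, interior, and trailing NAs respectively, so either of the following works: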

library(zoo)

# Return a modified copy, leaving df unchanged
replace(df, -1, na.fill(df[-1], c(NA, 0, 0)))

# Or overwrite the non-date columns of df in place
df[-1] <- na.fill(df[-1], c(NA, 0, 0))
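
For readers without zoo installed, the leading/interior/trailing idea can be sketched in base R for a single vector. fill_na3 is a hypothetical helper written for this illustration; it is not part of zoo and makes no claim to match na.fill in edge cases:

```r
# fill_na3: hypothetical helper, not a zoo function
# fill = c(leading, interior, trailing) replacement values
fill_na3 <- function(x, fill = c(NA, 0, 0)) {
  obs <- which(!is.na(x))
  if (length(obs) == 0) return(x)  # nothing observed: leave x as is
  pos <- seq_along(x)
  na <- is.na(x)
  x[na & pos < min(obs)] <- fill[1]                   # leading NAs
  x[na & pos > min(obs) & pos < max(obs)] <- fill[2]  # interior NAs
  x[na & pos > max(obs)] <- fill[3]                   # trailing NAs
  x
}

res <- fill_na3(c(NA, 1, NA, 2, NA))
res
# [1] NA  1  0  2  0
```

To apply it to every non-date column of the question's data frame, something like df[-1] <- lapply(df[-1], fill_na3) should work.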