Adding missing dates to time series data [duplicate]

Question

I have data with random dates between 2008 and 2020, along with their corresponding values.

Date                    Val
September 16, 2012       32
September 19, 2014       33
January 05, 2008         26
June 07, 2017            02
December 15, 2019        03
May 28, 2020             18

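I want to fill in the missing dates from January 1, 2008 to March 31, 2020, setting the corresponding value to 1 for each added date.

I looked at a few posts, such as Post 1 and Post 2, but I could not solve the problem based on them. I am a beginner in R.

I am looking for data like this: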

 Date                    Val
 January 01, 2008        1
 January 02, 2008        1
 January 03, 2008        1
 January 04, 2008        1
 January 05, 2008       26
 ........

Answer 1

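Using tidyr::complete:

library(dplyr)

df %>%
  # Parse the "Month dd, YYYY" strings into Date objects
  mutate(Date = as.Date(Date, "%B %d, %Y")) %>%
  # Add a row for every day in the range; newly added rows get Val = 1
  tidyr::complete(Date = seq(as.Date('2008-01-01'), as.Date('2020-03-31'), 
                           by = 'day'), fill = list(Val = 1)) %>%
  # Convert the dates back to the original text format
  mutate(Date = format(Date, "%B %d, %Y"))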


# A tibble: 4,475 x 2
#   Date               Val
#   <chr>            <dbl>
# 1 January 01, 2008     1
# 2 January 02, 2008     1
# 3 January 03, 2008     1
# 4 January 04, 2008     1
# 5 January 05, 2008    26
# 6 January 06, 2008     1
# 7 January 07, 2008     1
# 8 January 08, 2008     1
# 9 January 09, 2008     1
#10 January 10, 2008     1
# … with 4,465 more rows

Data

df <- structure(list(Date = c("September 16, 2012", "September 19, 2014", 
"January 05, 2008", "June 07, 2017", "December 15, 2019", "May 28, 2020"
), Val = c(32L, 33L, 26L, 2L, 3L, 18L)), class = "data.frame", 
row.names = c(NA, -6L))
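
Note that the range from 2008-01-01 to 2020-03-31 contains 4,474 days, while the result above has 4,475 rows. The extra row is May 28, 2020: complete() keeps rows already present in the data even when they fall outside the supplied sequence. If only dates inside the window are wanted, a filter can be added afterwards. The following is a minimal sketch under that assumption; the names start, end and filled are introduced here for illustration, and the fill value is written as 1L so that Val stays an integer column. Parsing with "%B" assumes an English locale.

```r
library(dplyr)

# Same sample data as in the question
df <- data.frame(
  Date = c("September 16, 2012", "September 19, 2014", "January 05, 2008",
           "June 07, 2017", "December 15, 2019", "May 28, 2020"),
  Val = c(32L, 33L, 26L, 2L, 3L, 18L)
)

# Hypothetical helper variables for the requested window
start <- as.Date("2008-01-01")
end   <- as.Date("2020-03-31")

filled <- df %>%
  mutate(Date = as.Date(Date, "%B %d, %Y")) %>%
  tidyr::complete(Date = seq(start, end, by = "day"), fill = list(Val = 1L)) %>%
  # complete() retains existing rows outside the sequence,
  # so drop anything outside the requested window explicitly
  filter(Date >= start, Date <= end)

nrow(filled)
```

The dates can be converted back to text with format(Date, "%B %d, %Y") as in the answer above.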

Answer 2

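We can create a data frame covering the desired date range, left-join our data onto it, and replace all NAs with 1.

library(tidyverse)
days_seq %>% 
  # Keep every day in the range; days missing from df get NA in Val
  left_join(df) %>% 
  # Replace the NAs introduced by the join with 1
  mutate(Val = if_else(is.na(Val), 1L, Val))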

Joining, by = "Date"
# A tibble: 4,474 x 2
   Date         Val
   <date>     <int>
 1 2008-01-01     1
 2 2008-01-02     1
 3 2008-01-03     1
 4 2008-01-04     1
 5 2008-01-05    33
 6 2008-01-06     1
 7 2008-01-07     1
 8 2008-01-08     1
 9 2008-01-09     1
10 2008-01-10     1
# ... with 4,464 more rows

Data

days_seq <- tibble(Date = seq(as.Date("2008/01/01"), as.Date("2020/03/31"), "days"))

df <- tibble::tribble(
                   ~Date, ~Val,
        "2012/09/16",  32L,
        "2012/09/19",  33L,
        "2008/01/05",  33L
        ) 
df$Date <- as.Date(df$Date)
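
The same join-then-replace idea can also be sketched in base R, without loading the tidyverse. This is an illustrative alternative rather than part of the original answer; it reuses the three sample rows from this answer, and the name out is introduced here.

```r
# Full daily sequence for the target range
days_seq <- data.frame(
  Date = seq(as.Date("2008-01-01"), as.Date("2020-03-31"), by = "day")
)

# Sample data from this answer, with Date already of class Date
df <- data.frame(
  Date = as.Date(c("2012-09-16", "2012-09-19", "2008-01-05")),
  Val  = c(32L, 33L, 33L)
)

# all.x = TRUE keeps every row of days_seq, which makes this a left join
out <- merge(days_seq, df, by = "Date", all.x = TRUE)

# Days with no match in df have NA in Val; set those to 1
out$Val[is.na(out$Val)] <- 1L

head(out)
```

Because the join starts from days_seq, any date in df outside the range is dropped automatically, so the result always has one row per day in the window.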
