Here is an example of the data frame I have:
df = data.frame(matrix(rnorm(84, mean = 0, sd = 1), nrow = 12, ncol = 7), stringsAsFactors = F)
df$Date <- seq(as.Date("2000/01/01"), as.Date("2002/10/01"), by = "quarter")
X1 X2 X3 X4 X5 X6 X7 Date
1 -0.22665838 -0.21435535 -0.9060361 -0.7544181 0.3697487 0.226183639 -0.35333109 2000-01-01
2 0.36459588 -0.92357903 -0.7474181 0.3930116 -0.8483455 0.001053074 -0.11071567 2000-04-01
3 0.32772746 -0.95863346 -0.2461959 0.8573144 -1.4050863 -0.851132640 0.22984387 2000-07-01
4 -1.22891784 0.59263058 -0.3155725 -0.3867662 -0.5893056 -0.246202375 0.97845330 2000-10-01
5 -0.07124602 -0.62971959 -0.1990532 -1.2540578 -0.3347652 1.061019031 -0.99044363 2001-01-01
6 1.01317419 1.18537830 0.6241457 -1.4412657 -0.3241036 0.900829237 0.06419316 2001-04-01
7 0.28590272 -1.25413779 -0.4076524 1.0633591 -0.3921616 -0.231332349 -0.82489456 2001-07-01
8 -0.83591105 0.39544445 -1.1275454 -0.8467141 -0.1827673 0.650371871 0.68155623 2001-10-01
9 -0.14689026 0.76575239 -2.3750439 -0.1958910 0.3578670 0.064873489 0.32252314 2002-01-01
10 1.26846657 -0.04560596 -0.9959704 0.3926218 -1.7770232 1.202433913 -0.05919982 2002-04-01
11 -2.01557623 -0.23142037 0.8722606 -0.1013923 -0.9775133 -1.463026339 -0.72456546 2002-07-01
12 0.30603648 -0.24289366 -1.0580142 0.8721441 2.0560490 1.357803811 0.36357346 2002-10-01
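What I would like to do is add rows of 0 for the missing months (note that in the "fake" dataset the frequency is quarterly, but in the real data there is no regularity). Ideally, I would end up with this: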
X1 X2 X3 X4 X5 X6 X7 Date
1 -0.22665838 -0.21435535 -0.9060361 -0.7544181 0.3697487 0.226183639 -0.35333109 2000-01-01
2 0 0 0 0 0 0 0 2000-02-01
3 0 0 0 0 0 0 0 2000-03-01
4 0.36459588 -0.92357903 -0.7474181 0.3930116 -0.8483455 0.001053074 -0.11071567 2000-04-01
5 0 0 0 0 0 0 0 2000-05-01
6 0 0 0 0 0 0 0 2000-06-01
7 0.32772746 -0.95863346 -0.2461959 0.8573144 -1.4050863 -0.851132640 0.22984387 2000-07-01
8 0 0 0 0 0 0 0 2000-08-01
9 0 0 0 0 0 0 0 2000-09-01
10 -1.22891784 0.59263058 -0.3155725 -0.3867662 -0.5893056 -0.246202375 0.97845330 2000-10-01
#and so on and so forth
I know how to do this for only two columns (the date column and one data column):
library(dplyr)
library(tidyr)
df <- df %>%
  mutate(Date = as.Date(as.character(Date, "%Y-%m-%d")), Month = format(Date, "%m"), Year = format(Date, "%Y")) %>%
  # add a row for every month of every year present in the data
  complete(Month = formatC(1:12, width = 2, flag = "0"), nesting(Year)) %>%
  # rebuild the date for the newly added rows
  mutate(Date = if_else(is.na(Date), as.Date(paste(Year, Month, "1", sep = "-"), "%Y-%m-%d"), Date)) %>%
  arrange(Date) %>%
  select(Date, Columnname) %>%
  # replace the NA introduced by complete() with 0
  mutate(Columnname = if_else(is.na(Columnname), 0, Columnname))
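Basically, I need to get a monthly frequency (from an irregular one), and wherever a month is missing I want all columns in df to be 0. I can't work out how to extend what I did above to several columns.
Can anyone help me?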
There is a fill option in complete.
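Building on that hint, here is a minimal sketch of how the fill argument of tidyr's complete() might be applied to every column at once. It assumes that every date in the real data falls on the first of a month (as in the example) and that all non-date columns are numeric. The names value_cols, fill_list and monthly, and the fixed seed, are introduced here only for illustration.

```r
library(dplyr)
library(tidyr)

set.seed(1)  # assumption: fixed seed so the example is reproducible
df <- data.frame(matrix(rnorm(84, mean = 0, sd = 1), nrow = 12, ncol = 7))
df$Date <- seq(as.Date("2000-01-01"), as.Date("2002-10-01"), by = "quarter")

# Every column except Date should receive 0 in the added rows
value_cols <- setdiff(names(df), "Date")
fill_list <- setNames(as.list(rep(0, length(value_cols))), value_cols)

monthly <- df %>%
  # expand Date to one row per month between the first and last observed date;
  # fill = fill_list replaces the resulting NA values with 0
  complete(Date = seq(min(Date), max(Date), by = "month"), fill = fill_list) %>%
  arrange(Date)
```

If the real dates are not all on the first of the month, they would probably need to be rounded down to the month start before calling complete(), otherwise the existing rows will not line up with the generated monthly sequence.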