近似误差,需要至少两个非NA值进行插值

问题描述 投票:0回答:2

我正在尝试使用roxofun通过插值法计算缺失值:

column_name <- colnames(vndusd_merged);

lapply(column_name, function(x){
  if(x != "Date"){ 
    interpl <- approxfun(vndusd_merged$Date[!is.na(vndusd_merged$x)], vndusd_merged$x[!is.na(vndusd_merged$x)]);
    vndusd_merged$x <- interpl(vndusd_merged$Date);  
  }
})

我一直收到此错误:

Error in approxfun(vndusd_merged$Date[!is.na(vndusd_merged$x)], vndusd_merged$x[!is.na(vndusd_merged$x)]) : 
  need at least two non-NA values to interpolate 
4.
stop("need at least two non-NA values to interpolate") 
3.
approxfun(vndusd_merged$Date[!is.na(vndusd_merged$x)], vndusd_merged$x[!is.na(vndusd_merged$x)]) 
2.
FUN(X[[i]], ...) 
1.
lapply(column_name, function(x) {
    if (x != "Date") {
        interpl <- approxfun(vndusd_merged$Date[!is.na(vndusd_merged$x)], 
            vndusd_merged$x[!is.na(vndusd_merged$x)]) ... 

这里是vndusd_merged的前20行的样本。 “日期”列没有任何N / A

         Date Ask.Close Bid.Close
1  01/01/2014     21115     21075
2  02/01/2014     21160     21060
3  03/01/2014     21115     21075
4  04/01/2014        NA        NA
5  05/01/2014        NA        NA
6  06/01/2014     21120     21080
7  07/01/2014     21115     21075
8  08/01/2014     21120     21080
9  09/01/2014     21115     21075
10 10/01/2014     21110     21072
11 11/01/2014        NA        NA
12 12/01/2014        NA        NA
13 13/01/2014     21120     21060
14 14/01/2014     21110     21072
15 15/01/2014     21110     21070
16 16/01/2014     21120     21080
17 17/01/2014     21110     21070
18 18/01/2014        NA        NA
19 19/01/2014        NA        NA
20 20/01/2014     21110     21070

我尝试通过手动插入列名来运行它,但仍然出现相同的错误。

interpl <- aproxfun(vndusd_merged$Date[!is.na(vndusd_merged$Ask.Close)], vndusd_merged$Ask.Close[!is.na(vndusd_merged$Ask.Close)]);

我该如何解决这个问题?

r
2个回答
1
投票

以下代码完成了问题的要求。

vndusd_merged$Date <- as.Date(vndusd_merged$Date, "%d/%m/%Y")

vndusd_merged[-1] <- lapply(vndusd_merged[-1], function(x){
  i <- !is.na(x)
  f <- approxfun(vndusd_merged$Date[i], x[i])
  y <- f(vndusd_merged$Date)
  y
})

vndusd_merged
#         Date Ask.Close Bid.Close
#1  2014-01-01  21115.00  21075.00
#2  2014-01-02  21160.00  21060.00
#3  2014-01-03  21115.00  21075.00
#4  2014-01-04  21116.67  21076.67
#5  2014-01-05  21118.33  21078.33
#6  2014-01-06  21120.00  21080.00
#7  2014-01-07  21115.00  21075.00
#8  2014-01-08  21120.00  21080.00
#9  2014-01-09  21115.00  21075.00
#10 2014-01-10  21110.00  21072.00
#11 2014-01-11  21113.33  21068.00
#12 2014-01-12  21116.67  21064.00
#13 2014-01-13  21120.00  21060.00
#14 2014-01-14  21110.00  21072.00
#15 2014-01-15  21110.00  21070.00
#16 2014-01-16  21120.00  21080.00
#17 2014-01-17  21110.00  21070.00
#18 2014-01-18  21110.00  21070.00
#19 2014-01-19  21110.00  21070.00
#20 2014-01-20  21110.00  21070.00

如果要使用列名向量,在这种情况下不等于"Date",请使用上面的代码,但将其应用于其他子数据帧。

column_name <- colnames(vndusd_merged)
column_name <- column_name[column_name != "Date"]

vndusd_merged[column_name] <- lapply(vndusd_merged[column_name], function(x){
  #same code as above
})

0
投票

您可以使用approx做得更简洁一些。

ip <- sapply(vndusd_merged[-1], function(x) with(vndusd_merged, approx(Date, x, xout=Date)$y))
cbind(vndusd_merged[1], ip)
#          Date Ask.Close Bid.Close
# 1  01/01/2014  21115.00  21075.00
# 2  02/01/2014  21160.00  21060.00
# 3  03/01/2014  21115.00  21075.00
# 4  04/01/2014  21116.67  21076.67
# 5  05/01/2014  21118.33  21078.33
# 6  06/01/2014  21120.00  21080.00
# 7  07/01/2014  21115.00  21075.00
# 8  08/01/2014  21120.00  21080.00
# 9  09/01/2014  21115.00  21075.00
# 10 10/01/2014  21110.00  21072.00
# 11 11/01/2014  21113.33  21068.00
# 12 12/01/2014  21116.67  21064.00
# 13 13/01/2014  21120.00  21060.00
# 14 14/01/2014  21110.00  21072.00
# 15 15/01/2014  21110.00  21070.00
# 16 16/01/2014  21120.00  21080.00
# 17 17/01/2014  21110.00  21070.00
# 18 18/01/2014  21110.00  21070.00
# 19 19/01/2014  21110.00  21070.00
# 20 20/01/2014  21110.00  21070.00
© www.soinside.com 2019 - 2024. All rights reserved.