I am struggling to work out the logic for a for loop. Consider the following two data frames.
>> df1
A B C hit_time E
0 bar one small 2019-12-11 19:16:51 1
1 bar one large 2019-12-09 20:21:43 2
2 foo two large 2019-12-11 15:11:24 1
3 bar two small 2019-12-05 16:41:21 2
4 bar two small 2019-12-06 17:31:20 3
5 bar one large 2019-12-03 19:13:06 2
6 bar one small 2019-12-04 18:25:04 1
7 bar two small 2019-12-02 21:45:38 1
8 bar two large 2019-12-08 20:32:44 1
>> df2
X Y Z Phase_One Phase_Two Phase_Three
0 foo one small 2019-12-01 06:18:00 2019-12-01 06:38:00 2019-12-01 06:48:00
1 bar one small 2019-12-01 06:33:00 2019-12-01 06:53:00 2019-12-01 07:03:00
2 foo two large 2019-12-11 15:01:24 2019-12-11 15:21:24 2019-12-11 15:31:24
3 bar two small 2019-12-05 16:31:21 2019-12-05 16:51:21 2019-12-05 17:01:21
4 bar two small 2019-12-06 17:21:20 2019-12-06 17:41:20 2019-12-06 17:51:20
5 bar one large 2019-12-03 19:03:06 2019-12-03 19:23:06 2019-12-03 19:33:06
6 bar one large 2019-12-04 18:15:04 2019-12-04 18:35:04 2019-12-04 18:45:04
7 bar two large 2019-12-02 21:35:38 2019-12-02 21:55:38 2019-12-02 22:05:38
8 bar two large 2019-12-08 20:22:44 2019-12-08 20:42:44 2019-12-08 20:52:44
Now I want to do the following in a for loop:
df3 <- df2[, Phase_One_ := df1[df2,on=.(hit_time >= Phase_One, hit_time <= Start_Time), sum(E),by=.EACHI]$V1]
df3 <- df2[, Phase_Two_ := df1[df2,on=.(hit_time >= Start_Time, hit_time <= Phase_Two), sum(E),by=.EACHI]$V1]
df3 <- df2[, Phase_Three_ := df1[df2,on=.(hit_time >= Phase_Two, hit_time <= Phase_Three),sum(E),by=.EACHI]$V1]
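for each value of df1$C and df2$Z respectively. I understand that for loops are slow, so any suggestions are welcome on how to sum column E into a new data frame according to the last three columns of df2, which hold the different time frames. Thanks in advance.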
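I wrote a function for another project that does exactly this. (It splits the data frame; once you have computed what you need, you can refer to the correct table in the output list.)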
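multifilter <- function(data, filterorder) {
  # Stop early if any requested filter column does not exist
  if (!all(filterorder %in% names(data))) {
    stop("At least one filter does not exist")
  }
  newdata <- list(data)
  # Split on each filter column in turn, last filter first
  for (i in rev(filterorder)) {
    # [[i]] selects the column by name for data.frame and data.table alike
    newdata <- unlist(
      lapply(sort(unique(data[[i]])), function(x) lapply(newdata, function(y) y[y[[i]] == x, ])),
      recursive = FALSE
    )
  }
  # Keep only the combinations that actually contain rows
  newdata[sapply(newdata, nrow) >= 1]
}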
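You can call it like this: multifilter(df1, "C"). The function itself is more general than your task requires, because it can accept several filters at once, for example multifilter(df2, c("Y", "Z")). If you want to stitch the multifilter output back together, you can use do.call(rbind, multi_filter_data), where multi_filter_data is the list returned by multifilter.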
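To make the split-then-compute idea concrete, here is a minimal sketch of how the per-group tables could feed the window sums from the question. It rests on several assumptions: the toy data frames are made up (loosely modelled on df1 and df2), base R's split() stands in for multifilter so the block is self-contained, and the window is taken as [Phase_One, Phase_Two] because Start_Time does not appear in the df2 printed above. The helper name window_sum is hypothetical.

```r
# Toy data, loosely modelled on df1 and df2 from the question (values are illustrative)
df1 <- data.frame(
  C = c("small", "large", "small", "large"),
  hit_time = as.POSIXct(c("2019-12-05 16:41:21", "2019-12-03 19:13:06",
                          "2019-12-06 17:31:20", "2019-12-11 15:11:24"), tz = "UTC"),
  E = c(2, 2, 3, 1)
)
df2 <- data.frame(
  Z = c("small", "small", "large", "large", "small"),
  Phase_One = as.POSIXct(c("2019-12-05 16:31:21", "2019-12-06 17:21:20",
                           "2019-12-03 19:03:06", "2019-12-11 15:01:24",
                           "2019-12-01 06:33:00"), tz = "UTC"),
  Phase_Two = as.POSIXct(c("2019-12-05 16:51:21", "2019-12-06 17:41:20",
                           "2019-12-03 19:23:06", "2019-12-11 15:21:24",
                           "2019-12-01 06:53:00"), tz = "UTC")
)

# Split the hits by category once (split() is used here instead of multifilter)
hits_by_group <- split(df1, df1$C)

# Hypothetical helper: sum E over the hits of row i's group that fall
# inside that row's [Phase_One, Phase_Two] window (assumed window bounds)
window_sum <- function(i) {
  hits <- hits_by_group[[as.character(df2$Z[i])]]
  if (is.null(hits)) return(0)  # no hits at all for this category
  inside <- hits$hit_time >= df2$Phase_One[i] & hits$hit_time <= df2$Phase_Two[i]
  sum(hits$E[inside])
}

# One sum per row of df2; a window with no hits gets 0
df2$Phase_One_ <- vapply(seq_len(nrow(df2)), window_sum, numeric(1))
print(df2[, c("Z", "Phase_One_")])
```

The same pattern would repeat for the other two windows by swapping the bound columns. vapply still iterates row by row, so on large tables the non-equi join from the question is likely to be faster; this sketch only shows how the split tables can be looked up by group.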