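Given the class and data below, I want to compute the sum of the counts for each consecutive 3-hour sliding window, similar to the result shown further down: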
public class Log {
    private int id;
    private LocalDateTime timestamp;
    private int count;
}
id timestamp count
1 2018-10-10T08:00:00 12
2 2018-10-10T08:30:00 5
3 2018-10-10T08:45:00 7
4 2018-10-10T09:10:00 9
5 2018-10-10T09:50:00 3
6 2018-10-10T10:15:00 8
7 2018-10-10T12:00:00 6
8 2018-10-10T12:30:00 1
9 2018-10-10T12:45:00 2
10 2018-10-10T17:30:00 4
11 2018-10-10T17:35:00 7
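The log timestamps are in ascending order, and the counts are summed over each 3-hour window (which can span different days), with every window starting at one of the records. The result would be: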
2018-10-10T08:00:00 ~ 2018-10-10T10:59:00 12+5+7+9+3+8
2018-10-10T08:30:00 ~ 2018-10-10T11:29:00 5+7+9+3+8
2018-10-10T08:45:00 ~ 2018-10-10T11:44:00 7+9+3+8
2018-10-10T09:10:00 ~ 2018-10-10T12:09:00 9+3+8+6
2018-10-10T09:50:00 ~ 2018-10-10T12:49:00 3+8+6+1+2
2018-10-10T10:15:00 ~ 2018-10-10T13:14:00 8+6+1+2
...
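I have some sample code below, but it feels inefficient when there are many logs, because for every log I filter and compare the timestamps of all the logs again. How can I compare only from the current log to the end?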
List<Log> logs = List.of(); // the logs, sorted by timestamp
logs.stream().map(log -> {
    var start = log.getTimeStamp();
    var end = start.plusHours(3);
    var logsWithinWindow = logs.stream().filter(l -> isWithinRange(start, end, l.getTimeStamp()));
    // mapToInt yields an IntStream, which (unlike Stream<Integer>) has sum()
    return logsWithinWindow.mapToInt(Log::getCount).sum();
});
If you are counting logs within an arbitrary duration, you can use:
int countLogsInDuration(List<Log> logs, LocalDateTime start, LocalDateTime end) {
    return logs.stream()
        .filter(log -> isWithinRange(log.getTimeStamp(), start, end))
        .mapToInt(Log::getCount)
        .sum();
}
It relies on:
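private static boolean isWithinRange(LocalDateTime logTimestamp, LocalDateTime start, LocalDateTime end) {
    // Assumed semantics: half-open range [start, end), so a log exactly at `end` is excluded.
    // Adjust the comparison if your windows should include the end instant.
    return !logTimestamp.isBefore(start) && logTimestamp.isBefore(end);
}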
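Since the logs are already sorted by timestamp, one possible way to avoid re-filtering the whole list for every log is a two-pointer pass that keeps a running sum. The following is only a sketch under some assumptions: the nested `Log` class is a minimal stand-in for the one in the question, `windowSums` and `sampleLogs` are hypothetical names, windows are treated as half-open (`[start, start + 3h)`), and timestamps are assumed to be strictly increasing.

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

class SlidingWindowSums {

    // Minimal stand-in for the Log class from the question (getter names assumed).
    static final class Log {
        private final LocalDateTime timestamp;
        private final int count;

        Log(String isoTimestamp, int count) {
            this.timestamp = LocalDateTime.parse(isoTimestamp);
            this.count = count;
        }

        LocalDateTime getTimeStamp() { return timestamp; }
        int getCount() { return count; }
    }

    // For each log i, sums the counts of logs i, i+1, ... whose timestamps
    // fall before t_i + 3h. Requires logs sorted by timestamp ascending.
    static List<Integer> windowSums(List<Log> logs) {
        List<Integer> sums = new ArrayList<>(logs.size());
        int right = 0;   // first index not yet added to the running sum
        int running = 0; // sum of counts for indices [i, right)
        for (int i = 0; i < logs.size(); i++) {
            LocalDateTime end = logs.get(i).getTimeStamp().plusHours(3);
            // Extend the window to the right; `right` never moves backwards.
            while (right < logs.size() && logs.get(right).getTimeStamp().isBefore(end)) {
                running += logs.get(right).getCount();
                right++;
            }
            sums.add(running);
            running -= logs.get(i).getCount(); // log i leaves the window
        }
        return sums;
    }

    // The sample data from the question.
    static List<Log> sampleLogs() {
        return List.of(
            new Log("2018-10-10T08:00:00", 12),
            new Log("2018-10-10T08:30:00", 5),
            new Log("2018-10-10T08:45:00", 7),
            new Log("2018-10-10T09:10:00", 9),
            new Log("2018-10-10T09:50:00", 3),
            new Log("2018-10-10T10:15:00", 8),
            new Log("2018-10-10T12:00:00", 6),
            new Log("2018-10-10T12:30:00", 1),
            new Log("2018-10-10T12:45:00", 2),
            new Log("2018-10-10T17:30:00", 4),
            new Log("2018-10-10T17:35:00", 7));
    }

    public static void main(String[] args) {
        // prints [44, 32, 27, 26, 20, 17, 9, 3, 2, 11, 7]
        System.out.println(windowSums(sampleLogs()));
    }
}
```

Each log is added to the running sum once and subtracted once, so the pass is linear in the number of logs instead of quadratic.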
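Also, at least in your case, counting the logs for every 3-hour window separately seems redundant, since your sliding window appears to advance in 30-minute steps. You could instead compute the count for each 30-minute interval, e.g. 8:00 to 8:30, then 8:30 to 9:00, and so on. This avoids recomputing counts wherever your sliding window overlaps the previous one.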
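The 30-minute idea could be sketched roughly as follows. This is an illustration under stated assumptions, not a definitive implementation: the nested `Log` class is a stand-in for the one in the question, and `bucketStart`, `bucketCounts`, `windowSum`, and `sampleLogs` are hypothetical names. Note that it only answers windows that start on a 30-minute boundary; a window starting at 08:45 cannot be expressed with these buckets.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

class HalfHourBuckets {

    // Minimal stand-in for the Log class from the question (getter names assumed).
    static final class Log {
        private final LocalDateTime timestamp;
        private final int count;

        Log(String isoTimestamp, int count) {
            this.timestamp = LocalDateTime.parse(isoTimestamp);
            this.count = count;
        }

        LocalDateTime getTimeStamp() { return timestamp; }
        int getCount() { return count; }
    }

    // Rounds a timestamp down to the start of its 30-minute bucket.
    static LocalDateTime bucketStart(LocalDateTime t) {
        LocalDateTime hour = t.truncatedTo(ChronoUnit.HOURS);
        return t.getMinute() < 30 ? hour : hour.plusMinutes(30);
    }

    // Sums the counts per 30-minute bucket in a single pass over the logs.
    static NavigableMap<LocalDateTime, Integer> bucketCounts(List<Log> logs) {
        NavigableMap<LocalDateTime, Integer> buckets = new TreeMap<>();
        for (Log log : logs) {
            buckets.merge(bucketStart(log.getTimeStamp()), log.getCount(), Integer::sum);
        }
        return buckets;
    }

    // Total for the window [start, start + 3h); `start` is assumed to lie
    // on a 30-minute boundary so that the window covers whole buckets.
    static int windowSum(NavigableMap<LocalDateTime, Integer> buckets, LocalDateTime start) {
        return buckets.subMap(start, true, start.plusHours(3), false)
                .values().stream().mapToInt(Integer::intValue).sum();
    }

    // The sample data from the question.
    static List<Log> sampleLogs() {
        return List.of(
            new Log("2018-10-10T08:00:00", 12),
            new Log("2018-10-10T08:30:00", 5),
            new Log("2018-10-10T08:45:00", 7),
            new Log("2018-10-10T09:10:00", 9),
            new Log("2018-10-10T09:50:00", 3),
            new Log("2018-10-10T10:15:00", 8),
            new Log("2018-10-10T12:00:00", 6),
            new Log("2018-10-10T12:30:00", 1),
            new Log("2018-10-10T12:45:00", 2),
            new Log("2018-10-10T17:30:00", 4),
            new Log("2018-10-10T17:35:00", 7));
    }

    public static void main(String[] args) {
        NavigableMap<LocalDateTime, Integer> buckets = bucketCounts(sampleLogs());
        // Window 08:00 to 11:00 covers buckets 08:00, 08:30, 09:00, 09:30, 10:00.
        System.out.println(windowSum(buckets, LocalDateTime.parse("2018-10-10T08:00:00")));
    }
}
```

Once the buckets are built, every aligned 3-hour window touches at most six bucket totals rather than every log inside the window.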