我有一个使用Spring Cloud Streams和Kafka Streams构建的流处理应用程序,该系统从应用程序中获取日志,并将其与另一个流处理器的观察结果进行比较并产生一个分数,然后将日志流除以该分数(高于和低于某个阈值)。
拓扑:
问题:
所以我的问题是如何正确实施“对数最佳观察选择器处理器”,在处理日志时,观测值有限,但是可能有很多观测值。
所以我想出了两种解决方案...
Group&Window通过日志ID获得日志评分的主题,然后减少以获取最高分。 (问题:对所有观察结果评分可能要花费比窗口更长的时间)
在每次计分后发送一个计分完成的消息,与日志相关观察合并,使用日志计分观察全局表和交互式查询来检查所有观察ID是否都在全局表存储中(当所有ID都在其中时)商店会映射到得分最高的观察值。 (问题:仅用于交互式查询时,全局表似乎不起作用)
实现我正在尝试的最佳方法是什么?
我希望不创建任何分区,磁盘或内存瓶颈。
从对数和观察值中加入该值时,每个事物都有唯一的ID和相关ID的元组。
((编辑:带有图和更改标题的拓扑的切换文本描述)
解决方案2似乎可以工作,但是它发出警告,因为交互式查询需要一些时间才能准备好-所以我用Transformer实现了相同的解决方案:
@Slf4j
@Configuration
@RequiredArgsConstructor
@SuppressWarnings("unchecked")
public class LogBestObservationsSelectorProcessorConfig {
private String logScoredObservationsStore = "log-scored-observations-store";
private final Serde<LogEntryRelevantObservationIdTuple> logEntryRelevantObservationIdTupleSerde;
private final Serde<LogRelevantObservationIdsTuple> logRelevantObservationIdsTupleSerde;
private final Serde<LogEntryObservationMatchTuple> logEntryObservationMatchTupleSerde;
private final Serde<LogEntryObservationMatchIdsRelevantObservationsTuple> logEntryObservationMatchIdsRelevantObservationsTupleSerde;
@Bean
public Function<
GlobalKTable<LogEntryRelevantObservationIdTuple, LogEntryObservationMatchTuple>,
Function<
KStream<LogEntryRelevantObservationIdTuple, LogEntryRelevantObservationIdTuple>,
Function<
KTable<String, LogRelevantObservationIds>,
KStream<String, LogEntryObservationMatchTuple>
>
>
>
logBestObservationSelectorProcessor() {
return (GlobalKTable<LogEntryRelevantObservationIdTuple, LogEntryObservationMatchTuple> logScoredObservationsTable) ->
(KStream<LogEntryRelevantObservationIdTuple, LogEntryRelevantObservationIdTuple> logScoredObservationProcessedStream) ->
(KTable<String, LogRelevantObservationIdsTuple> logRelevantObservationIdsTable) -> {
return logScoredObservationProcessedStream
.selectKey((k, v) -> k.getLogId())
.leftJoin(
logRelevantObservationIdsTable,
LogEntryObservationMatchIdsRelevantObservationsTuple::new,
Joined.with(
Serdes.String(),
logEntryRelevantObservationIdTupleSerde,
logRelevantObservationIdsTupleSerde
)
)
.transform(() -> new LogEntryObservationMatchTransformer(logScoredObservationsStore))
.groupByKey(
Grouped.with(
Serdes.String(),
logEntryObservationMatchTupleSerde
)
)
.reduce(
(match1, match2) -> Double.compare(match1.getScore(), match2.getScore()) != -1 ? match1 : match2,
Materialized.with(
Serdes.String(),
logEntryObservationMatchTupleSerde
)
)
.toStream()
;
};
}
@RequiredArgsConstructor
private static class LogEntryObservationMatchTransformer implements Transformer<String, LogEntryObservationMatchIdsRelevantObservationsTuple, KeyValue<String, LogEntryObservationMatchTuple>> {
private final String stateStoreName;
private ProcessorContext context;
private TimestampedKeyValueStore<LogEntryRelevantObservationIdTuple, LogEntryObservationMatchTuple> kvStore;
@Override
public void init(ProcessorContext context) {
this.context = context;
this.kvStore = (TimestampedKeyValueStore<LogEntryRelevantObservationIdTuple, LogEntryObservationMatchTuple>) context.getStateStore(stateStoreName);
}
@Override
public KeyValue<String, LogEntryObservationMatchTuple> transform(String logId, LogEntryObservationMatchIdsRelevantObservationsTuple value) {
val observationIds = value.getLogEntryRelevantObservationsTuple().getRelevantObservations().getObservationIds();
val allObservationsProcessed = observationIds.stream()
.allMatch((observationId) -> {
val key = LogEntryRelevantObservationIdTuple.newBuilder()
.setLogId(logId)
.setRelevantObservationId(observationId)
.build();
return kvStore.get(key) != null;
});
if (!allObservationsProcessed) {
return null;
}
val observationId = value.getLogEntryRelevantObservationIdTuple().getObservationId();
val key = LogEntryRelevantObservationIdTuple.newBuilder()
.setLogId(logId)
.setRelevantObservationId(observationId)
.build();
ValueAndTimestamp<LogEntryObservationMatchTuple> observationMatchValueAndTimestamp = kvStore.get(key);
return new KeyValue<>(logId, observationMatchValueAndTimestamp.value());
}
@Override
public void close() {
}
}
}