I am trying to window a stream of data, and for each window I need the list of values that fell into that window. To do that I created a custom Avro schema with a records field, which is a list of Input. The aggregate call has the Materialized part because I ran into this problem.
KStream<String, Input> windowedStream = timestampFilteredStream
    .groupByKey()
    .windowedBy(TimeWindows.of(Duration.ofSeconds(10)).grace(Duration.ofSeconds(5)))
    .aggregate(
        () -> new InputList(new ArrayList<>()),
        (key, value, aggregate) -> {
            if (value != null) {
                aggregate.getRecords().add(value);
            }
            return aggregate;
        },
        Materialized.with(Serdes.String(), new SpecificAvroSerde<>())
    )
    .suppress(Suppressed.untilWindowCloses(Suppressed.BufferConfig.unbounded()))
    .toStream()
    .map((window, inputs) -> {
        long windowEnd = window.window().endTime().toEpochMilli();
        String sensorId = window.key();
        Double weightInc = inputs.getRecords().stream().mapToDouble(Input::getWeightActual).reduce(0, (a, b) -> b - a);
        Double lengthInc = inputs.getRecords().stream().mapToDouble(Input::getLengthActual).reduce(0, (a, b) -> b - a);
        Double unitsInc = inputs.getRecords().stream().mapToDouble(Input::getUnitsActual).reduce(0, (a, b) -> b - a);
        Double avgSpeed = inputs.getRecords().stream().mapToDouble(Input::getSpeedActual).average().orElse(0);
        return KeyValue.pair(sensorId, new Input(windowEnd, weightInc, lengthInc, unitsInc, avgSpeed));
    });

windowedStream.foreach((sensorId, input) -> {
    System.out.println(sensorId + " with computed input " + input);
});
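As a side note on the map step above: the reduce(0, (a, b) -> b - a) fold does not compute a simple last-minus-first increment. On a sequential stream it is a left fold, so each element e replaces the accumulator with (e - accumulator), which may or may not be what is intended. A minimal plain-Java sketch (using hypothetical bare doubles in place of the generated Input records) shows what it actually yields:

```java
import java.util.List;

public class WindowFoldDemo {
    public static void main(String[] args) {
        // Hypothetical per-window samples, standing in for e.g. Input::getWeightActual values.
        List<Double> samples = List.of(2.0, 5.0, 9.0);

        // Average over the window, as computed for avgSpeed in the map step.
        double avg = samples.stream().mapToDouble(Double::doubleValue).average().orElse(0);

        // The reduce(0, (a, b) -> b - a) fold: acc starts at 0, then each
        // element e replaces it with (e - acc): 0 -> 2.0 -> 3.0 -> 6.0.
        double folded = samples.stream().mapToDouble(Double::doubleValue).reduce(0, (a, b) -> b - a);

        // A plain last-minus-first increment, for comparison.
        double lastMinusFirst = samples.get(samples.size() - 1) - samples.get(0);

        System.out.println(avg);
        System.out.println(folded);
        System.out.println(lastMinusFirst);
    }
}
```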
The stack trace is:
Exception in thread "sensors-pipeline-35077a4f-40f8-4356-9e56-53938f52c321-StreamThread-1" org.apache.kafka.streams.errors.StreamsException: Exception caught in process. taskId=0_0, processor=KSTREAM-SOURCE-0000000000, topic=sensors, partition=0, offset=0, stacktrace=org.apache.kafka.common.errors.SerializationException: Error serializing Avro message
Caused by: java.lang.NullPointerException
at io.confluent.kafka.serializers.AbstractKafkaAvroSerializer.serializeImpl(AbstractKafkaAvroSerializer.java:82)
at io.confluent.kafka.serializers.KafkaAvroSerializer.serialize(KafkaAvroSerializer.java:53)
at io.confluent.kafka.streams.serdes.avro.SpecificAvroSerializer.serialize(SpecificAvroSerializer.java:65)
at io.confluent.kafka.streams.serdes.avro.SpecificAvroSerializer.serialize(SpecificAvroSerializer.java:38)
at org.apache.kafka.streams.state.StateSerdes.rawValue(StateSerdes.java:191)
at org.apache.kafka.streams.state.internals.MeteredWindowStore.put(MeteredWindowStore.java:117)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl$WindowStoreReadWriteDecorator.put(ProcessorContextImpl.java:484)
at org.apache.kafka.streams.kstream.internals.KStreamWindowAggregate$KStreamWindowAggregateProcessor.process(KStreamWindowAggregate.java:127)
at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:117)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:183)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:162)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:122)
at org.apache.kafka.streams.kstream.internals.KStreamTransformValues$KStreamTransformValuesProcessor.process(KStreamTransformValues.java:56)
at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:117)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:183)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:162)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:122)
at org.apache.kafka.streams.processor.internals.SourceNode.process(SourceNode.java:87)
at org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:364)
at org.apache.kafka.streams.processor.internals.AssignedStreamsTasks.process(AssignedStreamsTasks.java:199)
at org.apache.kafka.streams.processor.internals.TaskManager.process(TaskManager.java:420)
at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:890)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:805)
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:774)
at org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:381)
at org.apache.kafka.streams.processor.internals.AssignedStreamsTasks.process(AssignedStreamsTasks.java:199)
at org.apache.kafka.streams.processor.internals.TaskManager.process(TaskManager.java:420)
at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:890)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:805)
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:774)
Caused by: org.apache.kafka.common.errors.SerializationException: Error serializing Avro message
Caused by: java.lang.NullPointerException
at io.confluent.kafka.serializers.AbstractKafkaAvroSerializer.serializeImpl(AbstractKafkaAvroSerializer.java:82)
at io.confluent.kafka.serializers.KafkaAvroSerializer.serialize(KafkaAvroSerializer.java:53)
at io.confluent.kafka.streams.serdes.avro.SpecificAvroSerializer.serialize(SpecificAvroSerializer.java:65)
at io.confluent.kafka.streams.serdes.avro.SpecificAvroSerializer.serialize(SpecificAvroSerializer.java:38)
at org.apache.kafka.streams.state.StateSerdes.rawValue(StateSerdes.java:191)
at org.apache.kafka.streams.state.internals.MeteredWindowStore.put(MeteredWindowStore.java:117)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl$WindowStoreReadWriteDecorator.put(ProcessorContextImpl.java:484)
at org.apache.kafka.streams.kstream.internals.KStreamWindowAggregate$KStreamWindowAggregateProcessor.process(KStreamWindowAggregate.java:127)
at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:117)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:183)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:162)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:122)
at org.apache.kafka.streams.kstream.internals.KStreamTransformValues$KStreamTransformValuesProcessor.process(KStreamTransformValues.java:56)
at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:117)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:183)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:162)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:122)
at org.apache.kafka.streams.processor.internals.SourceNode.process(SourceNode.java:87)
at org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:364)
at org.apache.kafka.streams.processor.internals.AssignedStreamsTasks.process(AssignedStreamsTasks.java:199)
at org.apache.kafka.streams.processor.internals.TaskManager.process(TaskManager.java:420)
at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:890)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:805)
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:774)
Debugging io.confluent.kafka.serializers.AbstractKafkaAvroSerializer.serializeImpl(AbstractKafkaAvroSerializer.java:82) shows that the error comes from the getId part of this code:
schema = AvroSchemaUtils.getSchema(object);
int id;
if (this.autoRegisterSchema) {
    restClientErrorMsg = "Error registering Avro schema: ";
    id = this.schemaRegistry.register(subject, schema);
} else {
    restClientErrorMsg = "Error retrieving Avro schema: ";
    id = this.schemaRegistry.getId(subject, schema);
}
When I try to fetch the type from the schema registry using the given subject, I can see the schema. The problem seems to be that this.schemaRegistry is null, even though in my properties I set

props.put(AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG, "http://localhost:8081");
I have fixed it by moving the serde definition into the groupByKey call, using

.groupByKey(Grouped.with(Serdes.String(), new SpecificAvroSerde<>()))
Update:

The other option did not work, because the value needs a configured specific Avro serde. It can be fixed like this:
final Serde<InputList> valueSpecificAvroSerde = new SpecificAvroSerde<>();
final Map<String, String> serdeConfig = Collections.singletonMap("schema.registry.url", "http://localhost:8081");
valueSpecificAvroSerde.configure(serdeConfig, false);
Materialized.with(Serdes.String(), valueSpecificAvroSerde)
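Putting it together, the aggregation with the explicitly configured serde might look like the sketch below. The key point is that the schema.registry.url set in the StreamsConfig properties only reaches the default serdes; a serde instantiated with new is used as-is by Kafka Streams, so its configure method has to be called by hand (this sketch assumes the Avro-generated InputList class and a registry at localhost:8081):

```java
import io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde;
import org.apache.kafka.common.serialization.Serde;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.TimeWindows;

import java.time.Duration;
import java.util.ArrayList;
import java.util.Collections;

// Configure the value serde manually: properties-level schema.registry.url
// is only applied to the *default* serdes, never to serdes created with new.
final Serde<InputList> valueSerde = new SpecificAvroSerde<>();
valueSerde.configure(
        Collections.singletonMap("schema.registry.url", "http://localhost:8081"),
        false); // isKey = false: this serde serializes values, not keys

// Same aggregation as above, now with a serde that knows the registry URL.
timestampFilteredStream
        .groupByKey()
        .windowedBy(TimeWindows.of(Duration.ofSeconds(10)).grace(Duration.ofSeconds(5)))
        .aggregate(
                () -> new InputList(new ArrayList<>()),
                (key, value, aggregate) -> {
                    if (value != null) {
                        aggregate.getRecords().add(value);
                    }
                    return aggregate;
                },
                Materialized.with(Serdes.String(), valueSerde));
```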