Spark streaming query with multiple joins produces no output

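I have a join query that uses another join query as a subquery, but it produces no output. I ran the subquery on its own to isolate the problem, and it works as expected.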

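What I want to achieve is to find, for each class, the last member who appeared before the class was officially created. There are two streams, one for members and one for classes.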

I would appreciate any help getting this query to work, or any suggestion for an alternative approach.

Member stream:

Person | ClassNo | Timestamp
---
James | 1     | 12345
Sally | 1     | 27251
Peter | 1     | 40232
Jake  | 2     | 16780
Paul  | 2     | 43628
Mark  | 2     | 78221

Class stream:

ClassNo | Creator | Timestamp
---
1 | Lizzy | 37220
2 | David | 22980

Expected output:

Person | ClassNo | Timestamp | Creator | Timestamp
---
Sally | 1     | 27251  | Lizzy | 37220
Jake  | 2     | 16780  | David | 22980

The query I ran:

select *
from Member
join (select Class.ClassNo,
             Class.Creator,
             Class.Timestamp,
             max(Member.Timestamp) as MaxTimestamp
      from Class
      join Member
        on Member.ClassNo = Class.ClassNo
       and Member.Timestamp <= Class.Timestamp
      group by Class.ClassNo, Class.Creator, Class.Timestamp) as ClassTemp
  on Member.ClassNo = ClassTemp.ClassNo
 and Member.Timestamp = ClassTemp.MaxTimestamp;
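
To pin down the intended semantics independently of Spark, the two steps of the query can be sketched in plain Python over the sample rows from the tables above. This is only an illustration of the batch logic on static data, not streaming code, and it says nothing about why the streaming query stays silent. The function names `class_temp` and `last_member_before_creation` are made up for this sketch, and it returns only the five columns of the expected output rather than everything `select *` would produce.

```python
# Sample rows copied from the question: (Person, ClassNo, Timestamp).
members = [
    ("James", 1, 12345),
    ("Sally", 1, 27251),
    ("Peter", 1, 40232),
    ("Jake", 2, 16780),
    ("Paul", 2, 43628),
    ("Mark", 2, 78221),
]

# Sample rows copied from the question: (ClassNo, Creator, Timestamp).
classes = [
    (1, "Lizzy", 37220),
    (2, "David", 22980),
]


def class_temp(members, classes):
    """Step 1 (the subquery): per class, the largest member timestamp
    that is not later than the class creation timestamp."""
    rows = []
    for class_no, creator, created_at in classes:
        earlier = [ts for _, m_class, ts in members
                   if m_class == class_no and ts <= created_at]
        if earlier:  # inner join: a class with no earlier member is dropped
            rows.append((class_no, creator, created_at, max(earlier)))
    return rows


def last_member_before_creation(members, classes):
    """Step 2 (the outer join): match members on ClassNo and MaxTimestamp."""
    result = []
    for class_no, creator, created_at, max_ts in class_temp(members, classes):
        for person, m_class, ts in members:
            if m_class == class_no and ts == max_ts:
                result.append((person, m_class, ts, creator, created_at))
    return result


for row in last_member_before_creation(members, classes):
    print(row)
```

On the sample data this yields the Sally and Jake rows listed under the expected output, which suggests the join and aggregation logic itself is sound when evaluated over complete, static inputs.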
apache-spark apache-spark-sql spark-streaming spark-structured-streaming