How to use a recursive query in Hive

Question (votes: 1, answers: 1)

I have a table in which some values of the sequence column are blank.

ID        page          timestamp  sequence
Orestes   Login         152356     1
Orestes   Account view  152368
Orestes   Transfer      152380
Orestes   Account view  162382     2
Orestes   Loan          162393
Antigone  Login         152382     1
Antigone  Transfer      152390

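I want to change it as shown below, so that each blank takes the previous non-blank sequence value for the same ID.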

ID        page          timestamp  sequence
Orestes   Login         152356     1
Orestes   Account view  152368     1
Orestes   Transfer      152380     1
Orestes   Account view  162382     2
Orestes   Loan          162393     2
Antigone  Login         152382     1
Antigone  Transfer      152390     1

I tried...

with r1
as
(select id, page, timestamp, lag(sequence) over (partition by id order by timestamp) as sequence from log),
r2
as
(select id, page, timestamp, sequence from log)
insert into test1
select a.id, a.page, a.timestamp, case when a.sequence is not null then a.sequence
                                       when b.sequence is not null then b.sequence 
                                       else a.sequence
                                   end
from r1 a join r2 b on a.id=b.id and a.timestamp=b.timestamp
;
create table test2 like test1
;
with r1
as
(select id, page, timestamp, lag(sequence) over (partition by id order by timestamp) as sequence from test1),
r2
as
(select id, page, timestamp, sequence from test1)
insert into test2
select a.id, a.page, a.timestamp, case when a.sequence is not null then a.sequence
                                       when b.sequence is not null then b.sequence 
                                       else a.sequence
                                   end
from r1 a join r2 b on a.id=b.id and a.timestamp=b.timestamp
;
create table test3 like test2
;
...and I repeat this to fill the next blank, over and over, until my fingers are numb.

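As shown above, how can I fill each blank with the preceding number? I think I should use a recursive query, but I can't find a way to do it.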

hive recursive-query
1 answer
0 votes

You don't need a recursive query at all.

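Two Hive functions can help you here: COALESCE, which returns its first non-NULL argument, and the window function LAST_VALUE, which skips NULLs when its second argument is TRUE.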

So your query should look something like this:

create table tmp_table like original_table;

insert into tmp_table
SELECT
    id, 
    page, 
    ts,
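    -- PARTITION BY id keeps one id's values from filling another id's blanks
    COALESCE(sequence,
             LAST_VALUE(sequence, TRUE) OVER(PARTITION BY id ORDER BY ts ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)) AS sequence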
FROM original_table;
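
To check what such a window expression is expected to produce, here is a minimal plain-Python sketch that models a NULL-skipping LAST_VALUE over a frame running from the start of each id's rows (ordered by ts) to the current row. It is not Hive code. The function name `fill_forward` and the tuple layout are assumptions made for illustration, and the rows are the sample data from the question, with `None` standing in for a blank.

```python
from itertools import groupby
from operator import itemgetter

# Sample rows from the question: (id, page, ts, sequence); None marks a blank.
rows = [
    ("Orestes", "Login", 152356, 1),
    ("Orestes", "Account view", 152368, None),
    ("Orestes", "Transfer", 152380, None),
    ("Orestes", "Account view", 162382, 2),
    ("Orestes", "Loan", 162393, None),
    ("Antigone", "Login", 152382, 1),
    ("Antigone", "Transfer", 152390, None),
]


def fill_forward(rows):
    """Hypothetical helper: carry the last non-None sequence forward per id.

    Models a NULL-skipping LAST_VALUE over rows partitioned by id, ordered
    by ts, with a frame from the first row of the partition to the current row.
    """
    filled = []
    # Sort by (id, ts) so groupby sees each id's rows together, in ts order.
    ordered = sorted(rows, key=itemgetter(0, 2))
    for _, partition in groupby(ordered, key=itemgetter(0)):
        last_seen = None  # most recent non-None sequence in this partition
        for id_, page, ts, sequence in partition:
            if sequence is not None:
                last_seen = sequence
            filled.append((id_, page, ts, last_seen))
    return filled


for row in fill_forward(rows):
    print(row)
```

In this model, a blank that comes before any non-blank value for its id stays `None`, because there is nothing earlier in the partition to carry forward.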