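I am looking for a window function in BigQuery that retrieves a value from an earlier record by comparing it against a value in the current record, stopping at the most recent match. Here is a sample table: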
with a as (
select 1 as hitNumber, 'xx' as sessionID, 'aaa' as pageName, null as convPageName
union all
select 2 as hitNumber, 'xx' as sessionID, 'ccc' as pageName, 'bbb' as convPageName
union all
select 3 as hitNumber, 'xx' as sessionID, 'ccc' as pageName, null as convPageName
union all
select 4 as hitNumber, 'xx' as sessionID, 'ddd' as pageName, 'qqq' as convPageName
union all
select 5 as hitNumber, 'xx' as sessionID, 'eee' as pageName, 'ccc' as convPageName
)
select *, ??? as prevConvPageName from a
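Based on this sample, looking at hitNumber = 5, I want the convPageName of the most recent record whose pageName matches the current row's convPageName and whose own convPageName is not null. In this case the result would be bbb, because the convPageName of hitNumber = 5 (ccc) matches the pageName of hitNumber = 2. hitNumber = 3 also has pageName ccc, but its convPageName is null, so it is skipped.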
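This can be solved by building a mapping table from each pageName to its most recent non-null convPageName. In the code below, this mapping is named page_name_2_conv_page_name.
Note that I added extra rows to table a to demonstrate that the most recent record in a, ordered by hitNumber, is the one that gets used.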
WITH
  a AS (
    SELECT
      1 AS hitNumber,
      'xx' AS sessionID,
      'aaa' AS pageName,
      NULL AS convPageName
    UNION ALL
    SELECT 2, 'xx', 'ccc', 'fff'
    UNION ALL
    SELECT 3, 'xx', 'ccc', 'bbb'
    UNION ALL
    SELECT 4, 'xx', 'ccc', NULL
    UNION ALL
    SELECT 5, 'xx', 'ddd', 'qqq'
    UNION ALL
    SELECT 6, 'xx', 'eee', 'ccc'
  ),
  -- one row per pageName: the latest hit whose convPageName is not null
  page_name_2_conv_page_name AS (
    SELECT pageName, convPageName
    FROM a
    WHERE convPageName IS NOT NULL
    QUALIFY ROW_NUMBER() OVER (PARTITION BY pageName ORDER BY hitNumber DESC) = 1
  )
-- look up each row's convPageName as a pageName in the mapping
SELECT
  a.*,
  page_name_2_conv_page_name.convPageName
FROM
  a LEFT JOIN page_name_2_conv_page_name
  ON a.convPageName = page_name_2_conv_page_name.pageName
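Result: only the hitNumber = 6 row gets a non-null value in the joined column, namely bbb, taken from hitNumber = 3 (the latest ccc row whose convPageName is not null). All other rows get NULL.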
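One thing to keep in mind: the mapping picks the latest row per pageName across the whole table, so it ignores sessionID and could also match a hit that comes after the current one. If only earlier hits in the same session should count, a possible variation is a self-join filtered with QUALIFY. This is a minimal sketch under that assumption, not part of the original answer; the aliases cur and prev and the output name prevConvPageName (borrowed from the question) are chosen for illustration.

```sql
WITH a AS (
  SELECT 1 AS hitNumber, 'xx' AS sessionID, 'aaa' AS pageName,
         CAST(NULL AS STRING) AS convPageName
  UNION ALL SELECT 2, 'xx', 'ccc', 'fff'
  UNION ALL SELECT 3, 'xx', 'ccc', 'bbb'
  UNION ALL SELECT 4, 'xx', 'ccc', NULL
  UNION ALL SELECT 5, 'xx', 'ddd', 'qqq'
  UNION ALL SELECT 6, 'xx', 'eee', 'ccc'
)
SELECT
  cur.*,
  prev.convPageName AS prevConvPageName
FROM a AS cur
LEFT JOIN a AS prev
  ON  prev.sessionID = cur.sessionID          -- same session only
  AND prev.pageName = cur.convPageName        -- earlier page matches current convPageName
  AND prev.hitNumber < cur.hitNumber          -- strictly earlier hits
  AND prev.convPageName IS NOT NULL           -- skip candidates with nothing to return
-- WHERE TRUE changes nothing; it is included in case the BigQuery release in use
-- still requires WHERE, GROUP BY, or HAVING alongside QUALIFY
WHERE TRUE
-- keep one row per current hit: the matching candidate with the highest hitNumber
QUALIFY ROW_NUMBER() OVER (
  PARTITION BY cur.sessionID, cur.hitNumber
  ORDER BY prev.hitNumber DESC
) = 1
ORDER BY cur.hitNumber
```

With the sample rows above, hitNumber = 6 gets prevConvPageName = bbb (from hitNumber = 3) and every other row gets NULL. The self-join compares each hit with all earlier hits in its session, so on very long sessions it may be worth checking the cost against the simpler mapping approach.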