SQL pivot table for cohorts

I have a table of client activity over the past 5 months. A sample looks like this:

month       id_client  month_number  activity
2023-10-01  1234       1             NULL
2023-11-01  1234       2             NULL
2023-12-01  1234       3             1
2024-01-01  1234       4             0
2024-02-01  1234       5             0

Here 1 = active, 0 = inactive, and NULL = inactive because the client has not registered yet (I want to keep these NULLs).

What I want to achieve is:

month id_client month_number_1 month_number_2 month_number_3 month_number_4 month_number_5
2023-10-01 1234 1 0
2023-11-01 1234 1 0 0
2023-12-01 1234 1 0 0
2024-01-01 1234 0 0
2024-02-01 1234 0

I think I should use some kind of pivot, but I don't know how.

sql google-bigquery
1 Answer

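PIVOT turns rows into columns. The task here is different, though: the values from future rows should appear as extra columns on the current row. That is done with window functions.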
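
For contrast, here is a minimal sketch of what a plain PIVOT on this data would look like. The inline sample CTE and the column aliases are assumptions made for illustration; they are not part of the question's real table.

```sql
-- Sketch only: hypothetical inline sample mirroring the question's data.
WITH sample AS (
  SELECT
    1234 AS id_client,
    month_number,
    -- months 1 and 2 stay NULL (client not registered yet)
    CASE month_number WHEN 3 THEN 1 WHEN 4 THEN 0 WHEN 5 THEN 0 END AS activity
  FROM UNNEST(GENERATE_ARRAY(1, 5)) AS month_number
)
SELECT *
FROM sample
PIVOT (
  MAX(activity)  -- one value per (id_client, month_number), so MAX just passes it through
  FOR month_number IN (
    1 AS month_number_1,
    2 AS month_number_2,
    3 AS month_number_3,
    4 AS month_number_4,
    5 AS month_number_5
  )
)
```

This should collapse the five monthly rows into a single row per id_client, which is why PIVOT alone does not give one output row per month.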

I changed NULL to -1 so that it is easier to see how the SQL query reacts to missing entries.

The computed yyyymm column is used to look up the following months; the month_number column could perhaps be used instead.

WITH sample AS (
  -- Sample data: one row per month for client 1234; activity is NULL for the first two months
  SELECT
    month,
    1234 AS id_client,
    offset + 1 AS month_number,
    CASE offset WHEN 2 THEN 1 WHEN 3 THEN 0 WHEN 4 THEN 0 ELSE NULL END AS activity
  FROM UNNEST(GENERATE_DATE_ARRAY(DATE "2023-10-01", DATE "2024-02-01", INTERVAL 1 MONTH)) AS month WITH OFFSET
),
tbl1 AS (
  SELECT
    * EXCEPT (activity),
    IFNULL(activity, -1) AS activity,  -- replace NULL entries with -1
    -- running month index (year * 12 + month), so that following months can be addressed by value
    EXTRACT(YEAR FROM month) * 12 + EXTRACT(MONTH FROM month) AS yyyymm
  FROM sample
)
SELECT
  *,
  ANY_VALUE(activity) OVER win1 AS activity_after_1month,
  ANY_VALUE(activity) OVER win2 AS activity_after_2month,
  ANY_VALUE(activity) OVER win3 AS activity_after_3month,
  ANY_VALUE(activity) OVER win4 AS activity_after_4month,
  ANY_VALUE(activity) OVER win5 AS activity_after_5month
FROM tbl1
WINDOW
  -- each frame contains only the row exactly N months after the current one
  win1 AS (PARTITION BY id_client ORDER BY yyyymm RANGE BETWEEN 1 FOLLOWING AND 1 FOLLOWING),
  win2 AS (PARTITION BY id_client ORDER BY yyyymm RANGE BETWEEN 2 FOLLOWING AND 2 FOLLOWING),
  win3 AS (PARTITION BY id_client ORDER BY yyyymm RANGE BETWEEN 3 FOLLOWING AND 3 FOLLOWING),
  win4 AS (PARTITION BY id_client ORDER BY yyyymm RANGE BETWEEN 4 FOLLOWING AND 4 FOLLOWING),
  win5 AS (PARTITION BY id_client ORDER BY yyyymm RANGE BETWEEN 5 FOLLOWING AND 5 FOLLOWING)
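
Since the question asks to keep the NULLs, and the answer notes that month_number might work in place of yyyymm, a variant along those lines could look like the sketch below. It assumes that month_number increases by exactly 1 per calendar month within each client. The off alias and the inline sample are illustrative only.

```sql
-- Sketch only: same idea as above, but ordering by month_number and keeping NULLs.
WITH sample AS (
  SELECT
    month,
    1234 AS id_client,
    off + 1 AS month_number,
    CASE off WHEN 2 THEN 1 WHEN 3 THEN 0 WHEN 4 THEN 0 END AS activity  -- NULL otherwise
  FROM UNNEST(GENERATE_DATE_ARRAY(DATE '2023-10-01', DATE '2024-02-01', INTERVAL 1 MONTH)) AS month
    WITH OFFSET AS off
)
SELECT
  month,
  id_client,
  month_number,
  activity,
  ANY_VALUE(activity) OVER win1 AS activity_after_1month,
  ANY_VALUE(activity) OVER win2 AS activity_after_2month,
  ANY_VALUE(activity) OVER win3 AS activity_after_3month,
  ANY_VALUE(activity) OVER win4 AS activity_after_4month
FROM sample
WINDOW
  -- assumption: month_number has no gaps or repeats per client
  win1 AS (PARTITION BY id_client ORDER BY month_number RANGE BETWEEN 1 FOLLOWING AND 1 FOLLOWING),
  win2 AS (PARTITION BY id_client ORDER BY month_number RANGE BETWEEN 2 FOLLOWING AND 2 FOLLOWING),
  win3 AS (PARTITION BY id_client ORDER BY month_number RANGE BETWEEN 3 FOLLOWING AND 3 FOLLOWING),
  win4 AS (PARTITION BY id_client ORDER BY month_number RANGE BETWEEN 4 FOLLOWING AND 4 FOLLOWING)
ORDER BY month
```

The trade-off is that, without the -1 placeholder, a NULL in the output no longer tells you whether the later month had a NULL activity or whether no row existed for that month at all.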