I have a table similar to this:
ymd | cus_id | order | order_value | refunds |
---|---|---|---|---|
2023-01-01 | 12020 | 3 | 134 | 1 |
2023-06-04 | 27383 | 1 | 80 | 0 |
2023-07-13 | 23823 | 2 | 111 | 2 |
2023-04-22 | 12020 | 7 | 323 | 3 |
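but much larger. It records, for each cus_id and each day, the order, order_value and refunds.
I need to summarise this table so that there is one row per cus_id per date range (complete dataset, last 4 weeks, last 8 weeks and last 12 weeks), holding the sums of order, order_value and refunds. The end result would look like the table below, so each cus_id gets 4 date ranges.
date_range | cus_id | sum_order | sum_order_value | sum_refunds |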
---|---|---|---|---|
all | 12020 | 23 | 1340 | 9 |
4 weeks | 12020 | 3 | 152 | 1 |
8 weeks | 12020 | 8 | 423 | 2 |
12 weeks | 12020 | 20 | 1023 | 7 |
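N.B. the values in the tables are made up, so the two datasets may not match.
What would be the best way to do this? I was thinking of calculating each date range separately, adding a new `date_range` column to each result, and then taking the union of all 4 date ranges to get the final result, but I'm not sure that is the most efficient approach.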
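For reference, the union idea described above might look roughly like the sketch below. The table name `customer_orders` is an assumption (the real table name isn't given), and the `interval '28' day` syntax may need adjusting for your database.

```sql
-- customer_orders is a hypothetical table name; substitute your own.
-- One aggregate query per date range, stacked with union all.
select 'all' as date_range,
       cus_id,
       sum("order") as sum_order,
       sum(order_value) as sum_order_value,
       sum(refunds) as sum_refunds
from customer_orders
group by cus_id
union all
select '4 weeks', cus_id, sum("order"), sum(order_value), sum(refunds)
from customer_orders
where ymd >= current_date - interval '28' day
group by cus_id
union all
select '8 weeks', cus_id, sum("order"), sum(order_value), sum(refunds)
from customer_orders
where ymd >= current_date - interval '56' day
group by cus_id
union all
select '12 weeks', cus_id, sum("order"), sum(order_value), sum(refunds)
from customer_orders
where ymd >= current_date - interval '84' day
group by cus_id
```

This reads the table once per range (4 times here), which is the cost the efficiency question is about.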
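I would create a `date_range` CTE (common table expression) that defines each range, and then join your table to that CTE.
The CTE is just a union of 4 separate select statements, giving one row per date range with a start date and an end date. I added a sort_order column to help with ordering the final result.
Here is a working example. It doesn't match your output above, because that dataset wasn't provided. I used the sample data you provided and added some rows of my own: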
    with date_range as (
        select 'all' as date_range,
               cast('1900-01-01' as date) as start_date,
               current_date as end_date,
               1 as sort_order
        union
        select '4 weeks' as date_range,
               cast((current_date - interval '28' day) as date) as start_date,
               current_date as end_date,
               2 as sort_order
        union
        select '8 weeks' as date_range,
               cast((current_date - interval '56' day) as date) as start_date,
               current_date as end_date,
               3 as sort_order
        union
        select '12 weeks' as date_range,
               cast((current_date - interval '84' day) as date) as start_date,
               current_date as end_date,
               4 as sort_order
    ),
    sample_data as (
        select *
        from (
            values (cast('2023-01-01' as date), 12020, 3, 134, 1),
                   (cast('2023-06-04' as date), 27383, 1, 80, 0),
                   (cast('2023-07-13' as date), 23823, 2, 111, 2),
                   (cast('2023-04-22' as date), 12020, 7, 323, 3),
                   (cast('2023-07-22' as date), 12020, 7, 400, 4),
                   (cast('2023-08-20' as date), 27383, 9, 100, 0)
        ) as test_date(ymd, cus_id, "order", order_value, refunds)
    )
    select date_range,
           cus_id,
           sum("order") as sum_order,
           sum(order_value) as sum_order_value,
           sum(refunds) as sum_refunds,
           sort_order
    from date_range dt
    -- here is where you would add your own table, removing the sample_data cte above
    join sample_data td on td.ymd between dt.start_date and dt.end_date
    group by date_range,
             cus_id,
             sort_order
    order by cus_id,
             sort_order
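
Against a real table, the same query without the sample data could look something like the sketch below. `customer_orders` is a hypothetical table name, assumed to have the columns ymd, cus_id, "order", order_value and refunds from the question. The sketch uses `union all` because the four range rows are already distinct.

```sql
-- customer_orders is a hypothetical name; substitute the real table.
with date_range as (
    select 'all' as date_range,
           cast('1900-01-01' as date) as start_date,
           current_date as end_date,
           1 as sort_order
    union all
    select '4 weeks', cast((current_date - interval '28' day) as date), current_date, 2
    union all
    select '8 weeks', cast((current_date - interval '56' day) as date), current_date, 3
    union all
    select '12 weeks', cast((current_date - interval '84' day) as date), current_date, 4
)
select dt.date_range,
       td.cus_id,
       sum(td."order") as sum_order,
       sum(td.order_value) as sum_order_value,
       sum(td.refunds) as sum_refunds,
       dt.sort_order
from date_range dt
-- each source row is matched to every range whose dates contain it
join customer_orders td on td.ymd between dt.start_date and dt.end_date
group by dt.date_range,
         td.cus_id,
         dt.sort_order
order by td.cus_id,
         dt.sort_order
```

Because this is an inner join, a cus_id with no rows inside a given range gets no row for that range in the output, rather than a row of zeros.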