How can I count, in SQL, the number of distinct SKUs an account has purchased over time?


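I have a table of purchase information listed by account, purchase date, and the SKU purchased that day. It is a sales order line item table, so each record represents the purchase of a single item (that is, quantity = 1 for every record).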

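I want to produce an output table that tells me, for a given account and as of a given date, the total number of unique SKUs the account has purchased up to and including that date, counting from the account's first purchase date.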

For example, using the data below:

create schema adhoc_data.temp;
create table adhoc_data.temp.purchases ( accountid varchar, purchase_date date, productid varchar);
insert into purchases (accountid, purchase_date , productid) values ('a1', '2022-01-01', '534ad451f');
insert into purchases (accountid, purchase_date , productid) values ('a1', '2022-01-02', '534ad451f');
insert into purchases (accountid, purchase_date , productid) values ('a1', '2022-01-03', '534ad451f');
insert into purchases (accountid, purchase_date , productid) values ('a1', '2022-01-04', '534ad451f');
insert into purchases (accountid, purchase_date , productid) values ('a1', '2022-01-05', '0f9d321ad');
insert into purchases (accountid, purchase_date , productid) values ('a1', '2022-01-06', '0f9d321ad');
insert into purchases (accountid, purchase_date , productid) values ('a1', '2022-01-07', '534ad451f');
insert into purchases (accountid, purchase_date , productid) values ('a1', '2022-01-08', '4a5d93a1f');
insert into purchases (accountid, purchase_date , productid) values ('a1', '2022-01-09', '4a5d93a1f');
insert into purchases (accountid, purchase_date , productid) values ('a1', '2022-01-10', '4a5d93a1f');
insert into purchases (accountid, purchase_date , productid) values ('a1', '2022-01-10', '534ad451f');
insert into purchases (accountid, purchase_date , productid) values ('a1', '2022-01-10', '0f9d321ad');
insert into purchases (accountid, purchase_date , productid) values ('a1', '2022-01-11', '9cd018fc0');

I am trying to compute:

'a1', '2022-01-01', 1
'a1', '2022-01-02', 1
'a1', '2022-01-03', 1
'a1', '2022-01-04', 1
'a1', '2022-01-05', 2
'a1', '2022-01-06', 2
'a1', '2022-01-07', 2
'a1', '2022-01-08', 3
'a1', '2022-01-09', 3
'a1', '2022-01-10', 3
'a1', '2022-01-11', 4

I tried a self-join on the purchases table:

select 
t1.accountid, 
t1.purchasedate,
count(distinct t1.productid)
from
purchases as t1
left join purchases as t2 on t1.accountid = t2.accountid and t1.purchase_date >= t2.purchase_date
group by 
t1.accountid, 
t1.purchasedate,
order by 
t1.accountid,
t2.purchasedate

But that's not working. 

Any help would be appreciated.


sql snowflake-cloud-data-platform
1 Answer

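Given the context and the sample dataset you provided, the following query should produce the desired output. For each account and purchase date, a correlated subquery counts the distinct SKUs that account bought on or before that date.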

SELECT 
  accountid, 
  purchase_date,
  (
    SELECT COUNT(DISTINCT p2.productid) 
    FROM purchases p2 
    WHERE p2.accountid = p1.accountid 
      AND p2.purchase_date <= p1.purchase_date
  ) AS unique_skus_count
FROM purchases p1
GROUP BY p1.accountid, p1.purchase_date
ORDER BY p1.accountid, p1.purchase_date;
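
For comparison, the self-join from the question can be repaired into an equivalent query. This is a minimal sketch, assuming the column names from the `create table` statement (`purchase_date`, not `purchasedate`). The distinct count has to be taken over the joined side `t2`, which holds every purchase made on or before the date in `t1`, rather than over `t1` itself, and the trailing comma after the last `group by` column has to be removed.

```sql
-- Self-join sketch: for every (account, date) row in t1, t2 supplies all of
-- that account's purchases on or before that date. Counting the distinct
-- t2.productid values gives the running number of unique SKUs.
SELECT
  t1.accountid,
  t1.purchase_date,
  COUNT(DISTINCT t2.productid) AS unique_skus_count
FROM purchases AS t1
INNER JOIN purchases AS t2
  ON  t1.accountid = t2.accountid
  AND t2.purchase_date <= t1.purchase_date
GROUP BY
  t1.accountid,
  t1.purchase_date
ORDER BY
  t1.accountid,
  t1.purchase_date;
```

On the sample data this should return one row per date from 2022-01-01 through 2022-01-11 with counts 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, matching the expected output in the question.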

Hope this helps!
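
If comparing every row against all earlier rows proves slow on a large table, or if the engine does not accept the correlated subquery, another way to get the same running count is to count each SKU only once, on the date the account first bought it, and then take a running total. This is a sketch of that alternative technique, not the query given above; the CTE names `first_purchase` and `new_skus_per_day` are hypothetical names chosen for illustration.

```sql
WITH first_purchase AS (
  -- Earliest purchase date for each (account, SKU) pair.
  SELECT accountid, productid, MIN(purchase_date) AS first_date
  FROM purchases
  GROUP BY accountid, productid
),
new_skus_per_day AS (
  -- One row per (account, purchase date) with the number of SKUs the
  -- account bought for the first time on that date (0 if none).
  SELECT
    p.accountid,
    p.purchase_date,
    COUNT(DISTINCT f.productid) AS new_skus
  FROM purchases AS p
  LEFT JOIN first_purchase AS f
    ON  f.accountid  = p.accountid
    AND f.first_date = p.purchase_date
  GROUP BY p.accountid, p.purchase_date
)
SELECT
  accountid,
  purchase_date,
  -- Running total of first-time SKUs = distinct SKUs bought so far.
  SUM(new_skus) OVER (
    PARTITION BY accountid
    ORDER BY purchase_date
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  ) AS unique_skus_count
FROM new_skus_per_day
ORDER BY accountid, purchase_date;
```

On the sample data the first-purchase dates are 2022-01-01, 2022-01-05, 2022-01-08 and 2022-01-11, so the running total steps from 1 to 4 on those dates and stays flat in between.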
