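Compute 7-day sales from a per-customer start date (the customer's first purchase day), then each customer's average purchases per 7-day period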

Question    Votes: 0    Answers: 3

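I have the first table below, and I am trying to get the sales for every 7 days, counted from each customer's first purchase day. Table 2 shows an example.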

Table 1:

Purchase date  Customer ID  Sales units
2018-01-01     1            10
2018-01-02     1            5
2018-01-05     2            3
2018-01-15     1            10
2018-01-20     2            4
2018-01-21     2            5

Table 2:

Purchase date  Customer ID  Sales units  Cumulative sales per 7 days
2018-01-01     1            10           10
2018-01-02     1            5            15
2018-01-15     1            10           10
2018-01-05     2            3            3
2018-01-20     2            4            9
2018-01-21     2            5            9

The final table should look like this:

Purchase week  Customer ID  7-day sales units
2018-01-01     1            15
2018-01-05     2            3
2018-01-15     1            10
2018-01-20     2            4

Then I can compute the average sales for each customer:

Customer ID  Average sales units per 7 days  Calculation
1            12.5                            (15+10) / 2
2            3.5                             (3+4) / 2

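The difficult parts are:

  1. Every customer's first purchase day is different.

  2. Purchase dates are not consecutive, so I can't use unbounded, following 6 rows, and so on.

  3. The full dataset covers 5 years, so I can't subtract 7, 14, etc. by hand.

  4. I tried date_trunc('week',date, min(date) over (partition by customerid)).

  5. I also tried a window frame of rows between 6 preceding and current row, but the dates are not consecutive, so it does not work.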
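One likely reason the date_trunc('week', ...) attempt falls short is that truncating to a week snaps dates to calendar weeks rather than to 7-day windows counted from the customer's own first purchase. The following Python sketch illustrates the difference. The function names and the 2018-01-08 purchase are hypothetical and not part of the original data, and the calendar week is assumed to start on Monday.

```python
from datetime import date, timedelta

def calendar_week_start(d: date) -> date:
    """Monday of the calendar week containing d (assumed week-truncation behavior)."""
    return d - timedelta(days=d.weekday())

def anchored_bucket(d: date, first_purchase: date) -> int:
    """Index of the 7-day window containing d, counted from first_purchase."""
    return (d - first_purchase).days // 7

first = date(2018, 1, 5)   # customer 2's first purchase (a Friday)
later = date(2018, 1, 8)   # hypothetical purchase 3 days later (a Monday)

# Calendar weeks split the two dates; customer-anchored windows keep them together.
same_calendar_week = calendar_week_start(first) == calendar_week_start(later)
same_anchored_window = anchored_bucket(first, first) == anchored_bucket(later, first)
print(same_calendar_week, same_anchored_window)  # False True
```

Under these assumptions the two purchases land in different calendar weeks but in the same customer-anchored 7-day window, which is the grouping the question asks for.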

sql window-functions presto
3 Answers
0 votes

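You can do this with a CASE expression that checks the dates. I did this in SQL Server, but I believe the approach carries over to Presto. I think DATEADD may need to be date_add in Presto, which takes the unit as a quoted string, for example date_add('day', -6, t1.purchaseDate).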

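You also mentioned that you might need 14 days, so I added a column for that. As you can see, it is only a matter of changing the number of days in the DATEADD call.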


   -- Self-join each purchase (t1) to the same customer's purchases (t2),
   -- then sum the t2 rows inside the trailing window ending on t1.purchaseDate.
   SELECT t1.purchaseDate,
          t1.customerID,
          t1.salesUnit,
          -- trailing 7 days: the purchase date plus the 6 days before it
          SUM(CASE 
                 WHEN t2.purchaseDate BETWEEN DATEADD(DAY, -6, t1.purchaseDate) AND t1.purchaseDate THEN t2.salesUnit 
               END) AS SalesLast7,
          -- trailing 14 days: the purchase date plus the 13 days before it
          SUM(CASE 
                 WHEN t2.purchaseDate BETWEEN DATEADD(DAY, -13, t1.purchaseDate) AND t1.purchaseDate THEN t2.salesUnit 
               END) AS SalesLast14  
     FROM temp t1
LEFT JOIN temp t2 ON t1.customerID = t2.customerID AND t2.purchaseDate IS NOT NULL
 GROUP BY t1.purchaseDate, t1.customerID, t1.salesUnit

purchaseDate  customerID  salesUnit  SalesLast7  SalesLast14
2018-01-01    1           10         10          10
2018-01-02    1           5          15          15
2018-01-05    2           3          3           3
2018-01-15    1           10         10          15
2018-01-20    2           4          4           4
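To make the window logic of the query above easier to follow, here is a minimal Python sketch of the same trailing-window sum, run over the six rows of the question's first table. The function name trailing_sales and the variable names are illustrative only and do not come from the answer.

```python
from datetime import date, timedelta

# Rows from the question's first table: (purchase_date, customer_id, sales_unit)
rows = [
    (date(2018, 1, 1), 1, 10),
    (date(2018, 1, 2), 1, 5),
    (date(2018, 1, 5), 2, 3),
    (date(2018, 1, 15), 1, 10),
    (date(2018, 1, 20), 2, 4),
    (date(2018, 1, 21), 2, 5),
]

def trailing_sales(rows, days):
    """For each row, sum the same customer's units over the
    `days`-day window that ends on that row's purchase date."""
    result = []
    for d, cust, units in rows:
        start = d - timedelta(days=days - 1)  # e.g. 6 days back for a 7-day window
        total = sum(u for d2, c2, u in rows if c2 == cust and start <= d2 <= d)
        result.append((d, cust, units, total))
    return result

last7 = trailing_sales(rows, 7)
last14 = trailing_sales(rows, 14)
for (d, cust, units, s7), (_, _, _, s14) in zip(last7, last14):
    print(d, cust, units, s7, s14)
```

Note that this is a rolling window that moves with every purchase row, so it is not the same as fixed 7-day periods counted from the first purchase. In this sketch the 2018-01-21 row, which does not appear in the output above, gets a 7-day total of 9.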

0 votes

You can get the result you want with SQL window functions in two steps:

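Step 1. Apply a window partitioned by customer to get each customer's first_purchase_date. Then use Presto's date_diff() function to compute the number of days from the first purchase date to the current purchase date. Integer-divide that by 7 to get the week_bucket, counted from the first purchase date.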

Step 2. Group by (customer, customer_sale_week_bucket) and take sum(sales_unit) and min(purchase_date) within each group.

Here is the query:

with orders_with_customer_week_bucket AS 
(
  select
   purchase_date,
   customer_id,
   sales_unit,
   -- days since this customer's first purchase, integer-divided by 7;
   -- Presto's date_diff takes the unit as a quoted string
   date_diff('day', min(purchase_date) over (partition by customer_id), purchase_date) / 7 as customer_sale_week_bucket
from
   orders
 )
 select
     purchase_week,
     customer_id,
     seven_day_sales_unit
 from
     (select
         customer_id,
         customer_sale_week_bucket,
         -- first purchase date that falls inside the bucket
         min(purchase_date) as purchase_week,
         sum(sales_unit) as seven_day_sales_unit
     from
        orders_with_customer_week_bucket
     GROUP BY
        customer_id,
        customer_sale_week_bucket
     )r
 -- sort so the rows come back in a stable order
 order by
     purchase_week,
     customer_id

purchase_week  customer_id  seven_day_sales_unit
2018-01-01     1            15
2018-01-05     2            3
2018-01-15     1            10
2018-01-20     2            9
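The two steps above, plus the per-customer average that the question ultimately asks for, can be sketched in plain Python as follows. This is an illustration under the same bucketing assumption as the query, not a Presto implementation, and the names orders, first_purchase, buckets and averages are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Rows from the question's first table: (purchase_date, customer_id, sales_unit)
orders = [
    (date(2018, 1, 1), 1, 10),
    (date(2018, 1, 2), 1, 5),
    (date(2018, 1, 5), 2, 3),
    (date(2018, 1, 15), 1, 10),
    (date(2018, 1, 20), 2, 4),
    (date(2018, 1, 21), 2, 5),
]

# Step 1: first purchase date per customer (the window min in the query).
first_purchase = {}
for d, cust, _ in orders:
    first_purchase[cust] = min(d, first_purchase.get(cust, d))

# Step 2: group by (customer, bucket), keeping the earliest date and the unit total.
buckets = defaultdict(lambda: {"purchase_week": None, "units": 0})
for d, cust, units in orders:
    bucket = (d - first_purchase[cust]).days // 7  # 7-day window index
    entry = buckets[(cust, bucket)]
    if entry["purchase_week"] is None or d < entry["purchase_week"]:
        entry["purchase_week"] = d
    entry["units"] += units

# Step 3: average the bucket totals per customer (empty windows are skipped).
totals = defaultdict(list)
for (cust, _), entry in buckets.items():
    totals[cust].append(entry["units"])
averages = {cust: sum(v) / len(v) for cust, v in totals.items()}
print(averages)  # {1: 12.5, 2: 6.0}
```

With fixed buckets, customer 2's purchases on 2018-01-20 and 2018-01-21 fall into the same window (total 9, as in the result table above), so the average for customer 2 comes out as 6.0 rather than the 3.5 shown in the question. Windows without any purchase are left out of the average, which matches how the question computes (15+10) / 2 for customer 1.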

0 votes

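This is pseudocode, because I don't know the Presto functions. Until I come back to check it, you will need to translate the date arithmetic yourself: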

select distinct customerid,
    (
    case when lag(purchasedate) over (partition by customerid order by purchasedate) >= purchasedate - 6 then sum(salesunit) end +
    case when lag(purchasedate) over (partition by customerid order by purchasedate) >= purchasedate - 5 then sum(salesunit) end +
    case when lag(purchasedate) over (partition by customerid order by purchasedate) >= purchasedate - 4 then sum(salesunit) end +
    case when lag(purchasedate) over (partition by customerid order by purchasedate) >= purchasedate - 3 then sum(salesunit) end +
    case when lag(purchasedate) over (partition by customerid order by purchasedate) >= purchasedate - 2 then sum(salesunit) end +
    case when lag(purchasedate) over (partition by customerid order by purchasedate) >= purchasedate - 1 then sum(salesunit) end +
    sum(salesunit)
    ) / count(*) over (partition by customerid)
from T
group by customerid, purchasedate