Using Snowflake LAG() across two year columns


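I have logic that fetches amounts from previous years. I am trying to apply the LAG() function over two columns in a table and derive two new columns: one for an offset of 1 year and one for an offset of 2 years.

The table has invoice year, invoice month, vendor ID, vendor site, audit year, audit month, and amount columns.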

create table invoice_audit_data (invoice_year number, invoice_month number, vendor number, site number, audit_year number, audit_month number, amount number);

Insert into INVOICE_AUDIT_DATA (INVOICE_YEAR, INVOICE_MONTH, VENDOR, SITE, AUDIT_YEAR, AUDIT_MONTH, AMOUNT) 
values 
    (2019,11,248,2,2022,1,9162)
    ,(2019,11,248,2,2022,2,9529)
    ,(2019,11,248,2,2022,3,548) 
    ,(2019,11,248,2,2022,4,7796)
    ,(2019,11,248,2,2022,5,4820)
    ,(2019,11,248,2,2022,6,6376)
    ,(2019,11,248,2,2022,7,947)
    ,(2019,11,248,2,2022,8,3930)
    ,(2020,11,248,2,2022,1,9280)
    ,(2020,11,248,2,2022,2,3969)
    ,(2020,11,248,2,2022,3,3156)
    ,(2020,11,248,2,2022,4,7900)
    ,(2020,11,248,2,2022,5,2710)
    ,(2020,11,248,2,2022,6,9959)
    ,(2020,11,248,2,2022,7,2870)
    ,(2020,11,248,2,2022,8,8611)
    ,(2020,11,248,2,2022,9,1614)
    ,(2020,11,248,2,2022,10,7357)
    ,(2020,11,248,2,2022,11,3251)
    ,(2020,11,248,2,2022,12,8215)
    ,(2020,11,248,2,2023,1,7967)
    ,(2020,11,248,2,2023,2,2514)
    ,(2020,11,248,2,2023,3,114)
    ,(2021,11,248,2,2022,1,3446)
    ,(2021,11,248,2,2022,2,6165)
    ,(2021,11,248,2,2022,3,102)
    ,(2021,11,248,2,2022,4,8748)
    ,(2021,11,248,2,2022,5,6918)
    ,(2021,11,248,2,2022,6,6340)
    ,(2021,11,248,2,2022,7,2819)
    ,(2021,11,248,2,2022,8,255)
    ,(2021,11,248,2,2022,9,8121)
    ,(2021,11,248,2,2022,10,9784)
    ,(2021,11,248,2,2022,11,2604)
    ,(2021,11,248,2,2022,12,881)
    ,(2021,11,248,2,2023,1,2482)
    ,(2021,11,248,2,2023,2,9474)
    ,(2021,11,248,2,2023,3,1662)
    ,(2021,11,248,2,2023,4,8422);

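I tried using one LAG() to step back on the invoice year and then another LAG() to step back on the audit year, but for invoice year 2020 the data has only 3 months of audit year 2023, so my result is 0.

  1. I want to write a CTE on this table that computes the amount for the previous invoice year/month and previous audit year/month and shows it in the main select. For example, if my invoice yyyy-mm is 2021-11 and my audit yyyy-mm is 2023-4, this column should take the amount from invoice yyyy-mm 2020-11 and audit yyyy-mm 2022-4. So the LAG() has to apply to both the invoice year and the audit year.
  2. I want to add another column that does the same thing going back two years.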
snowflake-cloud-data-platform lag
1 Answer

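The simplest way to solve this is with a self-JOIN that matches each row to the row for the same invoice year/month one audit year earlier: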

select a.*
    ,b.audit_year 
    ,b.audit_month
    ,b.amount
from INVOICE_AUDIT_DATA as a
left join INVOICE_AUDIT_DATA as b 
    on a.vendor = b.vendor and a.site = b.site
     and a.invoice_year = b.invoice_year and a.invoice_month = b.invoice_month
     and a.audit_year-1 = b.audit_year and a.audit_month = b.audit_month
order by 1,2,5,6;

To go back two audit years as well, add a second self-join with an offset of 2:

select a.*
    ,b.audit_year as ay_m1
    --,b.audit_month 
    ,b.amount as ay_m1_amount
    ,c.audit_year as ay_m2
    --,c.audit_month
    ,c.amount as ay_m2_amount
from INVOICE_AUDIT_DATA as a
left join INVOICE_AUDIT_DATA as b 
    on a.vendor = b.vendor and a.site = b.site
     and a.invoice_year = b.invoice_year and a.invoice_month = b.invoice_month
     and a.audit_year-1 = b.audit_year and a.audit_month = b.audit_month
left join INVOICE_AUDIT_DATA as c
    on a.vendor = c.vendor and a.site = c.site
     and a.invoice_year = c.invoice_year and a.invoice_month = c.invoice_month
     and a.audit_year-2 = c.audit_year and a.audit_month = c.audit_month
order by 1,2,5,6;

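The two-years-back columns add no new information here (they are all NULL), because the sample data does not cover that time range: there is no audit year earlier than 2022.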
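To see which audit years the sample data actually covers, a quick summary per invoice year can help. This is only a sketch against the `invoice_audit_data` table from the question; the output column names are illustrative.

```sql
-- Audit-year coverage per invoice year (column aliases are illustrative).
select invoice_year
    ,min(audit_year) as first_audit_year
    ,max(audit_year) as last_audit_year
    ,count(*) as rows_present
from INVOICE_AUDIT_DATA
group by invoice_year
order by invoice_year;
```

With the sample rows, `first_audit_year` is 2022 for every invoice year, so a join on `audit_year-2` can never find a match, and a join on `audit_year-1` only matches rows whose audit year is 2023.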

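Now, if your data really is as dense as it appears (that is, within each invoice year/month set there are no gaps in the audit year/month sequence), you can use a fixed-offset LAG() instead, with 12 rows back for one year and 24 rows back for two: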

select a.*
    ,lag(amount, 12) over (partition by invoice_year, invoice_month, vendor, site order by audit_year, audit_month) as ay_m1_amount
    ,lag(amount, 24) over (partition by invoice_year, invoice_month, vendor, site order by audit_year, audit_month) as ay_m2_amount
from INVOICE_AUDIT_DATA as a
order by 1,2,5,6;
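The fixed offsets of 12 and 24 are only correct when every partition has exactly one row per audit month with no gaps. One way to check that assumption before relying on LAG() is sketched below. It assumes there are no duplicate rows for the same audit month, and the alias names are illustrative.

```sql
-- Lists partitions whose row count differs from the number of months
-- between their first and last audit month. No rows returned means the
-- data is dense (assuming at most one row per audit month).
select invoice_year, invoice_month, vendor, site
    ,count(*) as months_present
    ,max(audit_year * 12 + audit_month) - min(audit_year * 12 + audit_month) + 1 as months_expected
from INVOICE_AUDIT_DATA
group by invoice_year, invoice_month, vendor, site
having count(*) <> max(audit_year * 12 + audit_month) - min(audit_year * 12 + audit_month) + 1;
```

If this returns any rows, the self-join version is the safer choice, because it matches on the actual year and month values rather than on row position.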

The LAG() query gives the same result as the JOIN version.
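The queries above keep the invoice year fixed and step back only the audit year. If the requirement is read literally from the question's example (invoice 2021-11 / audit 2023-4 should pick up invoice 2020-11 / audit 2022-4), both years need to step back together. A minimal sketch of that variant of the join follows; `prev1_amount` and `prev2_amount` are illustrative names, not from the original answer.

```sql
-- Sketch only: step BOTH invoice_year and audit_year back by the same offset.
select a.*
    ,b.amount as prev1_amount   -- one year back on both columns
    ,c.amount as prev2_amount   -- two years back on both columns
from INVOICE_AUDIT_DATA as a
left join INVOICE_AUDIT_DATA as b
    on a.vendor = b.vendor and a.site = b.site
     and a.invoice_year - 1 = b.invoice_year and a.invoice_month = b.invoice_month
     and a.audit_year - 1 = b.audit_year and a.audit_month = b.audit_month
left join INVOICE_AUDIT_DATA as c
    on a.vendor = c.vendor and a.site = c.site
     and a.invoice_year - 2 = c.invoice_year and a.invoice_month = c.invoice_month
     and a.audit_year - 2 = c.audit_year and a.audit_month = c.audit_month
order by 1,2,5,6;
```

With the sample data, the row for invoice 2021-11 / audit 2023-4 gets `prev1_amount` = 7900, taken from invoice 2020-11 / audit 2022-4. A fixed-offset LAG() does not map onto this case directly, because the matching row lives in a different invoice-year partition.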
