SQL query with DISTINCT and GROUP BY in Snowflake


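I want to get all records for a given MPN, but only the one with the latest DeliveryDate from shpm. Because I am aggregating, every non-aggregated column has to appear in the GROUP BY clause, so instead of only the latest record I get all of them: the distinct DeliveryDate values produce two records instead of one. How can I achieve this? This is in Snowflake.

Here is my SQL code: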

SELECT
    MD.MPN,
    MD.LOTCODE,
    MD.DATECODE,
    SHIP.ITEMCODE AS SYSTEMPARTNUMBER, 
    SHIP.SERIALNUMBER AS SYSTEMSERIALNUMBER, 
    SHIP.CUSTOMERNAME, 
    SHIP.SHIPTOADDRESS AS ADDRESS,
    SUM(IFNULL(SHIP.QUANTITY,0)) AS QUANTITY,
    SHIP.DELIVERYDATE
FROM cunits UNITS
   JOIN unc UC ON UC.CHILDUNITID = UNITS.ID
   JOIN shpm SHIP ON SHIP.SERIALNUMBER = UC.SYSSN
   JOIN tsern SN ON SN.UNITID = UNITS.ID
   JOIN machined MD ON MD.SERIALNUMBER = SN.SERIALNUMBER     
WHERE --SYSTEMSERIALNUMBER = '001801055469' and 
MPN = 'XC0402A105KP5CNN-S'
GROUP BY MD.MPN,MD.LOTCODE,MD.DATECODE,SHIP.ITEMCODE,SHIP.SERIALNUMBER,SHIP.CUSTOMERNAME,SHIP.SHIPTOADDRESS
sql group-by distinct snowflake-cloud-data-platform
Answers

2 votes

So, guessing at some data to match your SQL:

WITH cunits AS (
    SELECT * from values (1) v(id)
), unc AS (
    SELECT * FROM VALUES (1,'123') v(CHILDUNITID,SYSSN)
), shpm AS (
    SELECT * FROM VALUES ('a', '123', 10, '2020-02-01'),
       ('a', '123', 20, '2020-01-01') 
   v(ITEMCODE, SERIALNUMBER, QUANTITY, DELIVERYDATE)
), tsern AS (
    SELECT * FROM VALUES (1,'zxc') v(UNITID,SERIALNUMBER)
), machined as (
    SELECT * FROM VALUES ('zxc', 'XC0402A105KP5CNN-S') v(SERIALNUMBER, MPN)
)

and dropping a few columns from the example that don't matter here:

SELECT
    MD.MPN,
    SHIP.ITEMCODE AS SYSTEMPARTNUMBER, 
    SHIP.SERIALNUMBER AS SYSTEMSERIALNUMBER, 
    SUM(IFNULL(SHIP.QUANTITY,0)) AS QUANTITY,
    SHIP.DELIVERYDATE
FROM cunits UNITS
   JOIN unc UC ON UC.CHILDUNITID = UNITS.ID
   JOIN shpm SHIP ON SHIP.SERIALNUMBER = UC.SYSSN
   JOIN tsern SN ON SN.UNITID = UNITS.ID
   JOIN machined MD ON MD.SERIALNUMBER = SN.SERIALNUMBER     
WHERE 
MPN = 'XC0402A105KP5CNN-S'
GROUP BY MD.MPN,SHIP.ITEMCODE,SHIP.SERIALNUMBER;

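SHIP.DELIVERYDATE has to be added to the GROUP BY clause, otherwise this query will not run at all, quite apart from the 2020-01-01 row you don't want to see.

Once you add it, you get the two rows you don't want: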

MPN                 SYSTEMPARTNUMBER  SYSTEMSERIALNUMBER  QUANTITY  DELIVERYDATE
XC0402A105KP5CNN-S  a                 123                 10        2020-02-01
XC0402A105KP5CNN-S  a                 123                 20        2020-01-01

Gordon's solution of adding a QUALIFY:

QUALIFY ROW_NUMBER() OVER (PARTITION BY MD.MPN, SHIP.SERIALNUMBER ORDER BY SHIP.DELIVERYDATE DESC) = 1;

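correctly gives the answer, but it computes all the results first and then prunes the unwanted ones afterwards. Depending on the size of your data set and how many rows are in your shpm table, pre-filtering with a CTE may perform better: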

WITH cunits AS (
    SELECT * from values (1) v(id)
), unc AS (
    SELECT * FROM VALUES (1,'123') v(CHILDUNITID,SYSSN)
), shpm AS (
    SELECT * FROM VALUES ('a', '123', 10, '2020-02-01'),
       ('a', '123', 20, '2020-01-01') 
   v(ITEMCODE, SERIALNUMBER, QUANTITY, DELIVERYDATE)
), tsern AS (
    SELECT * FROM VALUES (1,'zxc') v(UNITID,SERIALNUMBER)
), machined as (
    SELECT * FROM VALUES ('zxc', 'XC0402A105KP5CNN-S') v(SERIALNUMBER, MPN)
), pre_filtered_shpm AS (
    select * from shpm
    QUALIFY ROW_NUMBER() OVER (PARTITION BY SERIALNUMBER ORDER BY DELIVERYDATE DESC) = 1
)
SELECT
    MD.MPN,
    SHIP.ITEMCODE AS SYSTEMPARTNUMBER, 
    SHIP.SERIALNUMBER AS SYSTEMSERIALNUMBER, 
    SUM(IFNULL(SHIP.QUANTITY,0)) AS QUANTITY,
    SHIP.DELIVERYDATE
FROM cunits UNITS
   JOIN unc UC ON UC.CHILDUNITID = UNITS.ID
   JOIN pre_filtered_shpm SHIP ON SHIP.SERIALNUMBER = UC.SYSSN
   JOIN tsern SN ON SN.UNITID = UNITS.ID
   JOIN machined MD ON MD.SERIALNUMBER = SN.SERIALNUMBER     
WHERE 
MPN = 'XC0402A105KP5CNN-S'
GROUP BY MD.MPN,SHIP.ITEMCODE,SHIP.SERIALNUMBER,SHIP.DELIVERYDATE;
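
For comparison, here is a sketch of the other variant discussed above, with the QUALIFY placed directly on the grouped query instead of in a pre-filtering CTE. It reuses the same guessed sample data, so the table contents are assumptions rather than the asker's real data. Against this data it should return only the 2020-02-01 row with QUANTITY 10.

```sql
-- Same guessed sample data as above; only the final SELECT differs.
WITH cunits AS (
    SELECT * FROM VALUES (1) v(ID)
), unc AS (
    SELECT * FROM VALUES (1, '123') v(CHILDUNITID, SYSSN)
), shpm AS (
    SELECT * FROM VALUES ('a', '123', 10, '2020-02-01'),
                         ('a', '123', 20, '2020-01-01')
    v(ITEMCODE, SERIALNUMBER, QUANTITY, DELIVERYDATE)
), tsern AS (
    SELECT * FROM VALUES (1, 'zxc') v(UNITID, SERIALNUMBER)
), machined AS (
    SELECT * FROM VALUES ('zxc', 'XC0402A105KP5CNN-S') v(SERIALNUMBER, MPN)
)
SELECT
    MD.MPN,
    SHIP.ITEMCODE AS SYSTEMPARTNUMBER,
    SHIP.SERIALNUMBER AS SYSTEMSERIALNUMBER,
    SUM(IFNULL(SHIP.QUANTITY, 0)) AS QUANTITY,
    SHIP.DELIVERYDATE
FROM cunits UNITS
    JOIN unc UC ON UC.CHILDUNITID = UNITS.ID
    JOIN shpm SHIP ON SHIP.SERIALNUMBER = UC.SYSSN
    JOIN tsern SN ON SN.UNITID = UNITS.ID
    JOIN machined MD ON MD.SERIALNUMBER = SN.SERIALNUMBER
WHERE MD.MPN = 'XC0402A105KP5CNN-S'
-- DELIVERYDATE is grouped, so each delivery date forms its own group
GROUP BY MD.MPN, SHIP.ITEMCODE, SHIP.SERIALNUMBER, SHIP.DELIVERYDATE
-- QUALIFY runs after GROUP BY and keeps only the newest group per MPN/serial
QUALIFY ROW_NUMBER() OVER (
    PARTITION BY MD.MPN, SHIP.SERIALNUMBER
    ORDER BY SHIP.DELIVERYDATE DESC
) = 1;
```

Note that both variants report the quantity of the latest delivery only (10 here), not the total across all deliveries (30), because DELIVERYDATE is part of the grouping.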

2 votes

Use ROW_NUMBER() with QUALIFY:

SELECT MD.MPN, MD.LOTCODE, MD.DATECODE,
       SHIP.ITEMCODE AS SYSTEMPARTNUMBER, SHIP.SERIALNUMBER AS SYSTEMSERIALNUMBER, 
       SHIP.CUSTOMERNAME, SHIP.SHIPTOADDRESS AS ADDRESS,
       SUM(COALESCE(SHIP.QUANTITY, 0)) AS QUANTITY,
       SHIP.DELIVERYDATE
FROM cunits UNITS JOIN
     unc UC
     ON UC.CHILDUNITID = UNITS.ID JOIN
     shpm SHIP
     ON SHIP.SERIALNUMBER = UC.SYSSN JOIN
     tsern SN
     ON SN.UNITID = UNITS.ID JOIN
     machined MD
     ON MD.SERIALNUMBER = SN.SERIALNUMBER     
WHERE --SYSTEMSERIALNUMBER = '001801055469' and
      MPN = 'XC0402A105KP5CNN-S'
-- DELIVERYDATE is selected and used in the window ORDER BY, so it must be grouped
GROUP BY MD.MPN, MD.LOTCODE, MD.DATECODE, SHIP.ITEMCODE, SHIP.SERIALNUMBER, SHIP.CUSTOMERNAME, SHIP.SHIPTOADDRESS, SHIP.DELIVERYDATE
QUALIFY ROW_NUMBER() OVER (PARTITION BY MD.MPN, SHIP.SERIALNUMBER ORDER BY SHIP.DELIVERYDATE DESC) = 1;

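This returns one row per MPN and serial number, which is how I interpret your question. You may need to add other columns to the PARTITION BY as well.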
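
To illustrate how the choice of PARTITION BY columns changes the result, here is a minimal sketch on a single hypothetical table. The `shipments` CTE and its rows are made up for illustration and are not part of the asker's schema.

```sql
-- Hypothetical data: one MPN and serial number, two lots,
-- each lot with an older and a newer delivery.
WITH shipments AS (
    SELECT * FROM VALUES
        ('XC0402A105KP5CNN-S', 'LOT1', '123', 10, '2020-02-01'),
        ('XC0402A105KP5CNN-S', 'LOT1', '123', 20, '2020-01-01'),
        ('XC0402A105KP5CNN-S', 'LOT2', '123',  5, '2020-03-01'),
        ('XC0402A105KP5CNN-S', 'LOT2', '123',  7, '2020-01-15')
    v(MPN, LOTCODE, SERIALNUMBER, QUANTITY, DELIVERYDATE)
)
SELECT MPN, LOTCODE, SERIALNUMBER, QUANTITY, DELIVERYDATE
FROM shipments
-- LOTCODE is part of the partition, so the newest row is kept per lot
QUALIFY ROW_NUMBER() OVER (
    PARTITION BY MPN, LOTCODE, SERIALNUMBER
    ORDER BY DELIVERYDATE DESC
) = 1;
```

With LOTCODE in the partition, this keeps the newest row of each lot (LOT1 on 2020-02-01 and LOT2 on 2020-03-01). Dropping LOTCODE from the PARTITION BY would keep only the single newest row overall (LOT2 on 2020-03-01), so which columns belong there depends on what "latest" should mean for your data.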
