Joining tables using a CASE statement


I am trying to match inventory to products, but their legacy data has no concept of a product ID, so the join is fairly complicated.

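I have it working with a bunch of ORs, but it takes over an hour to run (even with good indexes). I would prefer to use CASE to reduce the processing and therefore the run time, but I get an error at the = on this line:

r.[Company] = 'CAN' AND

I tried cutting it down to just the first case and then adding

ELSE 1 END

but I still get the same error, so I am not sure what is wrong.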

SELECT * FROM
    [raw_inventory] r
    LEFT JOIN [master_products] mp ON
       CASE
        /* Logic:
            WHEN: Company = CAN AND Shape = RD or Blank/NULL
            THEN: Match mNum, Shape and Width
                  Only in 1, 2, 3 Files
        */
        WHEN 
            r.[Company] = 'CAN' AND
           (r.[Shape] = 'RD' OR r.[Shape] = '' OR r.[Shape] IS NULL)
        THEN  
            r.[SHAPE] = mp.[Shape] AND
            r.[MASTERNUM] IN ( SELECT value FROM STRING_SPLIT(mp.[mNum],';') ) AND
            CAST(oa.[WIDTH] AS Decimal(20,4)) = CAST(mp.[Width] AS Decimal(20,4)) AND
           (mp.[File] = 1 OR mp.[File] = 2 OR [File] = 3)
           
        /* Logic:
            WHEN: Company = CAN AND Shape = SQ, NO
            THEN: Match mNum, Shape and Width
                  Only in 1, 2, 3 Files
        */
        WHEN
            r.[Company] = 'CAN' AND
            r.[Shape] IN ('SQ', 'NO')
        THEN  
            r.[SHAPE] = mp.[Shape] AND
            r.[MASTERNUM] IN ( SELECT value FROM STRING_SPLIT(mp.[mNum],';') ) AND
            CAST(oa.[LENGTH] AS Decimal(20,4)) = CAST(mp.[Length] AS Decimal(20,4)) AND
           (mp.[File] = 1 OR mp.[File] = 2 OR [File] = 3)

        /* Logic:
            Company = US
            Master Number = 003(alpha)
            Match Run Number
            Looks in All Files
        */
        WHEN
             r.[Company] = 'US' AND
            PATINDEX('%003[A-TV-Za-tv-z]%', r.[MASTERNUM]) = 1
        THEN
            r.[Run] IN ( SELECT value FROM STRING_SPLIT(mp.[Run Number],';') ) OR
            r.[Run] IN ( SELECT value FROM STRING_SPLIT(mp.[Indexer],';') )
            
        /* Logic:
            Company = US
            Master Number = 003U
            Match RDI Number
            Only in 4, 5 Files
        */
        WHEN
            r.[Company] = 'US' AND
            r.[MASTERNUM] LIKE '003U%'
        THEN
            r.[RDI] = mp.[RDI] AND
           ([File] = '4' OR [File] = '4')

        /* Logic:
            Company = US
            Match Master Number to Run/Indexer Number
            Looks in All Files
        */
        ELSE
         r.[MASTERNUM] IN ( SELECT value FROM STRING_SPLIT(mp.[Run Number],';') ) OR
         r.[MASTERNUM] IN ( SELECT value FROM STRING_SPLIT(mp.[Indexer],';') ) 
    END
sql sql-server case
1 Answer

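There are already plenty of good comments on this, but you cannot use a case expression to return a join condition, which is exactly what you are trying to do. In T-SQL, CASE is an expression that returns a single value, so each THEN must be followed by a value, not by a comparison. Instead, implement Boolean logic to separate the various sets of conditions you need. (Note: the Boolean equivalent of "WHEN/ELSE" here is "OR".)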
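To make the point about CASE concrete, here is a minimal sketch of the only way a case expression can appear in an ON clause: every branch returns a value, and that value is then compared to something. The two branches are heavily simplified from the question and are assumptions for illustration, not a rewrite of the full logic.

```sql
-- Sketch only: CASE yields a scalar (1 or 0), and the join predicate
-- is the comparison "= 1" that follows END.
SELECT *
FROM [raw_inventory] r
LEFT JOIN [master_products] mp ON
    CASE
        WHEN r.[Company] = 'CAN' AND r.[SHAPE] = mp.[Shape] THEN 1
        WHEN r.[Company] = 'US'  AND r.[RDI]   = mp.[RDI]   THEN 1
        ELSE 0
    END = 1;
```

This form parses, but wrapping the conditions in CASE is unlikely to help the optimizer use indexes, so it should not be expected to run faster than plain Boolean logic.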

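If you implement the Boolean logic, the query will look something like the following (note: I do not guarantee a 100% accurate "translation", but hopefully it is close).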

SELECT *
FROM [raw_inventory] r
LEFT JOIN [master_products] mp ON (
        (
            r.[Company] = 'CAN'
            AND (
                r.[Shape] = 'RD'
                OR r.[Shape] = ''
                OR r.[Shape] IS NULL
                )
            AND r.[SHAPE] = mp.[Shape]
            AND r.[MASTERNUM] IN (
                SELECT value
                FROM STRING_SPLIT(mp.[mNum], ';')
                )
            AND CAST(oa.[WIDTH] AS DECIMAL(20, 4)) = CAST(mp.[Width] AS DECIMAL(20, 4))
            AND (
                mp.[File] = 1
                OR mp.[File] = 2
                OR [File] = 3
                )
            )

        /* separate query here? */
        OR (
            r.[Company] = 'CAN'
            AND r.[Shape] IN ('SQ', 'NO')
            AND r.[SHAPE] = mp.[Shape]
            AND r.[MASTERNUM] IN (
                SELECT value
                FROM STRING_SPLIT(mp.[mNum], ';')
                )
            AND CAST(oa.[LENGTH] AS DECIMAL(20, 4)) = CAST(mp.[Length] AS DECIMAL(20, 4))
            AND (
                mp.[File] = 1
                OR mp.[File] = 2
                OR [File] = 3
                )
            )

        /* separate query here? */
        OR (
            r.[Company] = 'US'
            AND PATINDEX('%003[A-TV-Za-tv-z]%', r.[MASTERNUM]) = 1
            AND (
                R.[Run] IN (
                    SELECT value
                    FROM STRING_SPLIT(mp.[Run Number], ';')
                    )
                OR r.[Run] IN (
                    SELECT value
                    FROM STRING_SPLIT(mp.[Indexer], ';')
                    )
                )
            )

        /* separate query here? */
        OR (
            r.[Company] = 'US'
            AND r.[MASTERNUM] LIKE '003U%'
            AND r.[RDI] = mp.[RDI]
            AND (
                [File] = '4'
                OR [File] = '5'
                )
            )

        /* separate query here? */
        OR (
            r.[Company] = 'US'
            AND (
                r.[MASTERNUM] IN (
                    SELECT value
                    FROM STRING_SPLIT(mp.[Run Number], ';')
                    )
                OR r.[MASTERNUM] IN (
                    SELECT value
                    FROM STRING_SPLIT(mp.[Indexer], ';')
                    )
                )
            )
        );

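You will quickly see that this is a very unwieldy join, and it will perform poorly. It is almost certainly better to break it into separate queries and then combine them with a union.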
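As a rough sketch of that idea, two of the branches could be written as separate queries and combined as shown here. The select list is a guess at useful columns, and the file list ('4', '5') follows the comment in the question rather than its code. Both are assumptions to adjust.

```sql
-- Branch: Company = US, master number starts with 003U, match on RDI.
SELECT r.[Company], r.[MASTERNUM], r.[Run], r.[RDI], mp.[File]
FROM [raw_inventory] r
JOIN [master_products] mp
    ON r.[RDI] = mp.[RDI]
WHERE r.[Company] = 'US'
  AND r.[MASTERNUM] LIKE '003U%'
  AND mp.[File] IN ('4', '5')

UNION ALL

-- Branch: Company = US, master number starts with 003 plus a letter
-- other than U, match on run number or indexer.
SELECT r.[Company], r.[MASTERNUM], r.[Run], r.[RDI], mp.[File]
FROM [raw_inventory] r
JOIN [master_products] mp
    ON r.[Run] IN (SELECT value FROM STRING_SPLIT(mp.[Run Number], ';'))
    OR r.[Run] IN (SELECT value FROM STRING_SPLIT(mp.[Indexer], ';'))
WHERE r.[Company] = 'US'
  AND PATINDEX('%003[A-TV-Za-tv-z]%', r.[MASTERNUM]) = 1;
```

UNION ALL is used on the assumption that the branches select disjoint sets of inventory rows; if they can overlap, UNION removes the duplicates at extra cost. These are inner joins, so inventory rows with no match, which the original LEFT JOIN keeps, would need to be added back separately.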

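I have one more particular concern, which applies even if you break this into multiple queries: your reliance on STRING_SPLIT in several places. Those join conditions cannot be helped by indexes and will probably cause table scans, so the STRING_SPLIT conditions are almost certain to perform poorly.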

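I would try to avoid concatenated (delimited) columns in the master table, but short of remodeling, a tactical approach might be to prepare an indexed temp table. Alternatively (as also suggested in the comments), use an apply operator (CROSS APPLY or OUTER APPLY, as appropriate) to simplify how you access the concatenated column details of the master table, and do this in a CTE to avoid costly repetition.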
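A minimal sketch of the indexed temp table approach follows, splitting one delimited column once with CROSS APPLY. The temp table name #mp_run, the index name, and the varchar(50) length are hypothetical choices, not taken from the question.

```sql
-- Split [Run Number] once, one output row per delimited value.
-- The CAST gives the column a fixed length so that it can be indexed;
-- varchar(50) is an assumed size.
SELECT mp.[RDI],
       mp.[File],
       CAST(s.value AS varchar(50)) AS RunNumber
INTO #mp_run
FROM [master_products] mp
CROSS APPLY STRING_SPLIT(mp.[Run Number], ';') s;

CREATE INDEX IX_mp_run_RunNumber ON #mp_run (RunNumber);

-- The join is now a plain equality on an indexed column.
SELECT r.[MASTERNUM], r.[Run], x.[RDI], x.[File]
FROM [raw_inventory] r
JOIN #mp_run x
    ON x.RunNumber = r.[Run]
WHERE r.[Company] = 'US';
```

The same CROSS APPLY can sit inside a CTE instead of a temp table, but a CTE cannot carry an index, so the temp table is probably the better fit when the split values are joined more than once.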
