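How to join to the first row

Question · Votes: 773 · Answers: 11

I'll use a concrete, but hypothetical, example.

Each order normally has only one line item:

Orders: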

OrderGUID   OrderNumber
=========   ============
{FFB2...}   STL-7442-1      
{3EC6...}   MPT-9931-8A

LineItems:

LineItemGUID   Order ID Quantity   Description
============   ======== ========   =================================
{098FBE3...}   1        7          prefabulated amulite
{1609B09...}   2        32         spurving bearing

But occasionally there will be an order with two line items:

LineItemID   Order ID    Quantity   Description
==========   ========    ========   =================================
{A58A1...}   6,784,329   5          pentametric fan
{0E9BC...}   6,784,329   5          differential girdlespring 

Normally when showing the orders to the user:

SELECT Orders.OrderNumber, LineItems.Quantity, LineItems.Description
FROM Orders
    INNER JOIN LineItems 
    ON Orders.OrderID = LineItems.OrderID

I want to show the single item on the order. But with this occasional order containing two (or more) items, the orders would appear to be duplicated:

OrderNumber   Quantity   Description
===========   ========   ====================
STL-7442-1    7          prefabulated amulite
MPT-9931-8A   32         spurving bearing
KSG-0619-81   5          panametric fan
KSG-0619-81   5          differential girdlespring

What I really want is to have SQL Server just pick one, as it will be good enough:

OrderNumber   Quantity   Description
===========   ========   ====================
STL-7442-1    7          prefabulated amulite
MPT-9931-8A   32         differential girdlespring
KSG-0619-81   5          panametric fan

If I get adventurous, I might show the user an ellipsis to indicate that there's more than one:

OrderNumber   Quantity   Description
===========   ========   ====================
STL-7442-1    7          prefabulated amulite
MPT-9931-8A   32         differential girdlespring
KSG-0619-81   5          panametric fan, ...

So the question is how to either

  • eliminate "duplicate" rows
  • only join to one of the rows, to avoid duplication

First attempt

My first attempt was to only join to the "TOP 1" line items:

SELECT Orders.OrderNumber, LineItems.Quantity, LineItems.Description
FROM Orders
    INNER JOIN (
        SELECT TOP 1 LineItems.Quantity, LineItems.Description
        FROM LineItems
        WHERE LineItems.OrderID = Orders.OrderID) LineItems2
    ON 1=1

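But that gives the error:

The column or prefix 'Orders' does not match with a table name or alias name used in the query.

Presumably because the inner select doesn't see the outer table.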


SELECT   Orders.OrderNumber, LineItems.Quantity, LineItems.Description
FROM     Orders
JOIN     LineItems
ON       LineItems.LineItemGUID =
         (
         SELECT  TOP 1 LineItemGUID
         FROM    LineItems
         WHERE   OrderID = Orders.OrderID
         )

In SQL Server 2005 and above, you can replace the INNER JOIN with CROSS APPLY:

SELECT  Orders.OrderNumber, LineItems2.Quantity, LineItems2.Description
FROM    Orders
CROSS APPLY
        (
        SELECT  TOP 1 LineItems.Quantity, LineItems.Description
        FROM    LineItems
        WHERE   LineItems.OrderID = Orders.OrderID
        ) LineItems2

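Note that TOP 1 without ORDER BY is not deterministic: this query will get you one line item per order, but it is not defined which one.

Multiple invocations of the query can give you different line items for the same order, even if the underlying data did not change.

If you want a deterministic order, you should add an ORDER BY clause to the innermost query.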
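
As a minimal sketch of that last point, here is the CROSS APPLY query with an ORDER BY added to the inner query. Ordering by LineItemID is an assumption made for illustration; use whichever column (or columns) defines "first" for your data, ideally one that is unique within an order.

```sql
-- Sketch: same CROSS APPLY query, made deterministic with ORDER BY.
-- Assumption: LineItemID is an acceptable definition of "first".
SELECT  Orders.OrderNumber, LineItems2.Quantity, LineItems2.Description
FROM    Orders
CROSS APPLY
        (
        SELECT  TOP 1 LineItems.Quantity, LineItems.Description
        FROM    LineItems
        WHERE   LineItems.OrderID = Orders.OrderID
        ORDER BY LineItems.LineItemID
        ) LineItems2
```

CROSS APPLY returns only the orders that have at least one line item. If orders without line items should still appear (with NULL Quantity and Description), OUTER APPLY can be used in its place.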

I know this question was answered a while ago, but when dealing with large data sets, nested queries can be costly. Here is a different solution where the nested query will only be run once, instead of for each row returned.

SELECT
    Orders.OrderNumber,
    LineItems.Quantity,
    LineItems.Description
FROM
    Orders
    INNER JOIN (
        -- one row per order: the highest LineItemID for that order
        SELECT
            Orders.OrderNumber,
            MAX(LineItems.LineItemID) AS LineItemID
        FROM
            Orders
            INNER JOIN LineItems
            ON Orders.OrderID = LineItems.OrderID
        GROUP BY Orders.OrderNumber
    ) AS Items ON Orders.OrderNumber = Items.OrderNumber
    INNER JOIN LineItems
    ON Items.LineItemID = LineItems.LineItemID

You could do:

SELECT
    Orders.OrderNumber,
    LineItems.Quantity,
    LineItems.Description
FROM
    Orders
    INNER JOIN LineItems
    ON Orders.OrderID = LineItems.OrderID
WHERE
    LineItems.LineItemID = (
        SELECT MIN(LineItemID)
        FROM   LineItems
        WHERE  OrderID = Orders.OrderID
    )

This requires an index (or primary key) on LineItems.LineItemID and an index on LineItems.OrderID, or it will be slow.
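
For reference, the indexes mentioned above could be created roughly like this. The index names are made up for the example, and the first statement is unnecessary if LineItemID is already the primary key.

```sql
-- Hypothetical index names; adjust to your own naming convention.
CREATE INDEX IX_LineItems_LineItemID ON LineItems (LineItemID);
CREATE INDEX IX_LineItems_OrderID    ON LineItems (OrderID);
```

A single composite index on (OrderID, LineItemID) may also serve the MIN subquery well, but that is worth checking against the actual execution plan rather than assuming.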

@Quassnoi's answer is good. In some cases (especially if the outer table is big), a more efficient query might use window functions, like this:

SELECT  Orders.OrderNumber, LineItems2.Quantity, LineItems2.Description
FROM    Orders
LEFT JOIN
        (
        SELECT  LineItems.Quantity, LineItems.Description, OrderId,
                ROW_NUMBER() OVER (PARTITION BY OrderId ORDER BY (SELECT NULL)) AS RowNum
        FROM    LineItems
        ) LineItems2
ON      LineItems2.OrderId = Orders.OrderID
        AND RowNum = 1

Sometimes you just need to test which query gives better performance.
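
One way to run such a test in SQL Server is to switch on I/O and timing statistics for the session and run the candidate queries one after another. This is only a sketch: the two queries shown are the CROSS APPLY and ROW_NUMBER variants from this thread, and the numbers you get depend entirely on your own data and indexes.

```sql
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Candidate 1: CROSS APPLY
SELECT  Orders.OrderNumber, LineItems2.Quantity, LineItems2.Description
FROM    Orders
CROSS APPLY
        (
        SELECT  TOP 1 LineItems.Quantity, LineItems.Description
        FROM    LineItems
        WHERE   LineItems.OrderID = Orders.OrderID
        ) LineItems2;

-- Candidate 2: ROW_NUMBER
SELECT  Orders.OrderNumber, LineItems2.Quantity, LineItems2.Description
FROM    Orders
LEFT JOIN
        (
        SELECT  Quantity, Description, OrderID,
                ROW_NUMBER() OVER (PARTITION BY OrderID ORDER BY (SELECT NULL)) AS RowNum
        FROM    LineItems
        ) LineItems2
ON      LineItems2.OrderID = Orders.OrderID
        AND LineItems2.RowNum = 1;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The logical reads and elapsed times for each statement are reported alongside the results (in the Messages tab in SQL Server Management Studio).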

Another approach, using a common table expression:

with firstOnly as (
    select Orders.OrderNumber, LineItems.Quantity, LineItems.Description,
           ROW_NUMBER() over (partition by Orders.OrderID order by Orders.OrderID) lp
    from Orders
        join LineItems on Orders.OrderID = LineItems.OrderID
)
select *
from firstOnly
where lp = 1

Or, in the end, maybe you would like to show all the rows joined together?

Here is a comma-separated version:

select *
from Orders o
    cross apply (
        select CAST((select l.Description + ','
                     from LineItems l
                     where l.OrderID = o.OrderID
                     for xml path('')) as nvarchar(max)) l
    ) lines
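
On SQL Server 2017 and later, the same comma-separated list can probably be written more directly with STRING_AGG. This is a sketch that assumes the Orders and LineItems tables from the question; the Descriptions alias and the choice to sort by Description are made up for the example.

```sql
-- Requires SQL Server 2017 or later.
SELECT  o.OrderNumber,
        STRING_AGG(l.Description, ', ')
            WITHIN GROUP (ORDER BY l.Description) AS Descriptions
FROM    Orders AS o
JOIN    LineItems AS l
        ON l.OrderID = o.OrderID
GROUP BY o.OrderNumber;
```

Unlike the FOR XML PATH version, this leaves no trailing comma and does not XML-escape characters such as & or < in the descriptions.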

From SQL Server 2012 onwards, I think this will do the trick:

SELECT DISTINCT
    o.OrderNumber ,
    FIRST_VALUE(li.Quantity) OVER ( PARTITION BY o.OrderNumber ORDER BY li.Description ) AS Quantity ,
    FIRST_VALUE(li.Description) OVER ( PARTITION BY o.OrderNumber ORDER BY li.Description ) AS Description
FROM    Orders AS o
    INNER JOIN LineItems AS li ON o.OrderID = li.OrderID

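A correlated subquery is a subquery that depends on the outer query. It's like a for loop in SQL. The subquery will run once for each row in the outer query:

select * from users
join widgets on widgets.id = (
    select id from widgets
    where widgets.user_id = users.id
    order by created_at desc
    limit 1
)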

EDIT: never mind, Quassnoi has a better answer.

For SQL2K, something like this:

SELECT
    Orders.OrderNumber
  , LineItems.Quantity
  , LineItems.Description
FROM (
    SELECT
        Orders.OrderID
      , Orders.OrderNumber
      , FirstLineItemID = (
            SELECT TOP 1 LineItemID
            FROM LineItems
            WHERE LineItems.OrderID = Orders.OrderID
            ORDER BY LineItemID -- or whatever else
        )
    FROM Orders
) Orders
JOIN LineItems
    ON LineItems.OrderID = Orders.OrderID
    AND LineItems.LineItemID = Orders.FirstLineItemID

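My favorite way to run this query is with a NOT EXISTS clause. I believe this is the most efficient way to run this sort of query:

select o.OrderNumber,
       li.Quantity,
       li.Description
from Orders as o
inner join LineItems as li
    on li.OrderID = o.OrderID
where not exists (
    -- keep only the line item with the highest LineItemGUID per order
    select 1
    from LineItems as li_later
    where li_later.OrderID = o.OrderID
    and li_later.LineItemGUID > li.LineItemGUID
)

But I have not tested this method against the other methods suggested here.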

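Tried the CROSS APPLY; it works nicely, but takes longer. Adjusted the line item columns to use MAX and added a GROUP BY, which kept the speed and dropped the extra record.

Here's the adjusted query:

SELECT Orders.OrderNumber, max(LineItems.Quantity), max(LineItems.Description)
FROM Orders
    INNER JOIN LineItems
    ON Orders.OrderID = LineItems.OrderID
GROUP BY Orders.OrderNumber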
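
One thing to check with this approach: MAX is evaluated per column, so the Quantity and Description returned for an order are not guaranteed to come from the same line item. A small sketch with made-up data (the table variable and its rows are hypothetical, not taken from the question):

```sql
-- Hypothetical data: one order with two line items.
DECLARE @LineItems TABLE (OrderID int, Quantity int, Description varchar(50));

INSERT INTO @LineItems (OrderID, Quantity, Description)
VALUES (1, 2, 'a widget'),
       (1, 1, 'b widget');

SELECT  OrderID,
        MAX(Quantity)    AS Quantity,
        MAX(Description) AS Description
FROM    @LineItems
GROUP BY OrderID;
-- Returns a single row: 1, 2, 'b widget'
-- (quantity from the first item, description from the second)
```

In the question's sample data both items on the two-item order have the same quantity, so the mismatch would not be visible there.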

Tags: sql, sql-server, tsql, sql-server-2000