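I have the following data, and I would like to combine the rows into a single row based on the stop type ID. The stop types come in sequence, meaning a 0 or 2 always appears before a 3. I believe LEAD is what I want to use, but it doesn't seem to work the way I expect and I can't figure out why.
This is the raw data, ordered by the GMT date/time.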
OrderId GmtDeliveryDateTime StopTypeId
3650 2019-01-11 13:04:44.000 0
3650 2019-01-11 14:22:09.000 3
3650 2019-01-11 15:13:35.000 2
3650 2019-01-11 16:05:14.000 3
I would like it to look like this:
OrderId GmtDeliveryDateTime StopTypeId GmtDeliveryDateTime StopTypeId
3650 2019-01-11 13:04:44.000 0 2019-01-11 14:22:09.000 3
3650 2019-01-11 15:13:35.000 2 2019-01-11 16:05:14.000 3
This is the query I am using:
SELECT *
FROM (
SELECT OrderId,
GmtDeliveryDateTime,
StopTypeId,
LEAD(StopTypeId) OVER (ORDER BY GmtDeliveryDateTime, StopTypeId) NxtStop
FROM table
)
Here is the result the query above produces:
OrderId GmtDeliveryDateTime StopTypeId NxtStop
3650 2019-01-11 13:04:44.000 0 2
3650 2019-01-11 15:13:35.000 2 2
3650 2019-01-11 14:22:09.000 3 3
3650 2019-01-11 16:05:14.000 3 2
What is wrong with my query?
If you can guarantee that the rows are interleaved (each 0 or 2 is immediately followed by its 3), you can do this:
SELECT t.*
FROM (SELECT OrderId,
GmtDeliveryDateTime,
StopTypeId,
LEAD(GmtDeliveryDateTime) OVER (PARTITION BY OrderId ORDER BY GmtDeliveryDateTime, StopTypeId) as next_GmtDeliveryDateTime,
LEAD(StopTypeId) OVER (PARTITION BY OrderId ORDER BY GmtDeliveryDateTime, StopTypeId) as next_StopTypeId
FROM table t
) t
WHERE StopTypeId <> 3;
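To check the query above against the sample data, here is a minimal sketch that runs it through Python's built-in sqlite3 module. SQLite is only a stand-in for the asker's database (window functions need SQLite 3.25 or later), and the table name stops is hypothetical; the ORDER BY on the outer query is added just to make the printed order deterministic.

```python
import sqlite3

# In-memory SQLite database as a stand-in; "stops" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stops (OrderId INTEGER, GmtDeliveryDateTime TEXT, StopTypeId INTEGER)"
)
conn.executemany(
    "INSERT INTO stops VALUES (?, ?, ?)",
    [
        (3650, "2019-01-11 13:04:44.000", 0),
        (3650, "2019-01-11 14:22:09.000", 3),
        (3650, "2019-01-11 15:13:35.000", 2),
        (3650, "2019-01-11 16:05:14.000", 3),
    ],
)

# LEAD pulls the next row's values onto the current row; the outer filter
# then keeps only the rows that start a pair (StopTypeId 0 or 2).
pairs = conn.execute(
    """
    SELECT t.*
    FROM (SELECT OrderId,
                 GmtDeliveryDateTime,
                 StopTypeId,
                 LEAD(GmtDeliveryDateTime) OVER (PARTITION BY OrderId ORDER BY GmtDeliveryDateTime, StopTypeId) AS next_GmtDeliveryDateTime,
                 LEAD(StopTypeId) OVER (PARTITION BY OrderId ORDER BY GmtDeliveryDateTime, StopTypeId) AS next_StopTypeId
          FROM stops
         ) t
    WHERE StopTypeId <> 3
    ORDER BY GmtDeliveryDateTime
    """
).fetchall()

for row in pairs:
    print(row)
```

Each printed row should carry a 0 or 2 stop together with the 3 that follows it. Note that the filter has to sit outside the subquery: filtering out the 3 rows before LEAD runs would leave nothing for LEAD to pick up.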
I understand that you are trying to group the records in pairs, each record with the one that follows it, ordered by GmtDeliveryDateTime.
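Here is a solution that uses LEAD() in a subquery to fetch the values of the following record, and ROW_NUMBER() to assign each record a number, ordered by GmtDeliveryDateTime. The outer query then uses the row number to keep only every other record (the even row numbers are filtered out):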
SELECT *
FROM (
SELECT
OrderId,
GmtDeliveryDateTime,
StopTypeId,
LEAD(GmtDeliveryDateTime) OVER (ORDER BY GmtDeliveryDateTime) NxtGmtDeliveryDateTime,
LEAD(StopTypeId) OVER (ORDER BY GmtDeliveryDateTime) NxtStopTypeId,
ROW_NUMBER() OVER (ORDER BY GmtDeliveryDateTime) rn
FROM mytable
) x WHERE rn % 2 <> 0
Note: I removed StopTypeId from the ORDER BY, since your sample data does not show any duplicate GmtDeliveryDateTime values.
This demo on DB Fiddle with your sample data returns:
<pre>
OrderId | GmtDeliveryDateTime | StopTypeId | NxtGmtDeliveryDateTime | NxtStopTypeId | rn
------: | :------------------ | ---------: | :--------------------- | ------------: | :-
3650 | 11/01/2019 00:00:00 | 0 | 11/01/2019 00:00:00 | 3 | 1
3650 | 11/01/2019 00:00:00 | 2 | 11/01/2019 00:00:00 | 3 | 3
</pre>
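The query above has no PARTITION BY, which is fine for a single order. If the table holds several orders whose timestamps overlap, the windows would presumably need to be partitioned by OrderId so that rows from different orders are not paired up. Here is a rough sketch of that variant, run through Python's sqlite3 module as a stand-in (SQLite 3.25 or later). Order 3651 and its timestamps are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (OrderId INTEGER, GmtDeliveryDateTime TEXT, StopTypeId INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (3650, "2019-01-11 13:04:44.000", 0),
        (3650, "2019-01-11 14:22:09.000", 3),
        (3650, "2019-01-11 15:13:35.000", 2),
        (3650, "2019-01-11 16:05:14.000", 3),
        # Hypothetical second order whose stops overlap in time with 3650.
        (3651, "2019-01-11 13:30:00.000", 0),
        (3651, "2019-01-11 15:00:00.000", 3),
    ],
)

# Every window is partitioned by OrderId, so the row numbers restart at 1
# for each order and LEAD never reaches into another order's rows.
paired = conn.execute(
    """
    SELECT OrderId, GmtDeliveryDateTime, StopTypeId,
           NxtGmtDeliveryDateTime, NxtStopTypeId
    FROM (
        SELECT OrderId,
               GmtDeliveryDateTime,
               StopTypeId,
               LEAD(GmtDeliveryDateTime) OVER (PARTITION BY OrderId ORDER BY GmtDeliveryDateTime) AS NxtGmtDeliveryDateTime,
               LEAD(StopTypeId) OVER (PARTITION BY OrderId ORDER BY GmtDeliveryDateTime) AS NxtStopTypeId,
               ROW_NUMBER() OVER (PARTITION BY OrderId ORDER BY GmtDeliveryDateTime) AS rn
        FROM mytable
    ) x
    WHERE rn % 2 <> 0
    ORDER BY OrderId, GmtDeliveryDateTime
    """
).fetchall()

for row in paired:
    print(row)
```

The odd/even filter still relies on the rows of each order strictly alternating between a start stop and a 3; a missing row would shift every later pair.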
You can try the following:
SELECT OrderId,
MIN(GmtDeliveryDateTime) as starttime,
MIN(StopTypeId) as startStopTypeId,
MAX(GmtDeliveryDateTime) as endtime,
MAX(StopTypeId) as nextStopTypeId
from
(
SELECT t.*,
row_number() over(partition by OrderId order by GmtDeliveryDateTime)-
sum(case when StopTypeId=3 then 1 else 0 end) over(partition by OrderId order by GmtDeliveryDateTime) as grp
FROM t1 t
)A group by grp,OrderId
OUTPUT:
OrderId starttime startStopTypeId endtime nextStopTypeId
3650 11/01/2019 13:04:44 0 11/01/2019 14:22:09 3
3650 11/01/2019 15:13:35 2 11/01/2019 16:05:14 3
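Assuming that consecutive rows with stop IDs 0,3 or 2,3 identify a group for a given order ID, you can use a running sum to classify each consecutive 0,3 or 2,3 pair of rows into one group, and then use GROUP BY to get the desired result.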
SELECT OrderId,
MIN(GmtDeliveryDateTime),
MIN(StopTypeId),
MAX(GmtDeliveryDateTime),
MAX(StopTypeId)
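FROM (SELECT t.*,
-- count the rows that start a pair (StopTypeId <> 3), so each 0 or 2 lands in the same group as the 3 that follows it
sum(case when StopTypeId<>3 then 1 else 0 end) over(partition by OrderId order by GmtDeliveryDateTime) as grp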
FROM table t
) t
GROUP BY OrderId,grp
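With the running-sum approach it is worth checking which rows the sum increments on, because the sum includes the current row. The sketch below (Python's sqlite3 as a stand-in, SQLite 3.25 or later, hypothetical table name stops) prints the group number each sample row gets under two conditions, then runs the GROUP BY with the one that keeps each pair together. The column names grp_on_3 and grp_on_start are mine, not from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stops (OrderId INTEGER, GmtDeliveryDateTime TEXT, StopTypeId INTEGER)"
)
conn.executemany(
    "INSERT INTO stops VALUES (?, ?, ?)",
    [
        (3650, "2019-01-11 13:04:44.000", 0),
        (3650, "2019-01-11 14:22:09.000", 3),
        (3650, "2019-01-11 15:13:35.000", 2),
        (3650, "2019-01-11 16:05:14.000", 3),
    ],
)

# Group number per row under two running-sum conditions.
groups = conn.execute(
    """
    SELECT StopTypeId,
           SUM(CASE WHEN StopTypeId = 3 THEN 1 ELSE 0 END)
               OVER (PARTITION BY OrderId ORDER BY GmtDeliveryDateTime) AS grp_on_3,
           SUM(CASE WHEN StopTypeId <> 3 THEN 1 ELSE 0 END)
               OVER (PARTITION BY OrderId ORDER BY GmtDeliveryDateTime) AS grp_on_start
    FROM stops
    ORDER BY GmtDeliveryDateTime
    """
).fetchall()
print(groups)  # (StopTypeId, grp_on_3, grp_on_start) per row

# Collapse each group into one row, incrementing on the row that starts a pair.
result = conn.execute(
    """
    SELECT OrderId,
           MIN(GmtDeliveryDateTime) AS starttime,
           MIN(StopTypeId)          AS startStopTypeId,
           MAX(GmtDeliveryDateTime) AS endtime,
           MAX(StopTypeId)          AS nextStopTypeId
    FROM (SELECT t.*,
                 SUM(CASE WHEN StopTypeId <> 3 THEN 1 ELSE 0 END)
                     OVER (PARTITION BY OrderId ORDER BY GmtDeliveryDateTime) AS grp
          FROM stops t
         ) t
    GROUP BY OrderId, grp
    ORDER BY starttime
    """
).fetchall()

for row in result:
    print(row)
```

On the sample data, incrementing on StopTypeId = 3 numbers the rows 0, 1, 1, 2, which puts the first 3 in the same group as the following 2. Incrementing on the start rows numbers them 1, 1, 2, 2, which matches the 0,3 and 2,3 pairs. MIN and MAX on StopTypeId only work here because 3 is the largest stop type in each pair.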
I know others have already answered, but I took your initial query and modified it slightly to get the result you want:
DROP TABLE IF EXISTS #SO;
CREATE TABLE #SO
(
OrderID INT ,
DeliveryDate DATETIME ,
StopTypeID INT
);
INSERT INTO #SO ( OrderID ,
DeliveryDate ,
StopTypeID )
VALUES ( 3650, '2019-01-11 13:04:44.000', 0 ) ,
( 3650, '2019-01-11 14:22:09.000', 3 ) ,
( 3650, '2019-01-11 15:13:35.000', 2 ) ,
( 3650, '2019-01-11 16:05:14.000', 3 );
SELECT x.OrderID ,
x.DeliveryDate ,
x.StopTypeID ,
x.NxtStop ,
ROW_NUMBER () OVER ( ORDER BY x.DeliveryDate ) AS rownumber
INTO #TestData
FROM
(
SELECT OrderID ,
DeliveryDate ,
StopTypeID ,
LEAD ( StopTypeID ) OVER ( ORDER BY DeliveryDate , StopTypeID ) NxtStop
FROM #SO
) AS x;
SELECT a.OrderID ,
a.DeliveryDate ,
a.StopTypeID ,
b.DeliveryDate ,
b.StopTypeID
FROM #TestData AS a
INNER JOIN #TestData AS b ON b.OrderID = a.OrderID
AND a.NxtStop = b.StopTypeID
AND a.rownumber + 1 = b.rownumber
WHERE a.StopTypeID < b.StopTypeID;
DROP TABLE IF EXISTS #TestData;