I have a table that records when employees clock in and clock out, but it tracks the two events as separate rows.
Input:
+------------+------------+---------------+
| EmployeeID | Event_Date | Event_Type    |
+------------+------------+---------------+
| 2450770    | 2020/01/02 | 'Clocked Out' | -- Doesn't have a clocked in time within desired time range
| 2195326    | 2020/01/06 | 'Clocked In'  |
| 2195326    | 2020/01/10 | 'Clocked Out' |
| 800455     | 2020/01/15 | 'Clocked In'  |
| 2450770    | 2020/01/15 | 'Clocked In'  | -- No clock out time yet
| 800455     | 2020/01/22 | 'Clocked Out' |
| 2195326    | 2020/01/23 | 'Clocked In'  |
| 2331340    | 2020/01/25 | 'Clocked In'  |
| 2195326    | 2020/01/27 | 'Clocked Out' |
| 2331340    | 2020/02/01 | 'Clocked Out' |
| 2515957    | 2020/02/05 | 'Clocked In'  |
|            |            |               | -- etc
+------------+------------+---------------+
Desired output:
+------------+------------+-------------+
| EmployeeID | Clocked_In | Clocked_Out |
+------------+------------+-------------+
| 2195326    | 2020/01/06 | 2020/01/10  |
| 800455     | 2020/01/15 | 2020/01/22  |
| 2450770    | 2020/01/15 | NULL        |
| 2195326    | 2020/01/23 | 2020/01/27  |
| 2331340    | 2020/01/25 | 2020/02/01  |
+------------+------------+-------------+
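This seems like a simple problem, but I can't quite figure it out: how do I pair up those Clocked In and Clocked Out events so that I can see everyone who clocked in during January, along with the corresponding clock-out date (if one exists), with no date restriction on the clock-out? Complications include employees without a matching clock-in or clock-out, employees who clock in and out in different months, and employees who clock in and out several times within a month. As far as I know, nobody clocks in twice in a row (or clocks out twice in a row).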
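Something like this assigns a row number to each event, partitioned by EmployeeID and ordered by Event_Date. Once you have that, join each row to the row with the same EmployeeID and the next row number.
This handles most, if not all, of the complications, including a clock-in date that is later than the employee's clock-out date (as with 2450770 in the sample data).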
-- Sample data from the question
DROP TABLE IF EXISTS #temp;
CREATE TABLE #temp (EmployeeID VARCHAR(256), Event_Date DATE, Event_Type VARCHAR(256));
INSERT INTO #temp
VALUES
( '2450770', '2020/01/02', 'Clocked Out'),
( '2195326', '2020/01/06', 'Clocked In'),
( '2195326', '2020/01/10', 'Clocked Out'),
( '800455', '2020/01/15', 'Clocked In'),
( '2450770', '2020/01/15', 'Clocked In'),
( '800455', '2020/01/22', 'Clocked Out'),
( '2195326', '2020/01/23', 'Clocked In'),
( '2331340', '2020/01/25', 'Clocked In'),
( '2195326', '2020/01/27', 'Clocked Out'),
( '2331340', '2020/02/01', 'Clocked Out'),
( '2515957', '2020/02/05', 'Clocked In')
-- Number each employee's events in date order
DROP TABLE IF EXISTS #output;
SELECT
    EmployeeID,
    Event_Date,
    Event_Type,
    ROW_NUMBER() OVER(PARTITION BY EmployeeID ORDER BY Event_Date) AS RN
INTO #output
FROM #temp;
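-- Pair each clock-in with the same employee's next event, kept only when that event is a clock-out
SELECT
    I.EmployeeID,
    I.Event_Date AS [Clocked In],
    O.Event_Date AS [Clocked Out]
FROM #output I
LEFT JOIN #output O
    -- Compare the two keys separately: VARCHAR + BIGINT is numeric addition in T-SQL,
    -- so EmployeeID + RN could collide across different employees
    ON O.EmployeeID = I.EmployeeID
    AND O.RN = I.RN + 1
    AND O.Event_Type = 'Clocked Out'
WHERE I.Event_Type = 'Clocked In';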
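As an alternative sketch, not part of the approach above: on SQL Server 2012 or later, LEAD() can look at each employee's next event directly, which avoids the intermediate table and the self-join. It assumes the #temp table above is already populated. The January 2020 bounds are taken from the question, and the names Events, Next_Type and Next_Date are made up for illustration.

```sql
-- Assumption: #temp exists and holds the sample rows shown earlier.
WITH Events AS (
    SELECT
        EmployeeID,
        Event_Date,
        Event_Type,
        -- Type and date of the same employee's next event (NULL if there is none)
        LEAD(Event_Type) OVER (PARTITION BY EmployeeID ORDER BY Event_Date) AS Next_Type,
        LEAD(Event_Date) OVER (PARTITION BY EmployeeID ORDER BY Event_Date) AS Next_Date
    FROM #temp
)
SELECT
    EmployeeID,
    Event_Date AS Clocked_In,
    -- Report the next event only when it really is a clock-out
    CASE WHEN Next_Type = 'Clocked Out' THEN Next_Date END AS Clocked_Out
FROM Events
WHERE Event_Type = 'Clocked In'
  -- Restrict the clock-in to January 2020; the clock-out date is left unrestricted
  AND Event_Date >= '2020-01-01'
  AND Event_Date <  '2020-02-01'
ORDER BY Clocked_In;
```

Against the sample data this should return the five rows of the desired output. Rows that share a clock-in date, such as the two on 2020/01/15, may come back in either order.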