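How can I write a CTE in SQL Server that, for each person, takes the date of the Type = NA row as the end date of the preceding record?
If several NA rows follow one another, take the first one when ordered by dt_eff asc.
If any other types appear between the start of a record and the Type = NA row, those rows should be ignored. Another type for the same person should only be considered once the previous type has ended (the person 124 scenario).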
Source data
Person Type dt_eff
123 A 2018-10-23 <Start of record>
123 NA 2018-11-19 <Should be the end date for the row above; don't select this row in the output>
123 NA 2018-12-25 <Don't select this row in the output>
124 A 2020-01-01 <Start of record>
124 B 2020-02-15 <Ignore; don't select in the output>
124 NA 2020-05-14 <Should be the end date for the start of record; don't select in the output>
124 C 2020-10-13 <The start record above has ended, so this is a new start>
124 NA 2021-01-15 <Should be the end date for the second start record>
124 A 2021-05-22 <The start record above has ended, so this is a new start>
124 T 2021-08-22 <Ignore; don't select in the output>
456 NA 2022-04-19 <Ignore, as there is no preceding record with a valid type>
456 A 2022-05-01 <Start of record; the end date is NULL because no Type = NA row follows>
456 B 2022-07-15 <Ignore>
Expected output
Person Type dt_start dt_end
123 A 2018-10-23 2018-11-19
124 A 2020-01-01 2020-05-14
124 C 2020-10-13 2021-01-15
124 A 2021-05-22 NULL
456 A 2022-05-01 NULL
DDL and DML for the source data above
CREATE TABLE Person (
Person INTEGER,
Type VARCHAR(3),
dt_eff Date
);
INSERT INTO Person (Person, Type, dt_eff)
VALUES
(123,'A','2018-10-23'),
(123,'NA','2018-11-19'),
(123,'NA','2018-12-25'),
(124,'A','2020-01-01'),
(124,'B','2020-02-15'),
(124,'NA','2020-05-14'),
(124,'C','2020-10-13'),
(124,'NA','2021-01-15'),
(124,'A','2021-05-22'),
(124,'T','2021-08-22'),
(456,'NA','2022-04-19'),
(456,'A','2022-05-01'),
(456,'B','2022-07-15')
My attempt
with cte1 as (
    select *
         , lead(dt_eff) over (partition by Person order by dt_eff) dt_eff_lead
         , lag(Type, 1, Type) over (partition by Person order by dt_eff) type_lag
    from Person
), cte2 as (
    select Person, Type, dt_eff Start_Date
         , dt_eff_lead
         , sum(case when Type <> type_lag and type='NA' then 1 else 0 end)
             over (partition by person order by dt_eff asc
                   rows between unbounded preceding and current row) TypeGroup
    from cte1
)
select Person, Type, Start_Date as dt_start
     , max(dt_eff_lead) over (partition by Person, TypeGroup) dt_end
from cte2
where Type<>'NA'
order by Person, Start_Date, Type;
You can group the rows by a running total of the NA rows:
SELECT Person, Type
     , dt_start, CASE WHEN cntNA > 0 THEN dt_end END AS dt_end
FROM (
    SELECT MIN(dt_eff) OVER(PARTITION BY person, grouping) AS dt_start
         , MAX(dt_eff) OVER(PARTITION BY person, grouping) AS dt_end
         , COUNT(CASE WHEN type = 'NA' THEN 1 END) OVER(PARTITION BY person, grouping) AS cntNA
         , ROW_NUMBER() OVER(PARTITION BY person, grouping ORDER BY dt_eff) AS startrow
         , *
    FROM (
        SELECT *
             , COUNT(CASE WHEN Type = 'NA' THEN 1 END) OVER(PARTITION BY Person ORDER BY dt_eff ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS grouping
        FROM
        (
            VALUES (123, N'A', N'2018-10-23')
                 , (123, N'NA', N'2018-11-19')
                 , (123, N'NA', N'2018-12-25')
                 , (124, N'A', N'2020-01-01')
                 , (124, N'B', N'2020-02-15')
                 , (124, N'NA', N'2020-05-14')
                 , (124, N'C', N'2020-10-13')
                 , (124, N'NA', N'2021-01-15')
                 , (124, N'A', N'2021-05-22')
                 , (124, N'T', N'2021-08-22')
                 , (456, N'NA', N'2022-04-19')
                 , (456, N'A', N'2022-05-01')
                 , (456, N'B', N'2022-07-15')
        ) t (Person,Type,dt_eff)
    ) x
) x
WHERE startrow = 1
  AND type <> 'NA'
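COUNT(CASE WHEN Type = 'NA' THEN 1 END) OVER(PARTITION BY Person ORDER BY dt_eff ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) creates the grouping value used above. Ending the frame at 1 PRECEDING means that an NA row is not counted on its own row, only on the rows after it. The first NA therefore stays in the same group as the rows before it, which is the grouping you are looking for.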
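To see the grouping value on its own, it may help to select it next to the raw rows. This is only a sketch: it assumes the Person table from the question's DDL, and grp is an illustrative alias that does not appear in the query above.

```sql
-- Sketch: show the running NA count (the grouping value) for each row.
-- Assumes the Person table from the question; grp is an illustrative alias.
SELECT Person, Type, dt_eff
     , COUNT(CASE WHEN Type = 'NA' THEN 1 END)
         OVER (PARTITION BY Person ORDER BY dt_eff
               ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS grp
FROM Person
ORDER BY Person, dt_eff;
```

For person 124 the seven rows should get the values 0, 0, 0, 1, 1, 2, 2: the first NA (2020-05-14) still belongs to group 0, and the counter only increases on the row after it.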
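Then you compute the min and max date per group, plus a row number, since you are only interested in the first row of each group.
cntNA holds the number of NA rows in the group, because dt_end has to be set to NULL for groups that contain no NA at all.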
Finally, you select the columns you are looking for. CASE WHEN cntNA > 0 THEN dt_end END produces the open-ended (NULL) end dates.
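Since the question asks for a CTE, the same idea can also be sketched as CTEs reading from the Person table instead of an inline VALUES list. The names grouped, summarized and grp are hypothetical. As a small variation on the query above, dt_end is taken as the MAX of the NA dates only, so a group without an NA row yields NULL without a separate cntNA column.

```sql
-- Sketch only: CTE form of the running-count approach.
-- Assumes the Person table from the question; CTE names and grp are illustrative.
-- Note: a statement that precedes WITH must be terminated with a semicolon.
WITH grouped AS (
    SELECT Person, Type, dt_eff
         , COUNT(CASE WHEN Type = 'NA' THEN 1 END)
             OVER (PARTITION BY Person ORDER BY dt_eff
                   ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS grp
    FROM Person
), summarized AS (
    SELECT Person, Type, dt_eff AS dt_start
           -- Only NA rows contribute a date, so groups without NA give NULL.
         , MAX(CASE WHEN Type = 'NA' THEN dt_eff END)
             OVER (PARTITION BY Person, grp) AS dt_end
         , ROW_NUMBER() OVER (PARTITION BY Person, grp ORDER BY dt_eff) AS startrow
    FROM grouped
)
SELECT Person, Type, dt_start, dt_end
FROM summarized
WHERE startrow = 1       -- keep only the first row of each group
  AND Type <> 'NA'       -- drop groups that start with an NA row
ORDER BY Person, dt_start;
```

Because each group ends at its first NA row, the first row of the group carries the start date and the single NA date in the group is the end date.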