Select the minimum or maximum date based on another column in SQL


I have the following dataset:

DROP TABLE IF EXISTS #df

CREATE TABLE #df 
(
    PTID VARCHAR(10),
    HospitalID VARCHAR(5),
    Procedure_Dt date,
    Check_In_Dt DATE,
);

INSERT INTO #df (PTID, HospitalID, Procedure_Dt, Check_In_Dt)
VALUES
('X0001', 'WY', '2021-07-25', '2021-07-23'),
('X0001', 'WY', '2021-07-25', '2021-10-24'),
('X0001', 'WY', '2021-07-25', '2021-10-27'),
('X0001', 'WY', '2021-07-25', '2021-06-24'),
('X0001', 'WY', '2021-07-25', '2022-06-10'),
('X0002', 'CA', '2022-08-25', '2022-08-26'),
('X0002', 'CA', '2022-08-25', '2022-08-27'),
('X0002', 'CA', '2022-08-25', '2022-08-29'),
('X0002', 'CA', '2022-08-25', '2022-09-22'),
('X0003', 'AL', '2023-02-02', NULL)

--SELECT * FROM #df

DROP TABLE IF EXISTS #df_datediff

;WITH CTE_datediff AS --Using only most recent quarter and year
(
    SELECT PTID
             , HospitalID
             , Procedure_Dt
             , Check_In_Dt
    FROM #df
)
SELECT DISTINCT a.PTID
             , HospitalID
             , Procedure_Dt
             , Check_In_Dt
     , DATEDIFF(dd, CAST(Check_In_Dt AS DATE), Procedure_Dt) AS Date_Diff
INTO #df_datediff
FROM CTE_datediff a

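I want to be able to select the Check_In_Dt closest to the procedure date. However, this gets complicated because some check-in dates fall after the procedure date and some fall before it.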

Ultimately I want the following final dataset:

DROP TABLE IF EXISTS #df_final

CREATE TABLE #df_final 
(
    PTID VARCHAR(10),
    HospitalID VARCHAR(5),
    Procedure_Dt date,
    Check_In_Dt DATE,
    Date_Diff smallint
);

INSERT INTO #df_final (PTID, HospitalID, Procedure_Dt, Check_In_Dt, Date_Diff)
VALUES
('X0001', 'WY', '2021-07-25', '2021-07-23', 2),
('X0002', 'CA', '2022-08-25', '2022-08-26', -1),
('X0003', 'AL', '2023-02-02', NULL, NULL)

I tried to do this by writing the following code:

SELECT a.PTID, HospitalID
             , Procedure_Dt
             , Check_In_Dt
             , a.Date_Diff
FROM #df_datediff a
    JOIN (SELECT PTID, MIN(Check_In_Dt) AS Check_In_Date FROM #df_datediff GROUP BY PTID) B
        ON a.PTID = B.PTID
           AND a.Check_In_Dt = B.Check_In_Date
UNION /*Since the join on MIN in the above query drops rows with a NULL Check_In_Dt, we use this union to add them back*/
SELECT a.PTID, HospitalID
             , Procedure_Dt
             , Check_In_Dt
             , a.Date_Diff
FROM #df_datediff a
WHERE Check_In_Dt IS NULL;

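The problem is that this selects a check-in date of '2021-06-24' for PTID X0001, whereas for PTID X0002 it selects the correct smallest negative value, '2022-08-26'. For X0001, '2021-07-23' should be selected.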
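To see why this happens, it can help to list each check-in next to its absolute distance from the procedure date. MIN(Check_In_Dt) returns the earliest check-in per patient, which is not necessarily the closest one. A small diagnostic sketch against the #df table above (the Abs_Days alias is only illustrative):

```sql
-- Diagnostic only: show how far each check-in is from the procedure date.
-- MIN(Check_In_Dt) picks the earliest date; the closest date is the one
-- with the smallest absolute day difference.
SELECT PTID
     , Check_In_Dt
     , Procedure_Dt
     , ABS(DATEDIFF(day, Check_In_Dt, Procedure_Dt)) AS Abs_Days  -- illustrative alias
FROM #df
WHERE PTID = 'X0001'
ORDER BY Abs_Days;
```

For X0001, '2021-07-23' sorts first at 2 days, while the earliest date, '2021-06-24', is 31 days away.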

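My goal is to keep check-in dates that fall 0-40 days before the procedure as the numerator. All other check-in dates should not be counted in the numerator.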
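The 0-40 day window could be expressed as a filter on the signed day difference. This is only a sketch, assuming "before" means Check_In_Dt on or before Procedure_Dt and that both ends of the range are inclusive; neither assumption is confirmed here.

```sql
-- Keep only check-ins that fall 0 to 40 days before the procedure (inclusive).
-- DATEDIFF(day, Check_In_Dt, Procedure_Dt) is positive when the check-in
-- comes first and negative when it comes after the procedure.
SELECT PTID
     , HospitalID
     , Procedure_Dt
     , Check_In_Dt
     , DATEDIFF(day, Check_In_Dt, Procedure_Dt) AS Date_Diff
FROM #df
WHERE DATEDIFF(day, Check_In_Dt, Procedure_Dt) BETWEEN 0 AND 40;
```

Rows with a NULL Check_In_Dt drop out because the comparison evaluates to UNKNOWN. With the sample data, only the X0001 check-ins on '2021-07-23' (2 days) and '2021-06-24' (31 days) pass this filter.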

Any tips would be appreciated.

sql-server date datediff
1 Answer

This is pretty much a top-n-per-group type of query; try the following:

select Ptid, HospitalId, Procedure_Dt, Check_In_Dt
from (
  select * , 
    Row_Number() 
      over(partition by ptid, HospitalId 
               order by Abs(DateDiff(day, Procedure_Dt, Check_In_Dt))
      ) rn
  from #df
)t
where rn = 1 and Check_In_Dt is not null;
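
Note that the query above filters out rows where Check_In_Dt is NULL, so X0003 would not appear in the result, and because SQL Server sorts NULL first in ascending order, a patient with both NULL and non-NULL check-ins could lose the rn = 1 slot to the NULL row. A possible variant, offered as a sketch rather than a definitive fix, sorts NULL check-ins last and returns Date_Diff as well:

```sql
-- Variant sketch: one row per patient/hospital, closest check-in first,
-- NULL check-ins only chosen when the patient has no other check-in.
SELECT PTID, HospitalID, Procedure_Dt, Check_In_Dt, Date_Diff
FROM (
    SELECT PTID
         , HospitalID
         , Procedure_Dt
         , Check_In_Dt
         , DATEDIFF(day, Check_In_Dt, Procedure_Dt) AS Date_Diff
         , ROW_NUMBER() OVER (
               PARTITION BY PTID, HospitalID
               ORDER BY CASE WHEN Check_In_Dt IS NULL THEN 1 ELSE 0 END  -- NULLs last
                      , ABS(DATEDIFF(day, Check_In_Dt, Procedure_Dt))    -- then closest
           ) AS rn
    FROM #df
) t
WHERE rn = 1;
```

Against the sample data this should yield the three rows listed in #df_final. If two check-ins are equally close to the procedure date, ROW_NUMBER picks one arbitrarily; adding Check_In_Dt as a final ORDER BY term would make that choice deterministic.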