I have the following table:
DROP TABLE IF EXISTS #df
CREATE TABLE #df
(
CommID VARCHAR(10),
ProvID VARCHAR(5),
VisitCount int,
[% Score] INT,
TimePeriod VARCHAR(10),
Median_VisitCount INT,
Average_VisitCount INT
);
INSERT INTO #df (CommID, ProvID, VisitCount, [% Score], TimePeriod, Median_VisitCount, Average_VisitCount)
VALUES
('AB345', '001', '65', .45, 'ThisYear', 48.5, 42),
('AB345', '002', '67', .64, 'ThisYear', 48.5, 42),
('AB345', '003', '32', .78, 'ThisYear', 48.5, 42),
('AB345', '004', '4', .32, 'ThisYear', 48.5, 42),
('AB345', '001', '23', .45, 'LastYear', 42.5, 41),
('AB345', '002', '56', .64, 'LastYear', 48.5, 41),
('AB345', '003', '31', .78, 'LastYear', 48.5, 41),
('AB345', '004', '54', .32, 'LastYear', 48.5, 41)
SELECT * FROM #df
I would like my final output to look like this:
DROP TABLE IF EXISTS #final
CREATE TABLE #final
(
CommID VARCHAR(10),
ProvID VARCHAR(5),
VisitCount int,
[% Score] INT,
TimePeriod VARCHAR(10),
Median_VisitCount INT,
Average_VisitCount INT,
Highest INT,
Lowest INT
);
INSERT INTO #final (CommID, ProvID, VisitCount, [% Score], TimePeriod, Median_VisitCount, Average_VisitCount, Highest, Lowest)
VALUES
('AB345', '001', '65', .45, 'ThisYear', 48.5, 42, 3, 1),
('AB345', '002', '67', .64, 'ThisYear', 48.5, 42, 2, 2),
('AB345', '003', '32', .78, 'ThisYear', 48.5, 42, 1, 3),
('AB345', '004', '4', .32, 'ThisYear', 48.5, 42, NULL, NULL),
('AB345', '001', '23', .45, 'LastYear', 42.5, 41, NULL, NULL),
('AB345', '002', '56', .64, 'LastYear', 48.5, 41, 1, 2),
('AB345', '003', '31', .78, 'LastYear', 48.5, 41, NULL, NULL),
('AB345', '004', '54', .32, 'LastYear', 48.5, 41, 2, 1)
SELECT * FROM #final
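For a given CommID and TimePeriod, I want to rank [% Score], but only for rows where VisitCount >= Average_VisitCount.
This is the code I wrote, but the ranking still takes rows below Average_VisitCount into account. I want any row whose VisitCount is less than Average_VisitCount to be left out of the ranking entirely: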
SELECT a.CommID
, a.ProvID
, a.VisitCount
, a.[% Score]
, CASE WHEN VisitCount >= a.Average_VisitCount
THEN RANK() OVER (PARTITION BY a.CommID, TimePeriod ORDER BY [a].[% Score] DESC)
ELSE NULL END AS Highest
, CASE WHEN VisitCount >= a.Average_VisitCount
THEN RANK() OVER (PARTITION BY a.CommID, TimePeriod ORDER BY [a].[% Score])
ELSE NULL END AS Lowest
, a.TimePeriod
, a.Median_VisitCount
, a.Average_VisitCount
FROM #df a
ORDER BY a.CommID, a.TimePeriod, a.VisitCount DESC
Your demo data declares [% Score] as INT, so I changed it to DECIMAL(5,2); otherwise every row would get a score of 0.
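As a quick illustration of why the column type matters, the following sketch converts one of the sample scores to both types. The column aliases are made up for this example; it assumes SQL Server, where converting a decimal value to INT truncates the fractional part.

```sql
-- Illustration only: the same literal converted to each type.
-- ScoreAsInt and ScoreAsDecimal are hypothetical aliases.
SELECT CAST(.45 AS INT)           AS ScoreAsInt,     -- fractional part is truncated: 0
       CAST(.45 AS DECIMAL(5,2))  AS ScoreAsDecimal; -- value is preserved: 0.45
```

With every score stored as 0, all rows in a partition tie, so no ranking can tell them apart.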
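You've already done most of the work. Basically, sort the rows that fall outside your range to the bottom so they don't interfere with the ranks you care about; then either leave them at the bottom of the ranking or make them show NULL: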
SELECT *
, CASE WHEN VisitCount >= Average_VisitCount
THEN DENSE_RANK() OVER (PARTITION BY CommID, TimePeriod
ORDER BY CASE WHEN VisitCount >= Average_VisitCount THEN [% Score] ELSE 999 END)
END
FROM #df
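The inner CASE expression ranks the out-of-range rows using the value 999 (an arbitrary value larger than any real score), which puts them after every qualifying row. The outer CASE expression makes the column return NULL for those rows.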
CommID | ProvID | VisitCount | % Score | TimePeriod | Median_VisitCount | Average_VisitCount | Rank |
---|---|---|---|---|---|---|---|
AB345 | 004 | 54 | 0.32 | LastYear | 48 | 41 | 1 |
AB345 | 002 | 56 | 0.64 | LastYear | 48 | 41 | 2 |
AB345 | 003 | 31 | 0.78 | LastYear | 48 | 41 | |
AB345 | 001 | 23 | 0.45 | LastYear | 42 | 41 | |
AB345 | 001 | 65 | 0.45 | ThisYear | 48 | 42 | 1 |
AB345 | 002 | 67 | 0.64 | ThisYear | 48 | 42 | 2 |
AB345 | 003 | 32 | 0.78 | ThisYear | 48 | 42 | |
AB345 | 004 | 4 | 0.32 | ThisYear | 48 | 42 | |
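The same idea can be extended to produce both the Highest and Lowest columns from the question. The sketch below is one possible variant, not the only way to do it: instead of a sentinel value such as 999, it sorts on an eligibility flag first, so the same pattern works for both ascending and descending order. It assumes #df is recreated with [% Score] as DECIMAL(5,2) and VisitCount inserted as plain integers.

```sql
-- Assumption: #df is rebuilt with [% Score] as DECIMAL(5,2) so the scores survive.
DROP TABLE IF EXISTS #df;
CREATE TABLE #df
(
    CommID VARCHAR(10),
    ProvID VARCHAR(5),
    VisitCount INT,
    [% Score] DECIMAL(5,2),
    TimePeriod VARCHAR(10),
    Median_VisitCount INT,
    Average_VisitCount INT
);
INSERT INTO #df (CommID, ProvID, VisitCount, [% Score], TimePeriod, Median_VisitCount, Average_VisitCount)
VALUES
('AB345', '001', 65, .45, 'ThisYear', 48.5, 42),
('AB345', '002', 67, .64, 'ThisYear', 48.5, 42),
('AB345', '003', 32, .78, 'ThisYear', 48.5, 42),
('AB345', '004', 4,  .32, 'ThisYear', 48.5, 42),
('AB345', '001', 23, .45, 'LastYear', 42.5, 41),
('AB345', '002', 56, .64, 'LastYear', 48.5, 41),
('AB345', '003', 31, .78, 'LastYear', 48.5, 41),
('AB345', '004', 54, .32, 'LastYear', 48.5, 41);

SELECT CommID, ProvID, VisitCount, [% Score], TimePeriod, Average_VisitCount
     -- Highest: qualifying rows sort first (flag 0), then by score descending.
     , CASE WHEN VisitCount >= Average_VisitCount
            THEN RANK() OVER (PARTITION BY CommID, TimePeriod
                              ORDER BY CASE WHEN VisitCount >= Average_VisitCount THEN 0 ELSE 1 END
                                     , [% Score] DESC)
       END AS Highest
     -- Lowest: same flag, then by score ascending.
     , CASE WHEN VisitCount >= Average_VisitCount
            THEN RANK() OVER (PARTITION BY CommID, TimePeriod
                              ORDER BY CASE WHEN VisitCount >= Average_VisitCount THEN 0 ELSE 1 END
                                     , [% Score])
       END AS Lowest
FROM #df
ORDER BY CommID, TimePeriod, ProvID;
```

Because the flag pushes non-qualifying rows after every qualifying one, the qualifying rows always receive ranks starting at 1, and the outer CASE hides the ranks of the rest. With the sample data, only ProvID 001 and 002 qualify for ThisYear and only 002 and 004 qualify for LastYear, so each period gets ranks 1 and 2 in both columns and NULL for the remaining rows.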