我有一个如下所示的Employee数据集。开始日期是员工的雇用日期,结束日期是员工离开雇主的日期。两者均为日期数据类型。
Gender StartDate EndDate
M 2010-04-30 2013-06-18
F 2010-01-09 2015-06-19
M 2009-09-08 2014-08-13
我想根据平均工作月数将员工数据划分为四分位数。结果还包括雇员总数(“雇员”列),男性雇员的百分比,女性雇员的百分比以及平均受雇月数。请在下面找到预期的结果:
Quartile Employee %Male %Female AvgMonths
1 25 50.00 50.00 10
2 25 50.00 50.00 10
请在下面找到我的查询,我不知道在哪里可以将四分位数计算添加到查询中。
declare @current date;
set @current='2012-12-31';
select count(*) as Employees,
cast(cast(AVG(CASE WHEN Gender = 'M' THEN 1.0 ELSE 0 END)*100 as decimal(18,2)) as nvarchar(5)) +'%' as Male,
cast(cast(AVG(CASE WHEN Gender = 'F' THEN 1.0 ELSE 0 END)*100 as decimal(18,2)) as nvarchar(5)) +'%' as Female,
AVG(CASE WHEN EndDate is null then DATEDIFF(MONTH, StartDate, @current)
when EndDate is not null then DATEDIFF(MONTH, StartDate, EndDate)
end) as AvgMonths
from dbo.DimEmployee
您可以使用rank()
和count(*)
作为窗口函数来计算四分位数:
select floor( rnk * 4.0 / cnt ) as quartile,
count(*) as Employees,
cast(cast(AVG(CASE WHEN Gender = 'M' THEN 1.0 ELSE 0 END)*100 as decimal(18,2)) as nvarchar(5)) + '%' as Male,
cast(cast(AVG(CASE WHEN Gender = 'F' THEN 1.0 ELSE 0 END)*100 as decimal(18,2)) as nvarchar(5)) + '%' as Female,
AVG(num_months * 1.0) as AvgMonths
from (select e.*,
rank() over (partition by datediff(month, startdate, coalesce(enddate, @current))) - 1 as rnk,
count(*) over () as cnt,
datediff(month, startdate, coalesce(enddate, @current)) as num_months
from dbo.DimEmployee e
) e
group by floor( rnk * 4.0 / cnt )