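I have two counts per machine, p and r. p should always be greater than or equal to r, but because of technical lag and short aggregation periods this is not always the case: the r counts often, but not always, reflect data from a previous period. Because the length of the lag is not constant, there is no way to know for certain which period an r value belongs to. Consequently, I can't simply shift all r counts back in time by a uniform amount, as this could create other discrepancies that weren't there before.
This situation cannot be changed and I have to work with the data as it is.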
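In the example below, you can see that the p count briefly "pauses" on machine 1 and slows significantly on machine 2, but the r counts keep returning values greater than p for a short while before they also "pause":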
-- Dummy data
declare @t table(d date,m int,p int,r int);
insert into @t values
 (getdate()-9,1,100,10),(getdate()-8,1,90,10),(getdate()-7,1,70,10),(getdate()-6,1,70,10),(getdate()-5,1,80,10)
,(getdate()-4,1,50,10),(getdate()-3,1,10,10),(getdate()-2,1,0,10),(getdate()-1,1,0,10),(getdate()+0,1,0,10)
,(getdate()+1,1,0,0),(getdate()+2,1,0,0),(getdate()+3,1,40,0),(getdate()+4,1,50,0),(getdate()+5,1,80,10)
,(getdate()-9,2,1100,100),(getdate()-8,2,190,100),(getdate()-7,2,170,100),(getdate()-6,2,170,100),(getdate()-5,2,180,100)
,(getdate()-4,2,150,100),(getdate()-3,2,110,100),(getdate()-2,2,10,100),(getdate()-1,2,10,100),(getdate()+0,2,10,100)
,(getdate()+1,2,10,0),(getdate()+2,2,10,0),(getdate()+3,2,140,0),(getdate()+4,2,150,0),(getdate()+5,2,180,100);
select * from @t order by m,d;
-- Output
+------------+---+------+-----+
| d | m | p | r |
+------------+---+------+-----+
| 2020-05-27 | 1 | 100 | 10 |
| 2020-05-28 | 1 | 90 | 10 |
| 2020-05-29 | 1 | 70 | 10 |
| 2020-05-30 | 1 | 70 | 10 |
| 2020-05-31 | 1 | 80 | 10 |
| 2020-06-01 | 1 | 50 | 10 |
| 2020-06-02 | 1 | 10 | 10 |
| 2020-06-03 | 1 | 0 | 10 |
| 2020-06-04 | 1 | 0 | 10 |
| 2020-06-05 | 1 | 0 | 10 |
| 2020-06-06 | 1 | 0 | 0 |
| 2020-06-07 | 1 | 0 | 0 |
| 2020-06-08 | 1 | 40 | 0 |
| 2020-06-09 | 1 | 50 | 0 |
| 2020-06-10 | 1 | 80 | 10 |
| 2020-05-27 | 2 | 1100 | 100 |
| 2020-05-28 | 2 | 190 | 100 |
| 2020-05-29 | 2 | 170 | 100 |
| 2020-05-30 | 2 | 170 | 100 |
| 2020-05-31 | 2 | 180 | 100 |
| 2020-06-01 | 2 | 150 | 100 |
| 2020-06-02 | 2 | 110 | 100 |
| 2020-06-03 | 2 | 10 | 100 |
| 2020-06-04 | 2 | 10 | 100 |
| 2020-06-05 | 2 | 10 | 100 |
| 2020-06-06 | 2 | 10 | 0 |
| 2020-06-07 | 2 | 10 | 0 |
| 2020-06-08 | 2 | 140 | 0 |
| 2020-06-09 | 2 | 150 | 0 |
| 2020-06-10 | 2 | 180 | 100 |
+------------+---+------+-----+
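I need to be able to adjust those r counts backwards in time in a reasonably sensible way, so that they are added to earlier rows such that every p count is greater than or equal to its corresponding r value.
For the m = 1 example above, the output could look like any one of the following r columns; I don't mind how the adjustment is spread, only that p >= r holds on every row: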
+------------+---+------+------+------+------+
| d | m | p | r1 | r2 | r3 |
+------------+---+------+------+------+------+
| 2020-05-27 | 1 | 100 | 10 | 10 | 10 |
| 2020-05-28 | 1 | 90 | 10 | 10 | 10 |
| 2020-05-29 | 1 | 70 | 10 | 15 | 10 |
| 2020-05-30 | 1 | 70 | 20 | 20 | 10 |) Note how the original 30 r counts
| 2020-05-31 | 1 | 80 | 20 | 20 | 10 |} that didn't follow the rule
| 2020-06-01 | 1 | 50 | 20 | 15 | 40 |) have been moved back in time
| 2020-06-02 | 1 | 10 | 10 | 10 | 10 |
| 2020-06-03 | 1 | 0 | 0 | 0 | 0 |
| 2020-06-04 | 1 | 0 | 0 | 0 | 0 |
| 2020-06-05 | 1 | 0 | 0 | 0 | 0 |
| 2020-06-06 | 1 | 0 | 0 | 0 | 0 |
| 2020-06-07 | 1 | 0 | 0 | 0 | 0 |
| 2020-06-08 | 1 | 40 | 0 | 0 | 0 |
| 2020-06-09 | 1 | 50 | 0 | 0 | 0 |
| 2020-06-10 | 1 | 80 | 10 | 10 | 10 |
+------------+---+------+------+------+------+
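For reference, a candidate adjustment could be checked with something like the query below. This is only a sketch: the @adj table and its r_adj column are hypothetical names for wherever the adjusted values end up, the sample rows are taken from the r3 column above, and it assumes that the per-machine total of r should be preserved by the adjustment.

```sql
-- Hypothetical table holding a candidate adjustment (r_adj) next to the original counts
declare @adj table(d date, m int, p int, r int, r_adj int);
insert into @adj values
 ('2020-06-01',1,50,10,40)
,('2020-06-02',1,10,10,10)
,('2020-06-03',1, 0,10, 0)
,('2020-06-04',1, 0,10, 0)
,('2020-06-05',1, 0,10, 0);

-- Both result columns should be 0 for an acceptable adjustment
select m
      ,sum(case when r_adj > p then 1 else 0 end) as rows_breaking_rule -- rows where p < r_adj
      ,sum(r) - sum(r_adj) as r_difference                              -- r lost or gained overall
from @adj
group by m;
```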
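I have tried to tackle this with window functions, rows between and so on, but I can't work out how to identify the r values that need to be reallocated to earlier periods, nor which p values to allocate them to. If I make any progress I will add it below, but any help is greatly appreciated.
The closest I have managed is the following, which works for the example above but fails when you change the p = 50 value to something less than 40. It also adjusts both forwards and backwards in time, when I only want to adjust backwards: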
with t as(
select row_number() over (partition by m order by d) as rn
,(row_number() over (partition by m order by d)-1) / 5 as gn
,*
from @t
where m = 1
)
select *
,case when p > r
then r + (sum(case when p < r then r else 0 end) over (partition by gn) / sum(case when p > r then 1 else 0 end) over (partition by gn))
else case when p = r
then r
else 0
end
end as r_adj
from t;
Attempt 2
Closer, but still adjusting forwards as well as backwards:
with t as(
select row_number() over (partition by m order by d) as rn
,(row_number() over (partition by m order by d)-1) / 10 as gn
,(row_number() over (partition by m order by d)+4) / 10 as gn2
,*
from @t
where m = 1
)
,r1 as(
select *
,case when p > r
then r + (sum(case when p < r then r - p else 0 end) over (partition by gn) / sum(case when p > r then 1. else 0. end) over (partition by gn))
else case when p = r
then r
else 0
end
end as r_adj
from t
)
select d
,m
,p
,r
,case when p > r_adj
then r_adj + (sum(case when p < r_adj then r_adj - p else 0 end) over (partition by gn2) / sum(case when p > r_adj then 1. else 0. end) over (partition by gn2))
else case when p = r_adj
then r_adj
else r_adj - (r_adj - p)
end
end as r_new
from r1
order by rn
;
One approach uses apply:
-- @t is the table variable from the question, aliased as t and t2
select t.*,
t2.r as imputed_r
from @t t outer apply
(select top (1) t2.*
from @t t2
where t2.m = t.m and
t2.d >= t.d and t2.r <= t.p
order by t2.d desc
) t2;
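If the excess must only ever move backwards, another option is a greedy carry-back: walk each machine's rows from the latest date to the earliest, cap r at p, and carry whatever does not fit to the next earlier row. The following is a minimal sketch of that idea using a recursive CTE, not a tested production solution. The names rn, r_adj and carry are introduced here for illustration, and the sample rows are the machine 1 data from the question with fixed dates. Any carry still left on a machine's earliest row could not be placed anywhere.

```sql
declare @t table(d date, m int, p int, r int);
insert into @t values
 ('2020-05-27',1,100,10),('2020-05-28',1,90,10),('2020-05-29',1,70,10)
,('2020-05-30',1,70,10),('2020-05-31',1,80,10),('2020-06-01',1,50,10)
,('2020-06-02',1,10,10),('2020-06-03',1,0,10),('2020-06-04',1,0,10)
,('2020-06-05',1,0,10),('2020-06-06',1,0,0),('2020-06-07',1,0,0)
,('2020-06-08',1,40,0),('2020-06-09',1,50,0),('2020-06-10',1,80,10);

with o as(
    -- number the rows newest-first so the recursion walks backwards in time
    select d, m, p, r
          ,row_number() over (partition by m order by d desc) as rn
    from @t
)
,c as(
    -- anchor: the latest row of each machine, with nothing carried in yet
    select d, m, p, r, rn
          ,case when r > p then p else r end     as r_adj
          ,case when r > p then r - p else 0 end as carry
    from o
    where rn = 1
    union all
    -- step: add the carried excess to the next earlier row, cap at p, carry the rest
    select o.d, o.m, o.p, o.r, o.rn
          ,case when o.r + c.carry > o.p then o.p else o.r + c.carry end
          ,case when o.r + c.carry > o.p then o.r + c.carry - o.p else 0 end
    from c
    join o
      on o.m = c.m
     and o.rn = c.rn + 1
)
select d, m, p, r, r_adj, carry
from c
order by m, d
option (maxrecursion 0); -- lift the default limit of 100 recursion levels
```

On the machine 1 sample this should give the same r_adj values as the r3 column in the question: the 30 excess r counts from 2020-06-03 to 2020-06-05 all land on 2020-06-01. Because the excess always goes to the nearest earlier row with spare capacity, it is not spread evenly as in the r1 or r2 columns.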