Calculate the average time difference between stages

Question (votes: -3, answers: 3)

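How can I calculate the average time difference between each stage?

The challenge with the actual data set is that not every id goes through all the stages. Some ids skip stages, and dates are not available for every stage of every id, as in the sample below.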

id    date        status
1     1/1/18      requirement
1     1/8/18      analysis
1     ?           design
1     1/30/18     closed
2     2/1/18      requirement
2     2/18/18     closed
3     1/2/18      requirement
3     1/29/18     analysis
3     ?           accepted 
3     2/5/18      closed

? - a date that is missing from our data

Expected output

id    date        status        time_spent
1     1/1/18      requirement   0
1     1/8/18      analysis      7
1     ?           design
1     1/30/18     closed        22
2     2/1/18      requirement   0
2     2/18/18     closed        17
3     1/2/18      requirement   0
3     1/29/18     analysis      27
3     ?           accepted
3     2/5/18      closed        24

status         avg(timespent)
requirement    0
analysis       17
design
closed         21
oracle
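3 Answers

0 votes

You can use the window function LAG (or LEAD) to get the date of the previous (or next) status for each id. That lets you calculate the time elapsed in each stage. After that, compute the average time for each stage.

Here is an example of how to do that: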

with input_data (id, dte, status) as (
SELECT 1, TO_DATE('1/1/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 1, TO_DATE('1/8/18','MM/DD/YY'), 'analysis' FROM DUAL UNION ALL
SELECT 1, NULL, 'design' FROM DUAL UNION ALL
SELECT 1, TO_DATE('1/30/18','MM/DD/YY'), 'closed' FROM DUAL UNION ALL
SELECT 2, TO_DATE('2/1/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 2, TO_DATE('2/18/18','MM/DD/YY'), 'closed' FROM DUAL UNION ALL
SELECT 3, TO_DATE('1/2/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 3, TO_DATE('1/29/18','MM/DD/YY'), 'analysis' FROM DUAL UNION ALL
SELECT 3, NULL, 'accepted' FROM DUAL UNION ALL
SELECT 3, TO_DATE('2/5/18','MM/DD/YY'), 'closed' FROM DUAL ),
----- Solution begins here
data_with_elapsed_days as (
SELECT id.*, dte-nvl(lag(dte ignore nulls) over ( partition by id order by dte ), dte) elapsed
from input_data id)
SELECT status, avg(elapsed)
FROM data_with_elapsed_days d
group by status
order by decode(status,'requirement',1,'analysis',2,'design',3,'accepted',4,'closed',5,99);


+-------------+-------------------------------------------+
|   STATUS    |               AVG(ELAPSED)                |
+-------------+-------------------------------------------+
| requirement |                                         0 |
| analysis    |                                        17 |
| design      |                                           |
| accepted    |                                           |
| closed      | 15.33333333333333333333333333333333333333 |
+-------------+-------------------------------------------+
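The question also asks for a per-row time_spent column, which the aggregated result above does not show. A minimal sketch of that intermediate step, assuming the same sample rows and the same LAG logic (the alias t and the final ORDER BY are illustrative additions), could look like this:

```sql
-- Sample rows copied from the question; NULL stands for a missing date.
WITH input_data (id, dte, status) AS (
  SELECT 1, TO_DATE('1/1/18','MM/DD/YY'),  'requirement' FROM DUAL UNION ALL
  SELECT 1, TO_DATE('1/8/18','MM/DD/YY'),  'analysis'    FROM DUAL UNION ALL
  SELECT 1, NULL,                          'design'      FROM DUAL UNION ALL
  SELECT 1, TO_DATE('1/30/18','MM/DD/YY'), 'closed'      FROM DUAL UNION ALL
  SELECT 2, TO_DATE('2/1/18','MM/DD/YY'),  'requirement' FROM DUAL UNION ALL
  SELECT 2, TO_DATE('2/18/18','MM/DD/YY'), 'closed'      FROM DUAL UNION ALL
  SELECT 3, TO_DATE('1/2/18','MM/DD/YY'),  'requirement' FROM DUAL UNION ALL
  SELECT 3, TO_DATE('1/29/18','MM/DD/YY'), 'analysis'    FROM DUAL UNION ALL
  SELECT 3, NULL,                          'accepted'    FROM DUAL UNION ALL
  SELECT 3, TO_DATE('2/5/18','MM/DD/YY'),  'closed'      FROM DUAL
)
SELECT t.id,
       t.dte,
       t.status,
       -- Days since the previous dated status of the same id.
       -- NVL(..., t.dte) turns the first status of each id into 0.
       -- Rows with a NULL date sort last and produce a NULL time_spent.
       t.dte - NVL(LAG(t.dte IGNORE NULLS)
                     OVER (PARTITION BY t.id ORDER BY t.dte), t.dte) AS time_spent
FROM   input_data t
ORDER  BY t.id, t.dte;
```

With these rows, the closed row of id 3 comes out as 7 days (29 January to 5 February), not the 24 shown in the question's expected output. That is why the average for closed is (22 + 17 + 7) / 3 = 15.33 rather than 21.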

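As I said in my comment, this logic calculates the days elapsed from the previous status to the given status. Because "requirement" has no previous status, it will always show 0 days spent in requirement. It might be better to calculate the time from the given status to the next status. For "closed" there is no next status, so you can either leave it blank or use SYSDATE as the date of the next status. Here is an example: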

with input_data (id, dte, status) as (
SELECT 1, TO_DATE('1/1/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 1, TO_DATE('1/8/18','MM/DD/YY'), 'analysis' FROM DUAL UNION ALL
SELECT 1, NULL, 'design' FROM DUAL UNION ALL
SELECT 1, TO_DATE('1/30/18','MM/DD/YY'), 'closed' FROM DUAL UNION ALL
SELECT 2, TO_DATE('2/1/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 2, TO_DATE('2/18/18','MM/DD/YY'), 'closed' FROM DUAL UNION ALL
SELECT 3, TO_DATE('1/2/18','MM/DD/YY'), 'requirement' FROM DUAL UNION ALL
SELECT 3, TO_DATE('1/29/18','MM/DD/YY'), 'analysis' FROM DUAL UNION ALL
SELECT 3, NULL, 'accepted' FROM DUAL UNION ALL
SELECT 3, TO_DATE('2/5/18','MM/DD/YY'), 'closed' FROM DUAL ),
----- Solution begins here
data_with_elapsed_days as (
SELECT id.*, nvl(lead(dte ignore nulls) over ( partition by id order by dte ), trunc(sysdate))-dte elapsed
from input_data id)
SELECT status, avg(elapsed)
FROM data_with_elapsed_days d
group by status
order by decode(status,'requirement',1,'analysis',2,'design',3,'accepted',4,'closed',5,99);



+-------------+------------------------------------------+
|   STATUS    |               AVG(ELAPSED)               |
+-------------+------------------------------------------+
| requirement |                                       17 |
| analysis    |                                     14.5 |
| design      |                                          |
| accepted    |                                          |
| closed      | 361.666666666666666666666666666666666667 |
+-------------+------------------------------------------+

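0 votes

I agree with @MatthewMcPeak. Your requirement seems a bit off: you spend zero days in the requirement stage but an average of 21 days in closed?

This solution treats the given date as the start date of the stage and calculates the difference between it and the start date of the next stage.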

with cte as (
    select status
           , lead(dd ignore nulls) over (partition by id order by dd) - dd as dt_diff
    from your_table)
select status, avg(dt_diff) as avg_ela
from cte
group by status
/
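The query above reads from a placeholder your_table with a date column dd. A self-contained sketch, assuming the question's sample rows are supplied through a CTE with those names, might look like the following. The COUNT column and the ORDER BY are additions for illustration and are not part of the original answer.

```sql
-- your_table and dd follow the answer's placeholder names; the rows come from the question.
WITH your_table (id, dd, status) AS (
  SELECT 1, TO_DATE('1/1/18','MM/DD/YY'),  'requirement' FROM DUAL UNION ALL
  SELECT 1, TO_DATE('1/8/18','MM/DD/YY'),  'analysis'    FROM DUAL UNION ALL
  SELECT 1, NULL,                          'design'      FROM DUAL UNION ALL
  SELECT 1, TO_DATE('1/30/18','MM/DD/YY'), 'closed'      FROM DUAL UNION ALL
  SELECT 2, TO_DATE('2/1/18','MM/DD/YY'),  'requirement' FROM DUAL UNION ALL
  SELECT 2, TO_DATE('2/18/18','MM/DD/YY'), 'closed'      FROM DUAL UNION ALL
  SELECT 3, TO_DATE('1/2/18','MM/DD/YY'),  'requirement' FROM DUAL UNION ALL
  SELECT 3, TO_DATE('1/29/18','MM/DD/YY'), 'analysis'    FROM DUAL UNION ALL
  SELECT 3, NULL,                          'accepted'    FROM DUAL UNION ALL
  SELECT 3, TO_DATE('2/5/18','MM/DD/YY'),  'closed'      FROM DUAL
),
cte AS (
  SELECT status,
         -- Days from this status's date to the next dated status of the same id.
         LEAD(dd IGNORE NULLS) OVER (PARTITION BY id ORDER BY dd) - dd AS dt_diff
  FROM   your_table
)
SELECT status,
       AVG(dt_diff)   AS avg_ela,
       COUNT(dt_diff) AS intervals_used  -- COUNT skips NULL differences
FROM   cte
GROUP  BY status
ORDER  BY DECODE(status, 'requirement', 1, 'analysis', 2, 'design', 3, 'accepted', 4, 'closed', 5, 99);
```

With the sample rows, requirement averages 17 days over three intervals and analysis averages 14.5 days over two. The design, accepted and closed rows stay NULL because they have either no date of their own or no later dated status.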

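0 votes

If you want to include every stage for each id (column d in the setup below) and estimate the time spent in each one using linear interpolation, you can create a subquery listing all the statuses and join it to the data with a PARTITION BY outer join. Then use LAG and LEAD to find the date range each status falls in and interpolate within it:

Oracle Setup: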

CREATE TABLE data ( d, dt, status ) AS
SELECT 1, TO_DATE( '1/1/18', 'MM/DD/YY' ),  'requirement' FROM DUAL UNION ALL
SELECT 1, TO_DATE( '1/8/18', 'MM/DD/YY' ),  'analysis'    FROM DUAL UNION ALL
SELECT 1, NULL,                             'design'      FROM DUAL UNION ALL
SELECT 1, TO_DATE( '1/30/18', 'MM/DD/YY' ), 'closed'      FROM DUAL UNION ALL
SELECT 2, TO_DATE( '2/1/18', 'MM/DD/YY' ),  'requirement' FROM DUAL UNION ALL
SELECT 2, TO_DATE( '2/18/18', 'MM/DD/YY' ), 'closed'      FROM DUAL UNION ALL
SELECT 3, TO_DATE( '1/2/18', 'MM/DD/YY' ),  'requirement' FROM DUAL UNION ALL
SELECT 3, TO_DATE( '1/29/18', 'MM/DD/YY' ), 'analysis'    FROM DUAL UNION ALL
SELECT 3, NULL,                             'accepted'    FROM DUAL UNION ALL
SELECT 3, TO_DATE( '2/5/18', 'MM/DD/YY' ),  'closed'      FROM DUAL;

Query:

WITH statuses ( status, id ) AS (
  SELECT 'requirement', 1 FROM DUAL UNION ALL
  SELECT 'analysis',    2 FROM DUAL UNION ALL
  SELECT 'design',      3 FROM DUAL UNION ALL
  SELECT 'accepted',    4 FROM DUAL UNION ALL
  SELECT 'closed',      5 FROM DUAL
),
ranges ( d, dt, status, id, recent_dt, recent_id, next_dt, next_id ) AS (
  SELECT d.d,
         d.dt,
         s.status,
         s.id,
         NVL(
           d.dt,
           LAG( d.dt, 1 )
             IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
         ),
         NVL2(
           d.dt,
           s.id,
           LAG( CASE WHEN d.dt IS NOT NULL THEN s.id END, 1 )
             IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
         ),
         LEAD( d.dt, 1, d.dt )
           IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id ),
         LEAD( CASE WHEN d.dt IS NOT NULL THEN s.id END, 1, s.id + 1 )
           IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
  FROM   data d
         PARTITION BY ( d )
         RIGHT OUTER JOIN statuses s
         ON ( d.status = s.status )
)
SELECT d,
       dt,
       status,
       ( next_dt - recent_dt ) / (next_id - recent_id ) AS estimated_duration
FROM   ranges;

Output:

 D | DT        | STATUS      |                       ESTIMATED_DURATION
-: | :-------- | :---------- | ---------------------------------------:
 1 | 01-JAN-18 | requirement |                                        7
 1 | 08-JAN-18 | analysis    | 7.33333333333333333333333333333333333333
 1 | null      | design      | 7.33333333333333333333333333333333333333
 1 | null      | accepted    | 7.33333333333333333333333333333333333333
 1 | 30-JAN-18 | closed      |                                        0
 2 | 01-FEB-18 | requirement |                                     4.25
 2 | null      | analysis    |                                     4.25
 2 | null      | design      |                                     4.25
 2 | null      | accepted    |                                     4.25
 2 | 18-FEB-18 | closed      |                                        0
 3 | 02-JAN-18 | requirement |                                       27
 3 | 29-JAN-18 | analysis    | 2.33333333333333333333333333333333333333
 3 | null      | design      | 2.33333333333333333333333333333333333333
 3 | null      | accepted    | 2.33333333333333333333333333333333333333
 3 | 05-FEB-18 | closed      |                                        0
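To see where the fractional values in this output come from: each status between two dated statuses receives an equal share of the gap, that is (next_dt - recent_dt) / (next_id - recent_id), where the ids are the positions 1 to 5 from the statuses subquery. A small sketch that reproduces three of those shares by hand (the column aliases are illustrative):

```sql
SELECT -- id 1: analysis (position 2, 8 Jan) to closed (position 5, 30 Jan)
       ( TO_DATE('1/30/18','MM/DD/YY') - TO_DATE('1/8/18','MM/DD/YY') )  / ( 5 - 2 ) AS id1_share,
       -- id 2: requirement (position 1, 1 Feb) to closed (position 5, 18 Feb)
       ( TO_DATE('2/18/18','MM/DD/YY') - TO_DATE('2/1/18','MM/DD/YY') )  / ( 5 - 1 ) AS id2_share,
       -- id 3: analysis (position 2, 29 Jan) to closed (position 5, 5 Feb)
       ( TO_DATE('2/5/18','MM/DD/YY')  - TO_DATE('1/29/18','MM/DD/YY') ) / ( 5 - 2 ) AS id3_share
FROM   DUAL;
```

These evaluate to 22/3, 17/4 and 7/3 days, matching the 7.33, 4.25 and 2.33 values in the output. The closed rows show 0 because the LEAD defaults make next_dt equal to the row's own date and next_id equal to its position plus one.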

Query 2:

You can then easily change that to take the average for each status:

WITH statuses ( status, id ) AS (
  SELECT 'requirement', 1 FROM DUAL UNION ALL
  SELECT 'analysis',    2 FROM DUAL UNION ALL
  SELECT 'design',      3 FROM DUAL UNION ALL
  SELECT 'accepted',    4 FROM DUAL UNION ALL
  SELECT 'closed',      5 FROM DUAL
),
ranges ( d, dt, status, id, recent_dt, recent_id, next_dt, next_id ) AS (
  SELECT d.d,
         d.dt,
         s.status,
         s.id,
         NVL(
           d.dt,
           LAG( d.dt, 1 )
             IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
         ),
         NVL2(
           d.dt,
           s.id,
           LAG( CASE WHEN d.dt IS NOT NULL THEN s.id END, 1 )
             IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
         ),
         LEAD( d.dt, 1, d.dt )
           IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id ),
         LEAD( CASE WHEN d.dt IS NOT NULL THEN s.id END, 1, s.id + 1 )
           IGNORE NULLS OVER ( PARTITION BY d.d ORDER BY s.id )
  FROM   data d
         PARTITION BY ( d )
         RIGHT OUTER JOIN statuses s
         ON ( d.status = s.status )
)
SELECT status,
       AVG( ( next_dt - recent_dt ) / (next_id - recent_id ) ) AS estimated_duration
FROM   ranges
GROUP BY status, id
ORDER BY id;

Results:

STATUS      |                       ESTIMATED_DURATION
:---------- | ---------------------------------------:
requirement |                                    12.75
analysis    | 4.63888888888888888888888888888888888889
design      | 4.63888888888888888888888888888888888889
accepted    | 4.63888888888888888888888888888888888889
closed      |                                        0

