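I have a table with the following information, and I am using Google BigQuery. I want to run a few calculations grouped by Person_ID: the number of days between Middle and Initial, between End and Initial, and between Close and Initial.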
|Person_ID|Action |Date |
|100 |Initial|22/12/2018 |
|100 |Middle |23/12/2018 |
|100 |End |29/12/2018 |
|100 |Close |31/12/2018 |
|150 |Initial|02/01/2019 |
|150 |Middle |04/01/2019 |
|150 |End |07/01/2019 |
|150 |Close |10/01/2019 |
I would like the result to look like this:
|Person_ID|Middle_Minus_initial|End_Minus_initial|Close_Minus_initial|
|100 | 1 | 7 | 9 |
|150 | 2 | 5 | 8 |
I really don't know how to go about this, as I am very much a beginner when it comes to SQL, so any help would be greatly appreciated. Thanks.
Below is for BigQuery Standard SQL
#standardSQL
SELECT Person_ID,
  DATE_DIFF(Middle, Initial, DAY) AS Middle_Minus_initial,
  DATE_DIFF(`End`, Initial, DAY) AS End_Minus_initial,
  DATE_DIFF(Close, Initial, DAY) AS Close_Minus_initial
FROM (
  SELECT Person_ID,
    PARSE_DATE('%d/%m/%Y', MAX(IF(Action = 'Initial', `Date`, NULL))) AS Initial,
    PARSE_DATE('%d/%m/%Y', MAX(IF(Action = 'Middle', `Date`, NULL))) AS Middle,
    PARSE_DATE('%d/%m/%Y', MAX(IF(Action = 'End', `Date`, NULL))) AS `End`,
    PARSE_DATE('%d/%m/%Y', MAX(IF(Action = 'Close', `Date`, NULL))) AS Close
  FROM `project.dataset.table`
  GROUP BY Person_ID
)
You can test and play with the above using the sample data from your question, as in the example below
#standardSQL
WITH `project.dataset.table` AS (
  SELECT 100 Person_ID, 'Initial' Action, '22/12/2018' `Date` UNION ALL
  SELECT 100, 'Middle', '23/12/2018' UNION ALL
  SELECT 100, 'End', '29/12/2018' UNION ALL
  SELECT 100, 'Close', '31/12/2018' UNION ALL
  SELECT 150, 'Initial', '02/01/2019' UNION ALL
  SELECT 150, 'Middle', '04/01/2019' UNION ALL
  SELECT 150, 'End', '07/01/2019' UNION ALL
  SELECT 150, 'Close', '10/01/2019'
)
SELECT Person_ID,
  DATE_DIFF(Middle, Initial, DAY) AS Middle_Minus_initial,
  DATE_DIFF(`End`, Initial, DAY) AS End_Minus_initial,
  DATE_DIFF(Close, Initial, DAY) AS Close_Minus_initial
FROM (
  SELECT Person_ID,
    PARSE_DATE('%d/%m/%Y', MAX(IF(Action = 'Initial', `Date`, NULL))) AS Initial,
    PARSE_DATE('%d/%m/%Y', MAX(IF(Action = 'Middle', `Date`, NULL))) AS Middle,
    PARSE_DATE('%d/%m/%Y', MAX(IF(Action = 'End', `Date`, NULL))) AS `End`,
    PARSE_DATE('%d/%m/%Y', MAX(IF(Action = 'Close', `Date`, NULL))) AS Close
  FROM `project.dataset.table`
  GROUP BY Person_ID
)
-- ORDER BY Person_ID
with result
|Row|Person_ID|Middle_Minus_initial|End_Minus_initial|Close_Minus_initial|
|1 |100 | 1 | 7 | 9 |
|2 |150 | 2 | 5 | 8 |
One method is conditional aggregation:
select person_id,
date_diff(max(case when action = 'Middle' then date end),
max(case when action = 'Initial' then date end),
day) as middle_minus_initial,
date_diff(max(case when action = 'End' then date end),
max(case when action = 'Initial' then date end),
day) as end_minus_initial,
date_diff(max(case when action = 'Close' then date end),
max(case when action = 'Initial' then date end),
day) as close_minus_initial
from t
group by person_id;
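The query above appears to assume that `date` is already a DATE column. The sample in the question looks like dd/mm/yyyy strings, so here is a minimal sketch of the same conditional aggregation under that assumption. The inline CTE `t` and the helper CTE `parsed` (with its column `d`) are stand-ins for illustration, not names from the original answer.

```sql
#standardSQL
-- Assumption: `date` is stored as a STRING in dd/mm/yyyy format.
WITH t AS (
  SELECT 100 AS person_id, 'Initial' AS action, '22/12/2018' AS `date` UNION ALL
  SELECT 100, 'Middle',  '23/12/2018' UNION ALL
  SELECT 100, 'End',     '29/12/2018' UNION ALL
  SELECT 100, 'Close',   '31/12/2018' UNION ALL
  SELECT 150, 'Initial', '02/01/2019' UNION ALL
  SELECT 150, 'Middle',  '04/01/2019' UNION ALL
  SELECT 150, 'End',     '07/01/2019' UNION ALL
  SELECT 150, 'Close',   '10/01/2019'
),
parsed AS (
  -- Convert the strings to DATE values before aggregating.
  SELECT person_id, action, PARSE_DATE('%d/%m/%Y', `date`) AS d
  FROM t
)
SELECT person_id,
  DATE_DIFF(MAX(CASE WHEN action = 'Middle' THEN d END),
            MAX(CASE WHEN action = 'Initial' THEN d END), DAY) AS middle_minus_initial,
  DATE_DIFF(MAX(CASE WHEN action = 'End' THEN d END),
            MAX(CASE WHEN action = 'Initial' THEN d END), DAY) AS end_minus_initial,
  DATE_DIFF(MAX(CASE WHEN action = 'Close' THEN d END),
            MAX(CASE WHEN action = 'Initial' THEN d END), DAY) AS close_minus_initial
FROM parsed
GROUP BY person_id
ORDER BY person_id;
```

Parsing before aggregating means MAX compares real dates rather than strings, which matters if a person ever has more than one row for the same action, since dd/mm/yyyy strings do not sort chronologically.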
Another option, which avoids aggregation, is to join several subqueries, such as:
SELECT
  t.Person_ID,
  -- DATE_DIFF expects DATE arguments; parse string dates with PARSE_DATE first
  DATE_DIFF(tm.date, ti.date, DAY) AS Middle_Minus_initial,
  DATE_DIFF(te.date, ti.date, DAY) AS End_Minus_initial,
  DATE_DIFF(tc.date, ti.date, DAY) AS Close_Minus_initial
FROM
  (SELECT DISTINCT Person_ID FROM mytable) t
  LEFT JOIN mytable ti ON ti.Person_ID = t.Person_ID AND ti.Action = 'Initial'
  LEFT JOIN mytable tm ON tm.Person_ID = t.Person_ID AND tm.Action = 'Middle'
  LEFT JOIN mytable te ON te.Person_ID = t.Person_ID AND te.Action = 'End'
  LEFT JOIN mytable tc ON tc.Person_ID = t.Person_ID AND tc.Action = 'Close'
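To try the join approach against the question's sample, something like the following sketch may help. It assumes BigQuery's `DATE_DIFF`, writes the sample dates as DATE literals, and uses a hypothetical column name `Action_Date`. The CTE `mytable` is only a stand-in for the real table.

```sql
#standardSQL
-- Assumption: dates are real DATE values; Action_Date is a hypothetical column name.
WITH mytable AS (
  SELECT 100 AS Person_ID, 'Initial' AS Action, DATE '2018-12-22' AS Action_Date UNION ALL
  SELECT 100, 'Middle',  DATE '2018-12-23' UNION ALL
  SELECT 100, 'End',     DATE '2018-12-29' UNION ALL
  SELECT 100, 'Close',   DATE '2018-12-31' UNION ALL
  SELECT 150, 'Initial', DATE '2019-01-02' UNION ALL
  SELECT 150, 'Middle',  DATE '2019-01-04' UNION ALL
  SELECT 150, 'End',     DATE '2019-01-07' UNION ALL
  SELECT 150, 'Close',   DATE '2019-01-10'
)
SELECT
  t.Person_ID,
  DATE_DIFF(tm.Action_Date, ti.Action_Date, DAY) AS Middle_Minus_initial,
  DATE_DIFF(te.Action_Date, ti.Action_Date, DAY) AS End_Minus_initial,
  DATE_DIFF(tc.Action_Date, ti.Action_Date, DAY) AS Close_Minus_initial
FROM
  (SELECT DISTINCT Person_ID FROM mytable) AS t
  -- One self-join per action; LEFT JOIN keeps people who lack an action.
  LEFT JOIN mytable AS ti ON ti.Person_ID = t.Person_ID AND ti.Action = 'Initial'
  LEFT JOIN mytable AS tm ON tm.Person_ID = t.Person_ID AND tm.Action = 'Middle'
  LEFT JOIN mytable AS te ON te.Person_ID = t.Person_ID AND te.Action = 'End'
  LEFT JOIN mytable AS tc ON tc.Person_ID = t.Person_ID AND tc.Action = 'Close'
ORDER BY t.Person_ID;
```

This relies on each person having at most one row per action. With duplicates the joins would multiply rows, and a missing action yields NULL in the corresponding difference.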