I have a BigQuery table like this:
+-------+------------+----------+-------+
| Name  | date       | order_id | value |
+-------+------------+----------+-------+
| JONES | 2019-01-03 | 11       | 10    |
| JONES | 2019-01-05 | 12       | 5     |
| JONES | 2019-06-03 | 13       | 15    |
| JONES | 2019-07-03 | 14       | 20    |
| John  | 2019-07-23 | 15       | 10    |
+-------+------------+----------+-------+
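I need to add columns holding the values of a specific order per name, where the order is identified by its position rather than by its ID (we don't know the order_id in advance and can only count the orders), for example the 3rd order. For the first order the solution looks like this, and finding the last order_id is not a big problem either: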
WITH data AS (
  SELECT "JONES" name, DATE("2019-01-03") `date`, "11" order_id, 10 value
  UNION ALL
  SELECT "JONES", DATE("2019-01-05"), "12", 5
  UNION ALL
  SELECT "JONES", DATE("2019-06-03"), "13", 15
  UNION ALL
  SELECT "JONES", DATE("2019-07-03"), "14", 20
  UNION ALL
  SELECT "John", DATE("2019-07-23"), "15", 10
)
SELECT
  name,
  ARRAY_AGG(STRUCT(`date` AS f_date) ORDER BY order_id LIMIT 1)[OFFSET(0)].*,
  MAX(order_id) AS l_id,
  MIN(order_id) AS f_id,
  SUM(value) AS total
FROM data
GROUP BY name
Output:
+------+------+------------+-----+------+
| name | f_id | f_date |l_id | total|
+------+------+------------+-----+------+
| JONES| 11 | 2019-01-03 | 14 | 50 |
| John | 15 | 2019-07-23 | 15 | 10 |
+------+------+------------+-----+------+
Is there a similar way to get the date of the last order, and the order_id and date of the 3rd order per name?
Desired output:
+------+----------+----+----+-----+-----+----------+-----------+
| name |f_date |f_id|l_id|total|3_id | 3_date | l_date |
+------+----------+----+----+-----+-----+----------+-----------+
| JONES|2019-01-03| 11 | 14 | 50 |13 |2019-06-03| 2019-07-03|
| John |2019-07-23| 15 | 15 | 10 |NULL |NULL | 2019-07-23|
+------+----------+----+----+-----+-----+----------+-----------+
I suppose something with CASE should do it, but it does not work here for aggregating non-numeric values:
(CASE WHEN order_id = l_id THEN date ELSE Null END) AS l_date
Below is for BigQuery Standard SQL:
WITH data AS (
  SELECT "JONES" name, DATE("2019-01-03") `date`, "11" order_id, 10 value UNION ALL
  SELECT "JONES", DATE("2019-01-05"), "12", 5 UNION ALL
  SELECT "JONES", DATE("2019-06-03"), "13", 15 UNION ALL
  SELECT "JONES", DATE("2019-07-03"), "14", 20 UNION ALL
  SELECT "John", DATE("2019-07-23"), "15", 10
)
SELECT
  name,
  ARRAY_AGG(STRUCT(`date` AS f_date) ORDER BY order_id LIMIT 1)[OFFSET(0)].*,
  MAX(order_id) AS l_id,
  MIN(order_id) AS f_id,
  SUM(value) AS total,
  ARRAY_AGG(STRUCT(order_id AS `_3_order`, `date` AS `_3_date`) ORDER BY order_id LIMIT 3)[SAFE_OFFSET(2)].*,
  ARRAY_AGG(STRUCT(`date` AS l_date) ORDER BY order_id DESC LIMIT 1)[OFFSET(0)].*
FROM data t
GROUP BY name
with result
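+-----+-------+------------+------+------+-------+----------+------------+------------+
| Row | name  | f_date     | l_id | f_id | total | _3_order | _3_date    | l_date     |
+-----+-------+------------+------+------+-------+----------+------------+------------+
| 1   | JONES | 2019-01-03 | 14   | 11   | 50    | 13       | 2019-06-03 | 2019-07-03 |
| 2   | John  | 2019-07-23 | 15   | 15   | 10    | null     | null       | 2019-07-23 |
+-----+-------+------------+------+------+-------+----------+------------+------------+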
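For comparison, here is a rough sketch of another way to pick the N-th order per name: number the rows with ROW_NUMBER() first, then use conditional aggregation. This is not the approach from the answer, only an alternative under the same sample data; the CTE name `ranked` and the columns `rn` and `rn_desc` are made up for illustration. It also shows that MAX over an IF (or CASE) expression accepts DATE and STRING values, so the CASE idea from the question can work once the row position is computed in a separate step.

```sql
WITH data AS (
  SELECT "JONES" AS name, DATE("2019-01-03") AS `date`, "11" AS order_id, 10 AS value UNION ALL
  SELECT "JONES", DATE("2019-01-05"), "12", 5 UNION ALL
  SELECT "JONES", DATE("2019-06-03"), "13", 15 UNION ALL
  SELECT "JONES", DATE("2019-07-03"), "14", 20 UNION ALL
  SELECT "John", DATE("2019-07-23"), "15", 10
),
ranked AS (  -- hypothetical helper CTE, not part of the original answer
  SELECT
    *,
    -- rn counts orders from the first one, rn_desc from the last one
    ROW_NUMBER() OVER (PARTITION BY name ORDER BY order_id) AS rn,
    ROW_NUMBER() OVER (PARTITION BY name ORDER BY order_id DESC) AS rn_desc
  FROM data
)
SELECT
  name,
  MAX(IF(rn = 1, `date`, NULL)) AS f_date,
  MIN(order_id) AS f_id,
  MAX(order_id) AS l_id,
  SUM(value) AS total,
  -- stays NULL when a name has fewer than 3 orders
  MAX(IF(rn = 3, order_id, NULL)) AS _3_order,
  MAX(IF(rn = 3, `date`, NULL)) AS _3_date,
  MAX(IF(rn_desc = 1, `date`, NULL)) AS l_date
FROM ranked
GROUP BY name
```

On the sample data this should return the same values as the result above. One caveat that applies to both versions: order_id is a STRING here, so ORDER BY, MIN and MAX compare it lexicographically ("9" sorts after "10"). If the real IDs are numeric strings of varying length, ordering by SAFE_CAST(order_id AS INT64), or by `date` if that reflects the order sequence, is probably closer to what is wanted; that is an assumption about the real data, not something stated in the question.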