How can I unnest hours in BigQuery and categorize them?


I am trying to generate an array from start_date_time and end_date_time and extract the hour from each element. For example, for 2024-03-09 12:00:18.000000 UTC (start_date_time) and 2024-03-09 15:00:18.000000 UTC (end_date_time), the hours should be 12, 13, 14, 15.

(There will also be different segments and event_type values.)

Now I want to group these hours and count them. Here is my sample data:

[Image: sample data]

My output should look like this:

[Image: output from sample data]

I tried the following query, but it did not give the desired result.

with q1 as (
Select segment, event_type,hours from
`id.dataset.my_tab`,
unnest(generate_timestamp_array(end_date_time,start_date_time, interval 1 hour)) as hours
),
q2 as (
select segment, event_type,
EXTRACT(HOUR FROM hours) as hours_category from q1
)
Select segment, event_type, hours_category,
count(hour_category) as count_hours
from q2
Group by hour_category, event_type,segment

Tags: google-bigquery, unnest
1 Answer
WITH data AS (
  -- Sample rows; note that end_date_time is listed before start_date_time.
  SELECT 'BB00-57C9522F14A9' AS segment, 'Depo' AS event_type, TIMESTAMP '2024-03-09 15:00:18.000000 UTC' AS end_date_time, TIMESTAMP '2024-03-09 12:00:18.000000 UTC' AS start_date_time UNION ALL
  SELECT 'BB00-57C9522F14A9', 'Depo', TIMESTAMP '2024-03-09 17:07:35.000000 UTC', TIMESTAMP '2024-03-09 12:00:18.000000 UTC' UNION ALL
  SELECT 'BB00-57C9522F14A9', 'Depo', TIMESTAMP '2024-03-06 12:20:28.000000 UTC', TIMESTAMP '2024-03-06 06:10:11.000000 UTC'
),
GeneratedHours AS (
  -- For each row, build an array holding one hour-of-day value per hourly
  -- step from start_date_time up to end_date_time.
  SELECT
    segment,
    event_type,
    ARRAY(
      SELECT AS STRUCT EXTRACT(HOUR FROM TIMESTAMP_ADD(start_date_time, INTERVAL hour HOUR)) AS hour
      FROM UNNEST(GENERATE_ARRAY(0, TIMESTAMP_DIFF(end_date_time, start_date_time, HOUR))) AS hour
    ) AS hours
  FROM data
)

SELECT
  segment,
  event_type,
  hour AS hours_category,
  COUNT(*) AS count_hours
FROM (
  -- Flatten each hours array into one row per hour.
  SELECT
    segment,
    event_type,
    hour.hour AS hour
  FROM GeneratedHours
  CROSS JOIN UNNEST(hours) AS hour
)
GROUP BY
  segment,
  event_type,
  hours_category
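
A possible alternative, offered as a sketch rather than a tested fix: the query in the question appears to fail for two reasons. GENERATE_TIMESTAMP_ARRAY takes the start timestamp first and the end timestamp second, so passing end_date_time first with a positive step yields an empty array; and the outer query refers to hour_category while the CTE names the column hours_category. The version below keeps the question's approach but truncates the start to the hour. That truncation is an assumption about the desired bucketing, so that a range such as 12:50 to 13:10 counts both hour 12 and hour 13. A single inline sample row stands in for `id.dataset.my_tab`, and the names hour_ts and expanded are illustrative only.

```sql
-- Sketch only: one inline sample row replaces `id.dataset.my_tab`.
WITH data AS (
  SELECT
    'BB00-57C9522F14A9' AS segment,
    'Depo' AS event_type,
    TIMESTAMP '2024-03-09 12:00:18 UTC' AS start_date_time,
    TIMESTAMP '2024-03-09 15:00:18 UTC' AS end_date_time
),
expanded AS (
  SELECT
    segment,
    event_type,
    EXTRACT(HOUR FROM hour_ts) AS hours_category
  FROM data,
    UNNEST(GENERATE_TIMESTAMP_ARRAY(
      TIMESTAMP_TRUNC(start_date_time, HOUR),  -- start first (assumed: bucket by clock hour)
      end_date_time,                           -- end second
      INTERVAL 1 HOUR)) AS hour_ts
)
SELECT
  segment,
  event_type,
  hours_category,
  COUNT(*) AS count_hours
FROM expanded
GROUP BY segment, event_type, hours_category
ORDER BY segment, event_type, hours_category
```

For the sample row this should produce one row each for hours 12, 13, 14 and 15. EXTRACT(HOUR ...) on a TIMESTAMP uses UTC unless a time zone is given with AT TIME ZONE.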
