I want to include zero counts in my daily, weekly, and monthly aggregations.
My date table looks like this:
date_range AS (
SELECT DATE_SUB(CURRENT_DATE(), INTERVAL OFFSET DAY) AS date
FROM UNNEST(GENERATE_ARRAY(0, 27)) AS OFFSET
)
My daily aggregation code is:
day_cte AS (
SELECT
date_range.date AS event_date,
userbase.user_pseudo_id,
COUNT(events.event_date) AS num_of_sessions
FROM
date_range
CROSS JOIN
userbase
LEFT JOIN
`app.analytics_317927526.events_*` AS events
ON
DATE(PARSE_DATE('%Y%m%d', events.event_date)) = date_range.date
AND events.event_name = 'session_start'
AND events.user_pseudo_id = userbase.user_pseudo_id
GROUP BY
date_range.date, userbase.user_pseudo_id
)
My weekly aggregation code is:
week_cte as
(
select userbase.user_pseudo_id,DATE_TRUNC(date_range.date, week) as event_week ,count(*) as num_of_sessions
FROM
date_range
CROSS JOIN
userbase
LEFT JOIN
`app.analytics_317927526.events_*` AS events
ON
DATE(PARSE_DATE('%Y%m%d', events.event_date)) = date_range.date
AND events.event_name = 'session_start'
AND events.user_pseudo_id = userbase.user_pseudo_id
group by 1,2
),
My monthly aggregation code is:
month_cte as
(
select userbase.user_pseudo_id,DATE_TRUNC(date_range.date, month) as event_month ,count(*) as num_of_sessions
FROM
date_range
CROSS JOIN
userbase
LEFT JOIN
`app.analytics_317927526.events_*` AS events
ON
DATE(PARSE_DATE('%Y%m%d', events.event_date)) = date_range.date
AND events.event_name = 'session_start'
AND events.user_pseudo_id = userbase.user_pseudo_id
group by 1,2
),
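I just want to confirm that I am doing this correctly, because the weekly and monthly aggregations seem to produce unexpected results.
The daily results look reasonable.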
Your daily query is correct. The weekly and monthly queries differ from it in one way that matters: they use count(*) where the daily query uses COUNT(events.event_date). The LEFT JOIN keeps every (date, user) pair produced by the CROSS JOIN and fills the events columns with NULL when nothing matches. count(*) counts that NULL-filled row too, so every day without a session contributes 1 instead of 0, and a user with no sessions at all shows up with 7 for a full week. COUNT(events.event_date) skips NULLs, so unmatched days contribute 0.
To get correct zero counts, keep the join at day level exactly as in the daily query, count events.event_date, and truncate the date only in the SELECT and GROUP BY. Do not join on the truncated week or month: that would match each event to every date_range row in the same period and inflate the counts by the number of days. Here are the adjusted weekly and monthly queries:
#sql
week_cte AS (
SELECT
userbase.user_pseudo_id,
DATE_TRUNC(date_range.date, WEEK) AS event_week,
-- COUNT(column) ignores the NULLs that come from unmatched LEFT JOIN rows
COUNT(events.event_date) AS num_of_sessions
FROM
date_range
CROSS JOIN
userbase
LEFT JOIN
`app.analytics_317927526.events_*` AS events
ON
-- join per day, as in day_cte; PARSE_DATE already returns a DATE
PARSE_DATE('%Y%m%d', events.event_date) = date_range.date
AND events.event_name = 'session_start'
AND events.user_pseudo_id = userbase.user_pseudo_id
GROUP BY
1, 2
),
month_cte AS (
SELECT
userbase.user_pseudo_id,
DATE_TRUNC(date_range.date, MONTH) AS event_month,
COUNT(events.event_date) AS num_of_sessions
FROM
date_range
CROSS JOIN
userbase
LEFT JOIN
`app.analytics_317927526.events_*` AS events
ON
PARSE_DATE('%Y%m%d', events.event_date) = date_range.date
AND events.event_name = 'session_start'
AND events.user_pseudo_id = userbase.user_pseudo_id
GROUP BY
1, 2
)
With these changes, a user with no sessions in a given week or month appears with a count of 0. Two things to keep in mind when reading the output: date_range covers only the last 28 days, so the first and last week (and month) are usually partial, and DATE_TRUNC(..., WEEK) starts weeks on Sunday; use WEEK(MONDAY) if you want Monday-based weeks.
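The difference between count(*) and COUNT(column) over a LEFT JOIN can be checked on a tiny inline dataset, without touching the events table. The sketch below is a standalone BigQuery query; the names days, sessions, and session_date and the dates themselves are made up for illustration and are not part of the original schema.

```sql
-- Three calendar days, only one of which has a session.
WITH days AS (
  SELECT d
  FROM UNNEST([DATE '2024-01-01', DATE '2024-01-02', DATE '2024-01-03']) AS d
),
sessions AS (
  SELECT DATE '2024-01-02' AS session_date
)
SELECT
  COUNT(*) AS count_star,                      -- counts all joined rows, including NULL-filled ones
  COUNT(sessions.session_date) AS count_column -- counts only rows where a session matched
FROM days
LEFT JOIN sessions
  ON sessions.session_date = days.d;
-- count_star = 3, count_column = 1
```

The two unmatched days still produce rows, which is what you want for zero counts, but only COUNT(sessions.session_date) leaves them out of the total.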