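I have a dataset in BigQuery with some float columns; let's call them amount_1, amount_2, etc. I also have a day column, which is a timestamp holding a date, with exactly 1 row per day, meaning the data is already grouped.
For each day of the month, I want to compute SUM(amount_1) starting from the last day of the previous month and looking back 120 days. For example, for day = '2024-04-05 00:00:00 UTC', I want the sum of amount_1 between the dates '2023-12-03' and '2024-03-31'. This means the sum should be the same for every day within a given month.
I tried this, but it did not work: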
WITH LastDayPreviousMonth AS (
  SELECT
    day,
    LAST_DAY(DATE_SUB(CAST(TIMESTAMP_TRUNC(day, MONTH) AS DATE), INTERVAL 1 MONTH)) AS last_day_previous_month
  FROM
    main_table
)
SELECT
  M.day,
  -- 10368000 seconds = 120 days
  SUM(amount_1) OVER (
    ORDER BY UNIX_SECONDS(CAST(last_day_previous_month AS TIMESTAMP))
    RANGE BETWEEN 10368000 PRECEDING AND 0 PRECEDING
  ) AS amount_sum
FROM
  main_table AS M
  INNER JOIN LastDayPreviousMonth AS L ON L.day = M.day
That doesn't work. When I tried to debug it, I asked for MIN(day) to see which window I was actually searching, like this:
WITH LastDayPreviousMonth AS (
  SELECT
    day,
    LAST_DAY(DATE_SUB(CAST(TIMESTAMP_TRUNC(day, MONTH) AS DATE), INTERVAL 1 MONTH)) AS last_day_previous_month
  FROM
    main_table
)
SELECT
  M.day,
  SUM(amount_1) OVER (
    ORDER BY UNIX_SECONDS(CAST(last_day_previous_month AS TIMESTAMP))
    RANGE BETWEEN 10368000 PRECEDING AND 0 PRECEDING
  ) AS amount_sum,
  -- Debugging aid: earliest day that falls inside the same window
  MIN(M.day) OVER (
    ORDER BY UNIX_SECONDS(CAST(last_day_previous_month AS TIMESTAMP))
    RANGE BETWEEN 10368000 PRECEDING AND 0 PRECEDING
  ) AS min_day
FROM
  main_table AS M
  INNER JOIN LastDayPreviousMonth AS L ON L.day = M.day
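I expected min_day to be '2023-12-03' for the row with day = '2024-04-05 00:00:00 UTC', but it is always '2024-01-01 00:00:00 UTC' for every day in April, '2023-12-01 00:00:00 UTC' for every day in March, and so on. Can anyone point out what I am doing wrong?
Edit: when I say it "doesn't work", I mean that the window the sum runs over is not the window I want. I verified this by doing the calculation by hand in Excel, and I added the MIN function to confirm that I am indeed not looking at the right window.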
I guess the join expands the dataset. Use group by 1 so that there is only one entry per day.
You want to go back 120 days, which is why you used 10368000 seconds.
WITH
main_table AS (
  -- Sample data: every day appears twice (test = 1 and 2) to mimic duplicated rows
  SELECT 1 AS amount_1, *
  FROM UNNEST(GENERATE_DATE_ARRAY("2023-01-01", CURRENT_DATE())) AS day,
       UNNEST([1, 2]) AS test
),
LastDayPreviousMonth AS (
  SELECT
    day,
    LAST_DAY(DATE_SUB(CAST(TIMESTAMP_TRUNC(day, MONTH) AS DATE), INTERVAL 1 MONTH)) AS last_day_previous_month
  FROM
    main_table
  GROUP BY 1  -- to have only one row for each day
)
SELECT
  M.day,
  last_day_previous_month,
  SUM(amount_1) OVER (
    ORDER BY UNIX_SECONDS(CAST(last_day_previous_month AS TIMESTAMP))
    RANGE BETWEEN 10368000 PRECEDING AND 0 PRECEDING
  ) AS amount_sum,
  MIN(M.day) OVER (
    ORDER BY UNIX_SECONDS(CAST(last_day_previous_month AS TIMESTAMP))
    RANGE BETWEEN 10368000 PRECEDING AND 0 PRECEDING
  ) AS min_day
FROM
  main_table AS M
  INNER JOIN LastDayPreviousMonth AS L ON L.day = M.day
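As a quick check on the 10368000 figure, the arithmetic can be run directly. This is only a small sketch using the anchor date from the question's example; the column aliases are made up for illustration. It suggests that an inclusive 120-day offset reaches back to 2023-12-02, one day earlier than the 2023-12-03 start date in the question, so a 119-day offset may be what is actually intended.

```sql
SELECT
  120 * 24 * 60 * 60 AS seconds_in_120_days,                       -- 10368000
  DATE_SUB(DATE '2024-03-31', INTERVAL 120 DAY) AS back_120_days,  -- 2023-12-02
  DATE_SUB(DATE '2024-03-31', INTERVAL 119 DAY) AS back_119_days   -- 2023-12-03
```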
The join isn't needed:
WITH
main_table AS (
  -- Sample data: every day appears twice (test = 1 and 2)
  SELECT 1 AS amount_1, *
  FROM UNNEST(GENERATE_DATE_ARRAY("2023-01-01", CURRENT_DATE())) AS day,
       UNNEST([1, 2]) AS test
),
LastDayPreviousMonth AS (
  SELECT
    *,  -- previously: day
    LAST_DAY(DATE_SUB(CAST(TIMESTAMP_TRUNC(day, MONTH) AS DATE), INTERVAL 1 MONTH)) AS last_day_previous_month
  FROM
    main_table
  -- GROUP BY 1  -- would give only one row for each day
)
SELECT
  M.day,
  last_day_previous_month,
  SUM(amount_1) OVER (
    ORDER BY UNIX_SECONDS(CAST(last_day_previous_month AS TIMESTAMP))
    RANGE BETWEEN 10368000 PRECEDING AND 0 PRECEDING
  ) AS amount_sum,
  MIN(M.day) OVER (
    ORDER BY UNIX_SECONDS(CAST(last_day_previous_month AS TIMESTAMP))
    RANGE BETWEEN 10368000 PRECEDING AND 0 PRECEDING
  ) AS min_day
FROM
  LastDayPreviousMonth AS M
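One more thing that may be worth checking: a RANGE frame is measured against the ORDER BY expression, which here is last_day_previous_month rather than day. Every row of a month shares the same key, so the frame picks up whole months whose key falls within the 120 days, which would be consistent with the min_day values reported in the question. A minimal sketch of an alternative, assuming one row per day and the question's main_table, day and amount_1 columns, is to join each day to the rows inside its own date range and aggregate. The sample data, the anchors CTE name and the 119-day offset (120 calendar days, inclusive) are assumptions for illustration, not part of the original query.

```sql
WITH main_table AS (
  -- Hypothetical sample data: one row per day, amount_1 = 1
  SELECT TIMESTAMP(d) AS day, 1.0 AS amount_1
  FROM UNNEST(GENERATE_DATE_ARRAY(DATE '2023-01-01', DATE '2024-04-30')) AS d
),
anchors AS (
  -- Last day of the previous month for every row
  SELECT
    day,
    DATE_SUB(DATE_TRUNC(DATE(day), MONTH), INTERVAL 1 DAY) AS last_day_previous_month
  FROM main_table
)
SELECT
  a.day,
  a.last_day_previous_month,
  SUM(m.amount_1) AS amount_sum,
  MIN(m.day) AS min_day
FROM anchors AS a
INNER JOIN main_table AS m
  -- Rows whose date lies in the 120 calendar days ending at the anchor
  ON DATE(m.day) BETWEEN DATE_SUB(a.last_day_previous_month, INTERVAL 119 DAY)
                     AND a.last_day_previous_month
GROUP BY a.day, a.last_day_previous_month
ORDER BY a.day
```

With this sample data, the rows for April 2024 should show min_day on 2023-12-03 and amount_sum = 120. Days whose range contains no data at all are dropped by the inner join, and because every day is compared against a range of days, this may be costly on large tables.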