Cumulative sum in BigQuery that resets based on its own value

Question (votes: 0, answers: 2)

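I need to compute the cumulative sum of column A and reset it once it reaches a certain threshold. In the example below, I compute the cumulative sum and reset it when it reaches 10 or when the label changes.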

Label  Value  Cumulative sum
One    1      1
One    2      3
One    4      7
One    6      6
One    3      9
Two    1      1
Two    2      3
Two    1      4

I tried the following in BigQuery:

    SUM(value) OVER (PARTITION BY label ORDER BY dummy_sequence) as cumulative_sum,

but it does not give the expected result.
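To see why the plain window sum falls short, the sketch below mimics it outside BigQuery in Python. It assumes the rows of each label are contiguous and already in sequence order; the helper name `running_sum_per_label` is hypothetical and only for illustration. The running total for label One climbs past 10 (13, 16) instead of restarting at 6.

```python
from itertools import accumulate, groupby

# Sample rows from the question as (label, value), already in sequence order.
rows = [("One", 1), ("One", 2), ("One", 4), ("One", 6), ("One", 3),
        ("Two", 1), ("Two", 2), ("Two", 1)]

def running_sum_per_label(rows):
    """Mimic SUM(value) OVER (PARTITION BY label ORDER BY sequence).

    Assumes rows of the same label are contiguous, since groupby only
    groups consecutive items.
    """
    out = []
    for label, group in groupby(rows, key=lambda r: r[0]):
        values = [v for _, v in group]
        out.extend(zip([label] * len(values), accumulate(values)))
    return out

plain = running_sum_per_label(rows)
print(plain)
```

The window function restarts only at a partition boundary (the label change), so a threshold-based reset needs logic that looks at the previous row's result.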

Any help is greatly appreciated.

google-bigquery window-functions cumulative-sum
2 Answers

0 votes

I think the BigQuery MOD function can do the job.

Something like this:

WITH dataset AS (
    SELECT 'One' as Label,  1 as Value, 1 as sequence,
    UNION ALL
    SELECT 'One' as Label, 2 as Value, 2 as sequence,
    UNION ALL
    SELECT 'One' as Label, 4 as Value, 3 as sequence,
    UNION ALL
    SELECT 'One' as Label, 6 as Value, 4 as sequence,
    UNION ALL
    SELECT 'One' as Label, 3 as Value, 5 as sequence,
    UNION ALL

    SELECT 'Two' as Label,  1 as Value, 1 as sequence,
    UNION ALL
    SELECT 'Two' as Label,  2 as Value, 2 as sequence,
    UNION ALL
    SELECT 'Two' as Label,  1 as Value, 3 as sequence,
)

SELECT
  Label,
  Value,
  MOD(SUM(value) OVER (PARTITION BY label ORDER BY sequence),10) as cumulative_sum,
FROM dataset

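This wraps the running total at 10. Note that wrapping keeps the remainder (a total of 13 becomes 3), whereas the expected output in the question restarts from the current value (6).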
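As a quick check of the MOD approach, here is a minimal Python sketch (not BigQuery) that applies the same arithmetic to the values of label One from the question. The variable names are made up for illustration.

```python
from itertools import accumulate

values_one = [1, 2, 4, 6, 3]      # label 'One' from the question
expected_one = [1, 3, 7, 6, 9]    # cumulative sum the question asks for

# Equivalent of MOD(SUM(value) OVER (...), 10): wrap the running total at 10.
mod_result = [total % 10 for total in accumulate(values_one)]

print(mod_result)
print(mod_result == expected_one)
```

The two lists agree until the first overflow and diverge afterwards, so MOD matches the expected output only while the running total stays below the threshold.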


0 votes

You may want to use a RECURSIVE CTE to accumulate the values conditionally.

Query
CREATE TEMP TABLE sample_data AS (
    WITH
    _sample_data AS (
        SELECT 'One' as Label, 1 as Value, 1 as expected_cumulative_sum,
        UNION ALL SELECT 'One', 2, 3,
        UNION ALL SELECT 'One', 4, 7,
        UNION ALL SELECT 'One', 6, 6,
        UNION ALL SELECT 'One', 3, 9,
        UNION ALL SELECT 'Two', 1, 1,
        UNION ALL SELECT 'Two', 2, 3,
        UNION ALL SELECT 'Two', 1, 4,
        UNION ALL SELECT 'Three', 5, 5,
        UNION ALL SELECT 'Three', 4, 9,
        UNION ALL SELECT 'Three', 8, 8,
        UNION ALL SELECT 'Three', 7, 7,
        UNION ALL SELECT 'Three', 5, 5,
        UNION ALL SELECT 'Three', 4, 9,
    )
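    -- NOTE: the window has no ORDER BY, so row_num is not guaranteed to follow
    -- the listing order above; real data should order by a sequence column.
    SELECT *, ROW_NUMBER() OVER (PARTITION BY Label) as row_num,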
    FROM _sample_data
);

WITH
RECURSIVE calculate_cumulative_sum AS (
    SELECT label, value, row_num, value AS cumulative_sum
    FROM sample_data
    WHERE row_num = 1

    UNION ALL

    SELECT
        s.label, s.value, s.row_num,
        IF(
            -- may want to decide between '>' and '>='
            s.value + c.cumulative_sum >= 10,
            s.value,
            s.value + c.cumulative_sum
        ) AS cumulative_sum,
    FROM sample_data AS s
    INNER JOIN calculate_cumulative_sum AS c
        ON s.label = c.label AND s.row_num = c.row_num + 1
)
SELECT label, row_num, value, cumulative_sum
FROM calculate_cumulative_sum
ORDER BY label, row_num
;
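The reset rule in the recursive step can be checked against the expected_cumulative_sum column of the sample data. The following is a minimal Python sketch of that rule, not the BigQuery query itself; `resetting_sum` and its `threshold` parameter are hypothetical names, and the `>=` comparison mirrors the choice made in the query.

```python
from itertools import groupby

# (label, value, expected_cumulative_sum), copied from the sample data above.
sample = [
    ("One", 1, 1), ("One", 2, 3), ("One", 4, 7), ("One", 6, 6), ("One", 3, 9),
    ("Two", 1, 1), ("Two", 2, 3), ("Two", 1, 4),
    ("Three", 5, 5), ("Three", 4, 9), ("Three", 8, 8),
    ("Three", 7, 7), ("Three", 5, 5), ("Three", 4, 9),
]

def resetting_sum(values, threshold=10):
    """Running sum that restarts at the current value once the total
    would reach the threshold (same '>=' test as the recursive query)."""
    out, total = [], 0
    for v in values:
        total = v if total + v >= threshold else total + v
        out.append(total)
    return out

# Apply the rule per label, assuming rows of a label are contiguous and ordered.
computed = []
for label, group in groupby(sample, key=lambda r: r[0]):
    computed.extend(resetting_sum([value for _, value, _ in group]))

expected = [e for _, _, e in sample]
print(computed == expected)
```

Each row depends on the previous row's result, which is why the query needs recursion rather than a single window function.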
