Aggregating a VARIANT column


In the source we have a VARIANT column (events), and we need to sum the metrics stored in that column (calls, impressions, leads, visits, and so on).

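The tricky part is that new metrics can be added to the VARIANT column at any time, and we do not want to update our code (a stored procedure) every time that happens.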

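Is there a way to dynamically carry the metrics in the source VARIANT column (events) over to the target VARIANT column (total_events) without changing any code?

Source:

Desired target: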

SQL to set up the source data:

CREATE OR REPLACE TABLE user_activity (
    ACTIVITY DATE,
    ID       NUMBER  , 
    NAME     VARCHAR , 
    EVENTS   VARIANT
);

INSERT INTO user_activity (activity, id, name, events) 
SELECT '2023-09-05'::date, 101, 'Mickey Mouse', PARSE_JSON('{   "imps": 1,   "visits": 1,   "calls": 0,   "leads": 0 }');

INSERT INTO user_activity (activity, id, name, events) 
SELECT '2023-09-05'::date, 102, 'Mickey Mouse', PARSE_JSON('{   "imps": 1,   "visits": 1,   "calls": 0,   "leads": 0 }');

INSERT INTO user_activity (activity, id, name, events)
SELECT '2023-09-05'::date, 103, 'Mickey Mouse', PARSE_JSON('{   "imps": 1,   "visits": 0,   "calls": 1,   "leads": 0 }');

INSERT INTO user_activity (activity, id, name, events)
SELECT '2023-09-05'::date, 104, 'Mickey Mouse', PARSE_JSON('{   "imps": 1,   "visits": 0,   "calls": 0,   "leads": 1 }');

INSERT INTO user_activity (activity, id, name, events)
SELECT '2023-09-06'::date, 105, 'Mickey Mouse', PARSE_JSON('{   "imps": 1,   "visits": 0,   "calls": 0,   "leads": 1,   "service": 1 }');

INSERT INTO user_activity (activity, id, name, events)
SELECT '2023-09-06'::date, 106, 'Mickey Mouse', PARSE_JSON('{   "imps": 1,   "visits": 1,   "calls": 0 }');

SELECT * FROM user_activity ORDER BY 1,2;

This is the work in progress (from another post; apologies for the duplicate entry):

select 
    activity     as activity
  , name         as name
  , OBJECT_CONSTRUCT( 
                  'calls',   sum(a.calls::NUMBER(2,1)),
                  'imps',    sum(a.imps::NUMBER(2,1)),
                  'leads',   sum(a.leads::NUMBER(2,1)),
                  'visits',  sum(a.visits::NUMBER(2,1)),
                  'service', sum(a.service::NUMBER(2,1))
                  ) AS total_events
from (
    SELECT
        activity       AS activity
      , name           AS name
      , events         AS events 
      , events:calls   AS calls
      , events:imps    AS imps
      , events:leads   AS leads
      , events:visits  AS visits
      , events:service AS service
    FROM user_activity,
    LATERAL FLATTEN(events, OUTER=> TRUE) f
    GROUP BY 1,2,3
) a
group by 1,2
order by 1,2,3;

I would like SQL along these lines (dummy SQL), where I do not have to list the metrics embedded in the VARIANT column:

SELECT activity, SUM(events) AS total_events
FROM user_activity
GROUP BY activity;

Tags: sql snowflake-cloud-data-platform aggregate variant

1 answer

This may give you what you want:

WITH flat_data AS (
    -- One row per key/value pair found in EVENTS
    SELECT
        activity
      , name
      , f.key
      , f.value
    FROM user_activity,
    LATERAL FLATTEN(INPUT => events, OUTER => TRUE) f
)
, sum_data AS (
    -- Total each metric per activity date and name
    SELECT
        activity
      , name
      , key
      , SUM(value::INTEGER) AS value
    FROM flat_data
    GROUP BY activity, name, key
)
-- Rebuild one object per group from the summed key/value pairs
SELECT
    activity
  , name
  , OBJECT_AGG(key, value::VARIANT) AS total_events
FROM sum_data
GROUP BY activity, name
;
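
As a rough check of the idea, the sketch below wraps the same flatten, sum, and re-aggregate steps in a view and then adds a row carrying a metric that did not exist before. The view name user_activity_totals, the id 107, and the "chats" metric are made up for illustration, and the explicit ::NUMBER and ::VARIANT casts are a defensive assumption rather than something the question requires.

```sql
-- Hypothetical view name; the query inside never names a metric.
CREATE OR REPLACE VIEW user_activity_totals AS
SELECT
    t.activity,
    t.name,
    OBJECT_AGG(t.key, t.total::VARIANT) AS total_events
FROM (
    -- One row per (activity, name, metric key) with the summed value.
    SELECT
        ua.activity,
        ua.name,
        f.key,
        SUM(f.value::NUMBER) AS total
    FROM user_activity ua,
         LATERAL FLATTEN(INPUT => ua.events) f
    GROUP BY ua.activity, ua.name, f.key
) t
GROUP BY t.activity, t.name;

-- Illustrative row with a brand-new metric ("chats" is an invented key).
INSERT INTO user_activity (activity, id, name, events)
SELECT '2023-09-06'::date, 107, 'Mickey Mouse', PARSE_JSON('{ "imps": 1, "chats": 2 }');

-- The new key should appear in total_events without touching the view.
SELECT * FROM user_activity_totals ORDER BY activity, name;
```

With only the setup rows from the question, the totals for 2023-09-05 should come to imps 4, visits 2, calls 1, leads 1, and for 2023-09-06 to imps 2, visits 1, calls 0, leads 1, service 1. After the extra row, 2023-09-06 should show imps 3 plus a new chats total of 2. Because the view reads whatever keys FLATTEN returns, a stored procedure could select from it and stay unchanged as metrics are added.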