Using window functions to get the first value along with a grouped sum

Question (0 votes, 1 answer)

I want to get the "top row" of data for each group, along with an aggregate metric computed across the entire group.

Below is a concrete example, in which I solve the problem with a join.

Sample data:

create or replace table TABLE_ID
(
fruit string,
store string,
state string,
cost numeric
);

insert into TABLE_ID
values
('apple', 'Whole Foods', 'CA', 28.3),
('apple', 'Walmart', 'UT', 3.2),
('apple', 'Whole Foods', 'AZ', 4.4),
('apple', 'Walmart', 'NY', 5.1),
('banana', 'Whole Foods', 'CO', 2.3),
('banana', 'Whole Foods', 'AZ', 28.8),
('banana', 'Walmart', 'NY', 93.3),
('banana', 'Whole Foods', 'NY', 20.1);

My current solution:

select b.*, a.total_cost
from (
  select
  fruit, sum(cost) as total_cost
  from TABLE_ID
  group by fruit
) a
left join
(
  select fruit, store as top_purchase_store, state as top_purchase_state
  from TABLE_ID
  qualify row_number() over (partition by fruit order by cost desc) = 1
) b
on a.fruit = b.fruit
;

Output:

total_cost  fruit   top_purchase_store  top_purchase_state
41          apple   Whole Foods         CA
144.5       banana  Walmart             NY

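I feel it should be possible to do this without a join. However, I have not been able to combine first_value with the sum aggregate the way I need.

Do you have any other suggestions?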

sql google-bigquery window-functions
1 Answer

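You can also compute sum() as a window function over the same partition, alongside row_number(), and then keep only the first row of each partition. I confirmed that this works in BigQuery.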

select total_cost, fruit, top_purchase_store, top_purchase_state
from (
  select
    fruit,
    store as top_purchase_store,
    state as top_purchase_state,
    row_number() over (partition by fruit order by cost desc) as rn,
    sum(cost) over (partition by fruit) as total_cost
  from TABLE_ID
) z
where rn = 1;

Output:

total_cost  fruit   top_purchase_store  top_purchase_state
41.0        apple   Whole Foods         CA
144.5       banana  Walmart             NY
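
As a follow-up sketch (I have not run these against the asker's table): the question's own query already uses qualify, and it can filter on row_number() directly here too, which removes the subquery. Separately, first_value can be combined with a windowed sum as long as the sum's window has no order by. The window name w is just an illustrative choice.

```sql
-- Variant 1: filter with qualify instead of a subquery.
-- Window functions are evaluated before qualify, so sum() still sees
-- every row of the fruit partition.
select
  sum(cost) over (partition by fruit) as total_cost,
  fruit,
  store as top_purchase_store,
  state as top_purchase_state
from TABLE_ID
qualify row_number() over (partition by fruit order by cost desc) = 1;

-- Variant 2: first_value plus a windowed sum, collapsed with distinct.
-- The sum's window deliberately has no order by; with one, the default
-- frame would turn it into a running total.
-- "w" is an illustrative window name.
select distinct
  sum(cost) over (partition by fruit) as total_cost,
  fruit,
  first_value(store) over w as top_purchase_store,
  first_value(state) over w as top_purchase_state
from TABLE_ID
window w as (partition by fruit order by cost desc);
```

On the sample data both variants should give the same two rows as above. If two rows in a group tie on cost, which one counts as the "top" is not deterministic unless you add a tiebreaker to the order by.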
