Summarizing array values from a column

Question  Votes: 0  Answers: 2

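My task at work is to summarize the values of multiple arrays, and I have hit a gap in my knowledge. I would greatly appreciate this community's insight and help.

The challenge:

Each row of a single-column BigQuery table holds an array of domain TLDs. I want to group by TLD and return the total count for each TLD as a new table.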

["biz","us","international","eu","com","co","world","us","international","eu","co","biz"]
["com","co","world"]        

Expected result

**TLD_Name**
biz 2
us 2
international 2
eu 2
com 2
co 3
world 2

Thanks in advance for your help.

sql google-bigquery
2 Answers
2
votes

Assuming the array column is named tlds, you can run the following Standard SQL query:

SELECT
  tld AS TLD_Name,
  COUNT(*) AS count
FROM YourTable
CROSS JOIN UNNEST(tlds) AS tld
GROUP BY tld;

This "flattens" the arrays into one row per element and then counts the rows for each TLD.
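
To make the flatten-then-count behaviour concrete, here is a minimal Python sketch of what the query computes. It is only an illustration, not BigQuery code: the `rows` list stands in for the table using the two sample arrays from the question, and `count_tlds` is a hypothetical helper name.

```python
from collections import Counter
from itertools import chain

# Stand-in for the table: each inner list is the tlds array of one row
# (sample data copied from the question).
rows = [
    ["biz", "us", "international", "eu", "com", "co", "world",
     "us", "international", "eu", "co", "biz"],
    ["com", "co", "world"],
]


def count_tlds(rows):
    """Count TLD occurrences across all rows.

    chain.from_iterable plays the role of CROSS JOIN UNNEST(tlds): it turns
    the per-row arrays into one flat stream of values. Counter then plays
    the role of GROUP BY tld with COUNT(*).
    """
    return Counter(chain.from_iterable(rows))


counts = count_tlds(rows)
for tld, n in counts.items():
    print(tld, n)
```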


1
vote

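If the tld values within each row are highly repetitive and you have a very large number of rows, the query below may offer a small optimization: it first aggregates the tld counts within each row and then sums them at the table level (for BigQuery Standard SQL).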

#standardSQL
WITH `yourproject.yourdataset.yourtable` AS (
  SELECT ["biz","us","international","eu","com","co","world","us","international","eu","co","biz"] tlds UNION ALL
  SELECT ["com","co","world","biz"]   
)
SELECT
  tld_count.tld AS tld,
  SUM(tld_count.cnt) AS cnt
FROM `yourproject.yourdataset.yourtable`,
UNNEST(ARRAY(SELECT AS STRUCT tld, COUNT(*) AS cnt FROM UNNEST(tlds) AS tld GROUP BY tld)) AS tld_count
GROUP BY tld   
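
The two-stage idea in this query can be sketched in Python as follows. This is an illustration of the logic only, not BigQuery code and not a claim about how BigQuery executes it: `rows` mirrors the sample data in the WITH clause above, and `count_tlds_two_stage` is a hypothetical helper name.

```python
from collections import Counter

# Stand-in for the table, mirroring the sample rows in the WITH clause.
rows = [
    ["biz", "us", "international", "eu", "com", "co", "world",
     "us", "international", "eu", "co", "biz"],
    ["com", "co", "world", "biz"],
]


def count_tlds_two_stage(rows):
    """Pre-aggregate inside each row, then sum the partial counts."""
    total = Counter()
    for tlds in rows:
        # Stage 1: per-row counts, like the inner
        # SELECT AS STRUCT tld, COUNT(*) AS cnt ... GROUP BY tld.
        per_row = Counter(tlds)
        # Stage 2: add the partial counts to the running totals,
        # like the outer SUM(tld_count.cnt) ... GROUP BY tld.
        total.update(per_row)
    return total


totals = count_tlds_two_stage(rows)
for tld, cnt in totals.items():
    print(tld, cnt)
```

When a row repeats the same value many times, stage 1 collapses those repeats into a single (tld, cnt) pair, so stage 2 has fewer items to combine. Whether that helps in practice depends on the data.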