How to display bucket names when using RANGE_BUCKET in BigQuery

Question  Votes: 3  Answers: 2

Here is my query against a public dataset in BigQuery:

SELECT
  RANGE_BUCKET(reputation, [400000, 500000, 600000, 700000, 800000, 900000, 1000000, 1100000, 1200000]) AS reputation_group,
  COUNT(*) AS count
FROM `bigquery-public-data.stackoverflow.users`
WHERE reputation > 200000
GROUP BY 1
ORDER BY 1

The result looks like this:

[Image: query result]

Instead of showing reputation_group as an integer, how can I display the range of each bucket:

0: [0-400000]
1: [400001-500000]
2: [500001-600000]
....

Many thanks.

UPDATE: Many thanks to Mikhail for his answer. Here it is with a small change:

SELECT bucket, 
  FORMAT('%i - %i', IFNULL(ranges[SAFE_OFFSET(bucket - 1)] + 1, 0), ranges[SAFE_OFFSET(bucket)]) AS reputation_group, 
  COUNT(*) AS COUNT
FROM `bigquery-public-data.stackoverflow.users`,
UNNEST([STRUCT([200000, 400000, 500000, 600000, 700000, 800000, 900000, 1000000, 1100000, 1200000] AS ranges)]),
UNNEST([RANGE_BUCKET(reputation, ranges)]) bucket 
WHERE reputation > 200000
GROUP BY 1, 2
ORDER BY bucket 

Note that an extra 200000 entry was added to the array in the STRUCT, so the first range is displayed as 200001 - 400000 instead of 0 - 400000.
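
One detail worth checking when reading these labels is where a value that sits exactly on a boundary ends up. RANGE_BUCKET returns the number of boundaries that are less than or equal to the value, so a reputation of exactly 400000 is counted in the bucket that starts at 400000, not in the one labelled 200001 - 400000. The query below is a minimal sketch for verifying this; the sample values and the shortened boundary array are hypothetical and only chosen to sit on and around the edges.

```sql
#standardSQL
-- Hypothetical sample values, picked to sit on and around the boundaries.
-- A shortened boundary array is used to keep the check small.
SELECT
  reputation,
  RANGE_BUCKET(reputation, [200000, 400000, 500000]) AS bucket
FROM UNNEST([200000, 200001, 399999, 400000, 400001, 500000]) AS reputation
ORDER BY reputation
```

With these inputs, 200000, 200001 and 399999 fall in bucket 1, 400000 and 400001 in bucket 2, and 500000 in bucket 3. In other words, each bucket covers its lower boundary and stops just short of its upper boundary.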

google-bigquery bucket
2 Answers
1
vote

The following is for BigQuery Standard SQL:

#standardSQL
SELECT bucket, 
  FORMAT('%i - %i', IFNULL(ranges[SAFE_OFFSET(bucket - 1)] + 1, 0), ranges[SAFE_OFFSET(bucket)]) AS reputation_group, 
  COUNT(*) AS COUNT
FROM `bigquery-public-data.stackoverflow.users`,
UNNEST([STRUCT([400000, 500000, 600000, 700000, 800000, 900000, 1000000, 1100000, 1200000] AS ranges)]),
UNNEST([RANGE_BUCKET(reputation, ranges)]) bucket 
WHERE reputation > 200000
GROUP BY 1, 2
ORDER BY bucket  
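
Two edge cases in the query above may be worth handling, depending on how exact the labels need to be. First, RANGE_BUCKET places a value equal to a boundary in the higher bucket, so bucket 1 actually covers 400000 to 499999 rather than 400001 to 500000. Second, for values at or above the last boundary, `ranges[SAFE_OFFSET(bucket)]` is NULL, and FORMAT generally returns NULL when given a NULL argument, so that bucket gets no label. The following is a sketch of one possible variant, not part of the original answer; the `sample` CTE with literal values is a hypothetical stand-in for the users table so the query can be run on its own, and `user_count` is an illustrative column name.

```sql
#standardSQL
-- Sketch only: literal sample values stand in for the users table.
WITH sample AS (
  SELECT reputation
  FROM UNNEST([250000, 400000, 499999, 500000, 1300000]) AS reputation
)
SELECT
  bucket,
  CASE
    -- Below the first boundary: open-ended lower range.
    WHEN bucket = 0
      THEN FORMAT('< %d', ranges[SAFE_OFFSET(0)])
    -- At or above the last boundary: open-ended upper range.
    WHEN bucket = ARRAY_LENGTH(ranges)
      THEN FORMAT('>= %d', ranges[SAFE_OFFSET(bucket - 1)])
    -- Inner buckets: lower boundary inclusive, upper boundary exclusive.
    ELSE FORMAT('%d - %d', ranges[SAFE_OFFSET(bucket - 1)], ranges[SAFE_OFFSET(bucket)] - 1)
  END AS reputation_group,
  COUNT(*) AS user_count
FROM sample,
UNNEST([STRUCT([400000, 500000, 600000, 700000, 800000, 900000, 1000000, 1100000, 1200000] AS ranges)]),
UNNEST([RANGE_BUCKET(reputation, ranges)]) AS bucket
GROUP BY 1, 2
ORDER BY bucket
```

With the sample values, 250000 is labelled `< 400000`, both 400000 and 499999 share `400000 - 499999`, 500000 is labelled `500000 - 599999`, and 1300000 is labelled `>= 1200000`. To run it against the real data, replace `sample` with `bigquery-public-data.stackoverflow.users` and restore the WHERE filter.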


0
votes

With a JOIN and some refactoring:


0
votes

Many thanks to @Mikhail Berlyant and @Felipe Hoffa. Here is a small addition to Mikhail's script:
