Problem querying tables from multiple datasets in BigQuery

Question (votes: 0, answers: 1)

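I am trying to query two BigQuery tables from two different datasets to get two separate columns. I have tried both UNION and JOIN, but neither gives me what I want. Below is the query I tried: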

with abagrowth as (
SELECT
  session abas,
  term abat,
  COUNT(distinct studentid) AS acount,
  ROUND(100 * (COUNT(distinct studentid) - LAG(COUNT(distinct studentid), 1) OVER (ORDER BY session)) / LAG(COUNT(distinct studentid), 1) OVER (ORDER BY session),0) || '%' AS agrowth
FROM
  aba.abaresult
GROUP BY
  1,
  2
ORDER BY
  1,
  2),

bidagrowth as (
SELECT
  session bidas,
  term bidat,
  COUNT(distinct studentid) AS bcount,
  ROUND(100 * (COUNT(distinct studentid) - LAG(COUNT(distinct studentid), 1) OVER (ORDER BY session)) / LAG(COUNT(distinct studentid), 1) OVER (ORDER BY session),0) || '%' AS bgrowth
FROM
  bida.bidaresult
GROUP BY
  1,
  2
ORDER BY
  1,
  2)

select abas, agrowth from abagrowth
union all
select bidas, bgrowth from bidagrowth

The dataset looks similar to this:

name  subject  session      totalscore
-------------------------------------------
jack  maths    2013/2014         70
jane  maths    2013/2014         65
jill  maths    2013/2014         80
jack  maths    2014/2015         72
jack  eng      2014/2015         87
jane  science  2014/2015         67
jill  maths    2014/2015         70
jerry eng      2014/2015         70
jaasp science  2014/2015         85

The table I want to get is in this format, or something similar:

session    agrowth  bgrowth
2013/2014   null     null
2014/2015   10%       11%
2015/2016   5%        2%

The data above is made up for the sake of the example.

Questions

  1. Is this achievable in BigQuery?

  2. If so, how?

Thanks.

google-bigquery analytics
1 Answer
1 vote

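About the datasets: yes, you can query two datasets in a single query. See this answer. Basically, you only need to specify the project (optional), the dataset, and the table you are using.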
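As a minimal sketch of that naming scheme, the query below reads from both datasets in one statement using fully qualified `project.dataset.table` names. The project ID `my-project` is a placeholder and not from the question; `aba.abaresult`, `bida.bidaresult`, and `studentid` are taken from the question's query. As far as I know, both datasets also need to be in the same location for this to work.

```sql
-- `my-project` is a hypothetical project ID; replace it with your own,
-- or omit it to use the project the query runs in.
SELECT 'aba' AS source, COUNT(DISTINCT studentid) AS students
FROM `my-project.aba.abaresult`
UNION ALL
SELECT 'bida' AS source, COUNT(DISTINCT studentid) AS students
FROM `my-project.bida.bidaresult`;
```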

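As for the data you want: you can get it with a JOIN instead of a UNION. Joining the tables on session gives you one record per session, and you can then choose which columns to include in your SELECT.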

WITH abagrowth AS (
SELECT
  session,
  term abat,
  COUNT(distinct studentid) AS acount,
  ROUND(100 * (COUNT(distinct studentid) - LAG(COUNT(distinct studentid), 1) OVER (ORDER BY session)) / LAG(COUNT(distinct studentid), 1) OVER (ORDER BY session),0) || '%' AS agrowth
FROM
  aba.abaresult
GROUP BY
  1,
  2
ORDER BY
  1,
  2),

bidagrowth AS (
SELECT
  session,
  term bidat,
  COUNT(distinct studentid) AS bcount,
  ROUND(100 * (COUNT(distinct studentid) - LAG(COUNT(distinct studentid), 1) OVER (ORDER BY session)) / LAG(COUNT(distinct studentid), 1) OVER (ORDER BY session),0) || '%' AS bgrowth
FROM
  bida.bidaresult
GROUP BY
  1,
  2
ORDER BY
  1,
  2)

SELECT aba.session, aba.agrowth, bida.bgrowth
   FROM abagrowth aba
   JOIN bidagrowth bida
        ON aba.session = bida.session
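One thing to watch: the CTEs above still group by both session and term, so a session with several terms yields several rows on each side of the join. If the goal is strictly one row per session, as in the desired output, a variant along these lines might be closer. This is only a sketch under assumptions that are not part of the original query: `term` is dropped from the grouping, the counts are computed in separate CTEs (`aba_counts` and `bida_counts` are names introduced here), and a FULL OUTER JOIN is used so that a session present in only one dataset is not lost.

```sql
-- Sketch only: assumes one output row per session is wanted, so `term`
-- is left out of the grouping (an assumption, not the original answer).
WITH aba_counts AS (
  SELECT session, COUNT(DISTINCT studentid) AS students
  FROM aba.abaresult
  GROUP BY session
),
bida_counts AS (
  SELECT session, COUNT(DISTINCT studentid) AS students
  FROM bida.bidaresult
  GROUP BY session
),
abagrowth AS (
  SELECT
    session,
    -- Percentage change versus the previous session; NULL for the first one.
    CONCAT(CAST(CAST(ROUND(100 * SAFE_DIVIDE(
      students - LAG(students) OVER (ORDER BY session),
      LAG(students) OVER (ORDER BY session))) AS INT64) AS STRING), '%') AS agrowth
  FROM aba_counts
),
bidagrowth AS (
  SELECT
    session,
    CONCAT(CAST(CAST(ROUND(100 * SAFE_DIVIDE(
      students - LAG(students) OVER (ORDER BY session),
      LAG(students) OVER (ORDER BY session))) AS INT64) AS STRING), '%') AS bgrowth
  FROM bida_counts
)
SELECT
  COALESCE(a.session, b.session) AS session,
  a.agrowth,
  b.bgrowth
FROM abagrowth AS a
FULL OUTER JOIN bidagrowth AS b
  ON a.session = b.session
ORDER BY 1;
```

`SAFE_DIVIDE` returns NULL instead of raising an error when the previous count is zero, and the explicit casts keep the result a string such as `10%` rather than relying on implicit conversion in `||`.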

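UNION, by contrast, stacks the results of the two queries on top of each other, so both growth values end up in the same column instead of side by side.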
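To make the stacking concrete, here is a small self-contained illustration that uses inline rows instead of the real tables. The values are the made-up ones from the question's example output, and the CTE names `a` and `b` are introduced only for this sketch.

```sql
WITH a AS (
  SELECT '2014/2015' AS session, '10%' AS agrowth
),
b AS (
  SELECT '2014/2015' AS session, '11%' AS bgrowth
)
-- UNION ALL appends rows: the result has two rows for 2014/2015 and a
-- single growth column, named after the first SELECT's alias.
SELECT session, agrowth AS growth FROM a
UNION ALL
SELECT session, bgrowth FROM b;
```

Replacing the final statement with `SELECT a.session, a.agrowth, b.bgrowth FROM a JOIN b ON a.session = b.session` would instead give a single row with the two values in separate columns.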
