Records are missing after a join in Google BigQuery, but they appear after changing the join

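I have a transaction status table that behaves like a time series: each transaction gets a new row whenever its status changes. To fetch the most recent status row per transaction, I use the query below.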

WITH transactions AS (
  SELECT DISTINCT
    statuses.id,
    statuses.transaction_id,
    statuses.created_on,
    statuses.status
  FROM
    `gringotts_dwh.transaction_status` statuses
  INNER JOIN (
    SELECT DISTINCT
      transaction_id,
      MAX(created_on) AS created_on,
    FROM
      `gringotts_dwh.transaction_status`
    GROUP BY
      transaction_id) latest_update
  ON
    statuses.transaction_id = latest_update.transaction_id
    AND statuses.created_on = latest_update.created_on
)

SELECT * FROM transactions

However, some records are still missing from the result of this query. They are listed below.

id        transaction_id  created_on               transaction_status
11488196  6232804         2023-04-08 11:57:28 UTC  53
11480223  6232245         2023-04-05 01:33:39 UTC  43
11487410  6226866         2023-04-07 09:41:41 UTC  32
11492618  6227333         2023-04-06 22:50:18 UTC  102
11479541  6235787         2023-04-05 11:09:47 UTC

These records exist in the base table itself, but they are absent after the join.

If I change the query from CTE + subquery to subqueries only, I do get these records in the result. The updated query is shown below.

SELECT statuses.transaction_status FROM
(SELECT DISTINCT
  statuses.id,
  statuses.transaction_id,
  statuses.created_on,
  statuses.status AS transaction_status
  FROM
  `gringotts_dwh.transaction_status` statuses
  WHERE
  id in (11492618,11488196,11487410,11480223,11479541)
  ORDER BY
    created_on DESC) statuses
INNER JOIN
  (SELECT DISTINCT
      transaction_id,
      MAX(created_on) AS created_on,
    FROM
      `gringotts_dwh.transaction_status`
    WHERE
      id in (11492618,11488196,11487410,11480223,11479541)
    GROUP BY
      transaction_id
    ORDER BY
    created_on DESC) latest_update
  ON
    statuses.transaction_id = latest_update.transaction_id 
    AND statuses.created_on = latest_update.created_on 
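
Note that the second query differs from the first in more than its shape: both subqueries filter on the five ids, so `MAX(created_on)` is computed only over those five rows rather than over the whole table. A possible check, as a sketch that assumes only the table and column names used above (the aliases `s`, `latest`, `max_created_on`, and `is_latest_in_full_table` are made up for illustration), is to compare each of the five rows against the table-wide maximum for its `transaction_id`:

```sql
-- Sketch: for the five ids in question, compare each row's created_on with the
-- newest created_on of its transaction_id across the WHOLE table.
SELECT
  s.id,
  s.transaction_id,
  s.created_on,
  latest.max_created_on,
  -- FALSE means a newer status row exists for this transaction_id.
  s.created_on = latest.max_created_on AS is_latest_in_full_table
FROM
  `gringotts_dwh.transaction_status` AS s
INNER JOIN (
  SELECT
    transaction_id,
    MAX(created_on) AS max_created_on
  FROM
    `gringotts_dwh.transaction_status`
  GROUP BY
    transaction_id) AS latest
ON
  s.transaction_id = latest.transaction_id
WHERE
  s.id IN (11492618, 11488196, 11487410, 11480223, 11479541)
ORDER BY
  s.transaction_id
```

If `is_latest_in_full_table` comes back `FALSE` for these rows, the first query would be expected to drop them, since it keeps only the newest row per `transaction_id`. If it comes back `TRUE`, the cause lies elsewhere.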

Can someone explain this behavior to me?

Thanks in advance.

sql google-bigquery subquery common-table-expression