I want to vertically concatenate two tables in BigQuery, which is easy to do with UNION ALL, for example:
SELECT
t1.column1,
t1.column2,
t1.column3
FROM
`project.database.table1` as t1
UNION ALL
SELECT
t2.column1,
t2.column2,
t2.column3
FROM
`p21-project.database.table2` as t2
But I want to exclude the t2 rows for which:
t2.column2 = t1.column2 and t1.column1 = X
Otherwise, always append.
How can I implement this?
You can join table A with table B to identify the duplicated rows.
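WITH
-- Sample data: every testA row carries the flag colX = "X"
testA AS (SELECT x AS col1, x + 1 AS col2, "X" AS colX, "tableA" AS col3 FROM UNNEST([1, 2, 3, 4, 5]) AS x),
testB AS (SELECT x AS col1, x + 20 AS col2, "Y" AS colX, "tableB" AS col3 FROM UNNEST([3, 4, 8, 9]) AS x),
-- Distinct keys of the testA rows that should block a testB row
tableA_filted AS (SELECT col1 FROM testA WHERE colX = "X" GROUP BY 1),
-- Anti-join: keep only the testB rows that have no match in tableA_filted
tableB AS (
  SELECT B.* FROM testB AS B
  LEFT JOIN tableA_filted AS A
  ON A.col1 = B.col1
  WHERE A.col1 IS NULL
)
SELECT * FROM testA
UNION ALL
SELECT * FROM tableB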
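Since you know that not all of table A is relevant, filter that table first and keep only the relevant column in tableA_filted. The group by 1 prevents duplicate entries, so the join cannot multiply rows from table B. Then left join it to table B and keep only the rows where this join fails, that is, where A.col1 is null.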
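Applied to the tables from the question, the same idea could be sketched roughly as follows. This is only a sketch under assumptions: X is taken to be the string literal 'X' (replace it with your actual value or a query parameter), and NOT EXISTS is used in place of the left join plus null check, which should give the same rows here.

```sql
-- Sketch: append every table2 row unless some table1 row
-- with column1 = 'X' has the same column2.
-- Assumption: 'X' stands in for the real value of X.
SELECT
  t1.column1,
  t1.column2,
  t1.column3
FROM
  `project.database.table1` AS t1
UNION ALL
SELECT
  t2.column1,
  t2.column2,
  t2.column3
FROM
  `p21-project.database.table2` AS t2
WHERE NOT EXISTS (
  -- Correlated anti-join against the filtered part of table1
  SELECT 1
  FROM `project.database.table1` AS f
  WHERE f.column1 = 'X'
    AND f.column2 = t2.column2
)
```

Rows of table2 whose column2 is NULL never match the condition, so they are always appended, just as with the left join version.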
For many scenarios, a full join is a better solution than union all.
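For comparison, here is a minimal sketch of what the full join variant might look like on simplified sample data. The output column names col2_a and col2_b are made up for illustration. Note that it produces a different shape than union all: keys present in both tables end up on one row with the values side by side instead of on two stacked rows, so it only fits if that is what you need.

```sql
WITH
testA AS (SELECT x AS col1, x + 1 AS col2 FROM UNNEST([1, 2, 3, 4, 5]) AS x),
testB AS (SELECT x AS col1, x + 20 AS col2 FROM UNNEST([3, 4, 8, 9]) AS x)
SELECT
  -- Take the key from whichever side is present
  COALESCE(A.col1, B.col1) AS col1,
  A.col2 AS col2_a,  -- NULL when the key exists only in testB
  B.col2 AS col2_b   -- NULL when the key exists only in testA
FROM testA AS A
FULL JOIN testB AS B
  ON A.col1 = B.col1
ORDER BY col1
```

Here the keys 3 and 4 exist in both tables, so each appears once with both col2_a and col2_b filled in.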