In Snowflake I have the following query:
With CTE as(
select account_id -- sloppy unvalidated field with text
from mytable
where try_to_numeric(account_id) is not null
)
select *
from
accounts
inner join CTE on accounts.account_id = CTE.account_id
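-- Result: ERROR: "my dog is cool" in the CTE is not numeric!
Does the join happen before the try_to_numeric filter? It certainly looks that way! Is there a query plan that would show this for certain?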
This is a "known" optimization behavior with a long history. Now you have run into it too.
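On the query-plan part of the question, a minimal sketch, assuming the table names from the question: prefixing the statement with EXPLAIN compiles it without running it, so you can look at where the filter and join operators end up relative to each other. The exact plan will depend on your data and Snowflake version, so no output is shown here. The Query Profile in the web UI gives a similar view for a query that has actually run.

```sql
-- Sketch: compile the original query without executing it and print the plan as text.
explain using text
with cte as (
    select account_id
    from mytable
    where try_to_numeric(account_id) is not null
)
select *
from accounts
inner join cte on accounts.account_id = cte.account_id;
```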
The fix is to convert the value in the CTE's select list:
with cte as (
select
/* sloppy unvalidated field with text */
try_to_numeric(account_id) as account_id_safe
from mytable
where account_id_safe is not null
)
select *
from accounts as a
join CTE
on a.account_id = cte.account_id_safe
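This tells the compiler that you only want the converted value. Otherwise it sees that the original value is being used and feels free to use it before the filter is applied, because the result "should" be the same... even though that is not safe.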
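To see what the converted column looks like, here is a small sketch with made-up values (they are illustrative only, not from the real table). try_to_numeric returns NULL instead of raising an error when the text is not a number, which is what makes the "is not null" filter meaningful.

```sql
-- Illustrative values only; column1 is the default column name for a VALUES clause.
select
    column1                 as raw_value,
    try_to_numeric(column1) as account_id_safe  -- NULL when the text is not a number
from values ('123'), ('my dog is cool');
```

The first row converts to 123 and the second yields NULL, so only the first survives the filter.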
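If that still blows up, you could change the WHERE to a QUALIFY, though I suspect the compiler will treat it the same way at this point. For this toy SQL you could also write: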
select
*
from accounts as a
where a.account_id in (select try_to_numeric(account_id) from mytable)
Or, if you can assume there are no duplicates in mytable:
select
a.*
from accounts as a
join mytable as mt
on a.account_id = try_to_numeric(mt.account_id)
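This works because NULL = NULL is not true (it evaluates to NULL), so rows whose account_id fails to convert never match anything in the join.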
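A quick way to convince yourself of the NULL comparison behavior is the sketch below. It compares the plain equality operator with the two null-safe comparisons Snowflake offers; the column aliases are just names chosen for this example.

```sql
-- Plain equality on NULLs yields NULL (never TRUE), so a join on it drops the row.
select
    null = null                    as equals_result,      -- NULL
    null is not distinct from null as null_safe_result,   -- TRUE
    equal_null(null, null)         as equal_null_result;  -- TRUE
```

The join above relies on the first behavior: a failed conversion produces NULL, and NULL never satisfies the equality in the ON clause.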