I wrote two T-SQL queries that I am running on Databricks using Databricks SQL.
The first query returns 144 rows:
-- Returns 144 rows
SELECT DISTINCT
    PSTT.TransactionTypeID
FROM basedrd.TransactionType PSTT
LEFT OUTER JOIN basedrd.PMSTransaction ST
    ON PSTT.TransactionType = ST.TransactionType
WHERE PSTT.TransactionSource = 'Pershing'
    OR (PSTT.TransactionSource = '')
    AND (ST.SEDOL = 'CASH'
        AND PSTT.IsCashTransaction = 1
        OR ST.SEDOL <> 'CASH'
        AND PSTT.IsCashTransaction = 0)
The second query returns no rows:
-- Returns no rows
SELECT DISTINCT
    PSTT.TransactionTypeID
FROM basedrd.PMSTransaction ST
LEFT OUTER JOIN basedrd.TransactionType PSTT
    ON PSTT.TransactionType = ST.TransactionType
    AND (PSTT.TransactionSource = 'Pershing'
        OR PSTT.TransactionSource = '')
    AND (ST.SEDOL = 'CASH'
        AND PSTT.IsCashTransaction = 1
        OR ST.SEDOL <> 'CASH'
        AND PSTT.IsCashTransaction = 0)
Strangely, if I run the same code on SQL Server, the first query returns 145 rows:
-- Returns 145 rows
SELECT DISTINCT
    PSTT.TransactionTypeID
FROM dbo.TransactionType PSTT
LEFT OUTER JOIN dbo.PMSTransaction ST
    ON PSTT.TransactionType = ST.TransactionType
WHERE PSTT.TransactionSource = 'Pershing'
    OR (PSTT.TransactionSource = '')
    AND (ST.SEDOL = 'CASH'
        AND PSTT.IsCashTransaction = 1
        OR ST.SEDOL <> 'CASH'
        AND PSTT.IsCashTransaction = 0)
And while the second query returns no rows on Databricks, on SQL Server it returns 46 rows:
SELECT DISTINCT
    PSTT.TransactionTypeID
FROM dbo.PMSTransaction ST
LEFT OUTER JOIN dbo.TransactionType PSTT
    ON PSTT.TransactionType = ST.TransactionType
    AND (PSTT.TransactionSource = 'Pershing'
        OR PSTT.TransactionSource = '')
    AND (ST.SEDOL = 'CASH'
        AND PSTT.IsCashTransaction = 1
        OR ST.SEDOL <> 'CASH'
        AND PSTT.IsCashTransaction = 0)
The only difference between the queries run on Databricks SQL and on SQL Server is the schema name: it is dbo on SQL Server and basedrd on Databricks.
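It looks like your data doesn't match between the two systems. A query like this, run on each side with <SRC> replaced by the schema name, should help confirm that the values aren't lining up as expected: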
with data as (
    select
        -- Bucket the source value; distinguish an unmatched join from a null key
        case when PSTT.TransactionSource = 'Pershing' then 'Pershing'
             when PSTT.TransactionSource = '' then 'Blank'
             when ST.TransactionType is not null then
                 case when PSTT.TransactionType is not null then 'Other' else 'Null' end
             else 'Unmatched Row' end as TransactionSource,
        case when ST.SEDOL = 'CASH' then 'Cash'
             when ST.SEDOL is null then 'Null'
             else 'Other' end as SEDOL,
        -- The alias is required: the outer query selects and groups by this column
        case when PSTT.IsCashTransaction = 0 then '0'
             when PSTT.IsCashTransaction = 1 then '1'
             when PSTT.IsCashTransaction is null then 'Null'
             else 'Other' end as IsCashTransaction
    from <SRC>.PMSTransaction ST left outer join <SRC>.TransactionType PSTT
        on PSTT.TransactionType = ST.TransactionType
)
select TransactionSource, SEDOL, IsCashTransaction, count(*)
from data
group by TransactionSource, SEDOL, IsCashTransaction
order by TransactionSource, SEDOL, IsCashTransaction;
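If the counts do differ, one possible cause worth ruling out (an assumption on my part, not something the question confirms) is string comparison semantics. SQL Server's default collations are typically case-insensitive and its `=` ignores trailing spaces, whereas Databricks compares strings exactly by default. The following is a minimal sketch of a follow-up check, meant to be run on the Databricks side, that lists join keys and SEDOL values that match only after normalizing case and surrounding whitespace. `<SRC>` is the same schema placeholder as above, and the aliases `st_type`, `pstt_type`, `sedol_length`, and `row_count` are purely illustrative.

```sql
-- Join keys that differ as stored but match once case and
-- surrounding whitespace are normalized.
select ST.TransactionType   as st_type,
       PSTT.TransactionType as pstt_type,
       count(*)             as row_count
from <SRC>.PMSTransaction ST
join <SRC>.TransactionType PSTT
    on upper(trim(PSTT.TransactionType)) = upper(trim(ST.TransactionType))
where PSTT.TransactionType <> ST.TransactionType
group by ST.TransactionType, PSTT.TransactionType
order by st_type, pstt_type;

-- SEDOL values that normalize to 'CASH' but are not exactly 'CASH'
-- (for example 'Cash' or 'CASH ' with a trailing space).
select ST.SEDOL,
       length(ST.SEDOL) as sedol_length,
       count(*)         as row_count
from <SRC>.PMSTransaction ST
where upper(trim(ST.SEDOL)) = 'CASH'
  and ST.SEDOL <> 'CASH'
group by ST.SEDOL;
```

If either query returns rows, the two platforms are probably evaluating the same predicates differently rather than holding different data. If both come back empty, the difference is more likely in the loaded data itself.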