SQL Server query in an SSIS transformation times out because of 174 UNION ALL statements


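I have a table in Hive and one in SQL Server, with the data stored as shown below. I am using SSIS to move this data into SQL Server, and the query takes far too long. The Description column contains about 175 distinct values, which leads to 174 UNION ALL statements, and because of that the query times out after about 2 hours.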

SQL Error [08S01]: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out

Is there a better way to write this query?

Thanks!

Hive:

ID  | Description
----+------------------------------
 1  | Desc1;Desc2;Desc3;Desc4
 2  | Desc1;Desc3;Desc4;Desc5;Desc6
 ...
230 | Desc8;Desc163;Desc9;Desc2;Desc172

SQL Server:

CaseID | GroupID | Description
-------+---------+--------------
   1   |    63   | Desc1
   1   |    44   | Desc2
   1   |    57   | Desc3
   1   |    78   | Desc4
   ...
   2   |    78   | Desc1
   2   |    57   | Desc3

Query:

select 
       case 
             when cas.description like '%Desc1%' then 63 
       end as groupid, -- maps to groupid
       cas.id as caseid, -- maps to caseid 
       current_timestamp as INSERT_DT
from 
       svc_case cas
inner join account acc on acc.id = cas.id
where cas.description <> 'NULL' and LENGTH(cas.description) > 0
and acc.recordid = '03443FGT'
union all 
select 
       case 
             when cas.description like '%Desc2%' then 44
       end as groupid, -- maps to groupid
       cas.id as caseid, -- maps to caseid 
       current_timestamp as INSERT_DT
from 
       svc_case cas
inner join account acc on acc.id = cas.id
where cas.description <> 'NULL' and LENGTH(cas.description) > 0
and acc.recordid = '03443FGT'
union all
select 
       case 
             when cas.description like '%Desc3%' then 57 
       end as groupid, -- maps to groupid
       cas.id as caseid, -- maps to caseid 
       current_timestamp as INSERT_DT
from 
       svc_case cas
inner join account acc on acc.id = cas.id
where cas.description <> 'NULL' and LENGTH(cas.description) > 0
and acc.recordid = '03443FGT'
union all
select 
       case 
             when cas.description like '%Desc4%' then 78 
       end as groupid, -- maps to groupid
       cas.id as caseid, -- maps to caseid 
       current_timestamp as INSERT_DT
from 
       svc_case cas
inner join account acc on acc.id = cas.id
where cas.description <> 'NULL' and LENGTH(cas.description) > 0
and acc.recordid = '03443FGT'
...
select 
       case 
             when cas.description like '%Desc175%' then 12 
       end as groupid, -- maps to groupid
       cas.id as caseid, -- maps to caseid 
       current_timestamp as INSERT_DT
from 
       svc_case cas
inner join account acc on acc.id = cas.id
where cas.description <> 'NULL' and LENGTH(cas.description) > 0
and acc.recordid = '03443FGT'

Tags: sql sql-server hive ssis union-all

2 Answers

0 votes

Run the query only once: drop all the UNION ALLs and leave out the CASE. Then use a Multicast and split the rows up in SSIS.
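
As a rough sketch of what this answer seems to suggest, the source query could be reduced to a single pass that returns the raw description, reusing the tables and filters from the question. The assumption here (not stated in the answer) is that the mapping from each description value to its group ID is then done inside the SSIS data flow rather than in the query.

```sql
-- Single pass over the source: no UNION ALL, no CASE.
-- Tables, columns and filters are copied from the question's query.
select
       cas.id          as caseid,       -- maps to caseid
       cas.description as description,  -- split and mapped to groupid downstream (assumption)
       current_timestamp as INSERT_DT
from
       svc_case cas
inner join account acc on acc.id = cas.id
where cas.description <> 'NULL' and LENGTH(cas.description) > 0
and acc.recordid = '03443FGT'
```

This scans svc_case once instead of once per description value, which is where most of the saving would be expected to come from.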


0 votes

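This is a bit of a shot in the dark, but there are 2 things you can do to improve this query. First, let's deal with all those UNION ALLs. If I have understood your query correctly, you can unpivot the data to achieve the same thing: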

SELECT V.groupid,
       cas.id AS caseid,
       current_timestamp as INSERT_DT
FROM dbo.svc_case cas
     JOIN dbo.account acc on acc.id = cas.id
     CROSS APPLY (VALUES(CASE WHEN cas.description LIKE '%Desc1%' THEN 63 END),
                        (CASE WHEN cas.description LIKE '%Desc2%' THEN 44 END),
                        (CASE WHEN cas.description LIKE '%Desc3%' THEN 57 END),
                        (CASE WHEN cas.description LIKE '%Desc4%' THEN 78 END),
                        --I assume there are 174 more of these
                        (CASE WHEN cas.description LIKE '%Desc178%' THEN 1 END))V(groupid) --The last one isn't correct, but to show how the `APPLY` ends
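
To see the unpivot in isolation, here is a small self-contained T-SQL sketch. The table variable @svc_case is hypothetical and holds only the two sample rows from the question, and the final IS NOT NULL filter is an assumption added for readability (the query above, like the original UNION ALL version, keeps the NULL rows).

```sql
-- Hypothetical stand-in for svc_case, filled with the question's sample rows.
DECLARE @svc_case TABLE (id int, description varchar(200));

INSERT INTO @svc_case (id, description)
VALUES (1, 'Desc1;Desc2;Desc3;Desc4'),
       (2, 'Desc1;Desc3;Desc4;Desc5;Desc6');

-- Each source row is expanded into one row per CASE expression in a single scan.
SELECT V.groupid,
       cas.id AS caseid
FROM @svc_case cas
     CROSS APPLY (VALUES (CASE WHEN cas.description LIKE '%Desc1%' THEN 63 END),
                         (CASE WHEN cas.description LIKE '%Desc2%' THEN 44 END),
                         (CASE WHEN cas.description LIKE '%Desc3%' THEN 57 END),
                         (CASE WHEN cas.description LIKE '%Desc4%' THEN 78 END)) V(groupid)
WHERE V.groupid IS NOT NULL  -- assumption: rows without a match are not wanted
ORDER BY cas.id, V.groupid;
```

With this sample data, case 1 yields four rows and case 2 yields three. One thing to watch for with either version of the query: a pattern such as '%Desc1%' also matches values like Desc10 or Desc163, so the patterns may need to include the ';' delimiter.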

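Then you have the WHERE, which is not SARGable because of LENGTH. LENGTH is not actually a T-SQL function (the T-SQL equivalent is LEN), so I hope you really are using SQL Server; if not, this answer is wasted, because the query above is T-SQL specific. Since LEN(NULL) returns NULL, use <> '' instead. And given that you already have <> 'NULL', you can use NOT IN: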

WHERE cas.description NOT IN('NULL','')
  AND acc.recordid = '03443FGT'

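That said, I would advise against storing the literal string value 'NULL' in your column. You should fix that and actually store NULL rather than 'NULL'; the two are different values and behave very differently.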
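
A minimal T-SQL sketch of that cleanup, assuming the table is dbo.svc_case as in the query above and that empty strings should also become NULL (the second point is an assumption, not something the question states):

```sql
-- One-off cleanup: replace the literal string 'NULL' (and empty strings) with a real NULL.
UPDATE dbo.svc_case
SET description = NULL
WHERE description IN ('NULL', '');

-- After the cleanup the filter reduces to a plain NULL check.
SELECT cas.id AS caseid,
       cas.description
FROM dbo.svc_case cas
     JOIN dbo.account acc ON acc.id = cas.id
WHERE cas.description IS NOT NULL
  AND acc.recordid = '03443FGT';
```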
