Need help rewriting this query, which uses the same dataset multiple times according to the explain plan

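Our development team runs a query that consumes a lot of resources, and the explain plan suggests that it reads the same dataset more than once. Is there any way we can rewrite this query?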

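So far I have tried replacing the correlated subqueries with direct joins, but the two subqueries still look almost identical apart from one small difference.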

select tb2.mktg_id, mktg_cd , count(distinct tb2.conf_id) 
  from
(select conf_id, count(distinct c.mktg_id) as num_cpg 
   from acc_latst c, off_latst ot 
  where c.mktg_id = ot.mktg_id and c.bus_eff_dt > '2019-01-01' and to_date(strt_tms) = '2019-01-10'  
  group by conf_id 
 having count(distinct c.mktg_id) >1 
)tb1,
(select distinct conf_id, c.mktg_id, mktg_cd 
   from acc_latst c, off_latst ot 
  where c.mktg_id = ot.mktg_id and c.bus_eff_dt > '2019-01-01' and to_date(strt_tms) = '2019-01-10'
)tb2
  where tb1.conf_id = tb2.conf_id group by tb2.mktg_id, mktg_cd 
performance hive yarn query-tuning apache-tez
1 Answer

One approach is to use a CTE, so that the shared join and filter are written only once:

with res1 as 
(
select distinct conf_id, c.mktg_id, mktg_cd 
   from acc_latst c, off_latst ot 
  where c.mktg_id = ot.mktg_id and c.bus_eff_dt > '2019-01-01' and to_date(strt_tms) = '2019-01-10'
)
,res2 as
(
-- res1 has no alias c, so mktg_id is referenced unqualified here
select conf_id, count(distinct mktg_id) as num_cpg
from res1 group by conf_id having count(distinct mktg_id) > 1
)
-- res1 is aliased as t1 below, so columns must be referenced through t1
select t1.mktg_id, t1.mktg_cd, count(distinct t1.conf_id)
  from res1 t1 inner join res2 t2 on t1.conf_id = t2.conf_id
 group by t1.mktg_id, t1.mktg_cd;
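
Note that Hive may inline a CTE at each place it is referenced rather than compute it once, depending on the version and settings (in versions that support it, hive.optimize.cte.materialize.threshold controls this), so it is worth checking the explain plan again. If the plan still shows the join running twice, a possible alternative is to drop the self-join altogether with window functions. The following is only a sketch, assuming your Hive version supports windowing and that the columns resolve the same way as in the original query; the names min_mktg_id, max_mktg_id and the alias t are made up for illustration. A conf_id has more than one distinct mktg_id exactly when the smallest and largest mktg_id in its partition differ:

```sql
-- Sketch only: single pass over the join, no self-join.
-- min_mktg_id, max_mktg_id and alias t are illustrative names, not from the original query.
select mktg_id, mktg_cd, count(distinct conf_id)
  from (
        select conf_id, c.mktg_id, mktg_cd,
               min(c.mktg_id) over (partition by conf_id) as min_mktg_id,
               max(c.mktg_id) over (partition by conf_id) as max_mktg_id
          from acc_latst c, off_latst ot
         where c.mktg_id = ot.mktg_id and c.bus_eff_dt > '2019-01-01' and to_date(strt_tms) = '2019-01-10'
       ) t
 -- keep only conf_ids linked to at least two different mktg_ids;
 -- the null check mirrors the original inner join on conf_id, which drops nulls
 where min_mktg_id <> max_mktg_id and conf_id is not null
 group by mktg_id, mktg_cd;
```

The inner select distinct is not needed here because the outer count(distinct conf_id) already ignores duplicate rows. Please compare the results against the original query on a small date range before relying on it.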

If the query is still slow, could you provide the table and partition details?
