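I have a very simple recursive CTE that runs against a single source table (REP.INVENTMOVEMENTS) containing about 4 million records. The table is fairly heavily indexed.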
    with
    -- One aggregated row per source/target batch pair
    dataset as (
        select imv.sourceBatch,
               imv.targetBatch,
               imv.sourceDataArea,
               imv.targetDataArea,
               sum(Weight) as Weight
        from REP.INVENTMOVEMENTS imv
        where imv.sourceBatch <> ''
        group by imv.sourceBatch,
                 imv.targetBatch,
                 imv.sourceDataArea,
                 imv.targetDataArea
    ),
    result as (
        -- Anchor: every movement from one batch into a different batch
        select targetBatch as Batch,
               targetDataArea as DataArea,
               sourceBatch,
               targetBatch,
               sourceDataArea,
               targetDataArea,
               1 as level,
               Weight
        from dataset
        where sourceBatch <> targetBatch
        union all
        -- Recursive step: walk up to the batches that fed the current source batch
        select result.Batch,
               result.DataArea,
               dataset.sourceBatch,
               dataset.targetBatch,
               dataset.sourceDataArea,
               dataset.targetDataArea,
               result.level + 1 as level,
               dataset.Weight
        from dataset
        inner join result
            on dataset.targetBatch = result.sourceBatch
           and dataset.targetDataArea = result.sourceDataArea
           and dataset.targetBatch <> dataset.sourceBatch
    )
    select * from result
    union all
    -- Level 0: movements where source and target are the same batch
    select targetBatch as Batch,
           targetDataArea as DataArea,
           sourceBatch,
           targetBatch,
           sourceDataArea,
           targetDataArea,
           0 as level,
           Weight
    from dataset
    where sourceBatch = targetBatch
    ;
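Running the initial query without any selection takes the database 122 seconds and returns 517,947 records.
Running the same query restricted to a single batch takes less than a second and returns 5 records.
But if I run the CTE with a selection on one batch, the database needs 28 seconds to complete 2 recursions and return 7 records.
I need to fill a table with the results for 150k batches, so if every batch takes about half a minute (150,000 × 30 s ≈ 4.5 million seconds), the job would take roughly 52 days.
This is my execution plan.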
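Just to clarify my goal: batches can be merged into new batches, so two or more source batches can create one new batch. Two batches created by such merges can in turn be used to create yet another batch, and so on.
I want to be able to select a batch and find all the batches that were used to create it.
Please keep in mind that one batch can be used in multiple other batches.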
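To make the goal concrete, here is a minimal sketch of the kind of lookup described above, with the batch filter placed in the anchor member so that the recursion only walks up from one chosen batch. The variables @Batch and @DataArea, their values and data types, and the CTE name ancestors are assumptions for illustration and not part of the original query. Weights are left out because a recursive member cannot contain GROUP BY, so duplicate movement rows are collapsed with DISTINCT at the end.

```sql
-- Sketch only: @Batch, @DataArea and their types are hypothetical.
DECLARE @Batch    nvarchar(100) = N'B-001';  -- hypothetical batch number
DECLARE @DataArea nvarchar(10)  = N'DAT';    -- hypothetical data area

WITH ancestors AS (
    -- Anchor: the direct source batches of the chosen batch only.
    SELECT imv.targetBatch    AS Batch,
           imv.targetDataArea AS DataArea,
           imv.sourceBatch,
           imv.sourceDataArea,
           1 AS level
    FROM REP.INVENTMOVEMENTS imv
    WHERE imv.targetBatch = @Batch
      AND imv.targetDataArea = @DataArea
      AND imv.sourceBatch <> ''
      AND imv.sourceBatch <> imv.targetBatch
    UNION ALL
    -- Recursive step: the batches that were merged into each source batch.
    SELECT a.Batch,
           a.DataArea,
           imv.sourceBatch,
           imv.sourceDataArea,
           a.level + 1
    FROM REP.INVENTMOVEMENTS imv
    INNER JOIN ancestors a
        ON imv.targetBatch = a.sourceBatch
       AND imv.targetDataArea = a.sourceDataArea
    WHERE imv.sourceBatch <> ''
      AND imv.sourceBatch <> imv.targetBatch
)
SELECT DISTINCT Batch, DataArea, sourceBatch, sourceDataArea, level
FROM ancestors
-- 100 is the default limit; a cycle in the data raises an error instead of looping forever.
OPTION (MAXRECURSION 100);
```

If a source/target pair has many movement rows, the duplicates multiply at every level before DISTINCT removes them, so this form is meant to show the traversal rather than to be the fastest variant.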
I hope you can help me with this.
I solved this by creating an internal table (a table variable) and filling it with the dataset the recursive query needs.
    -- Table variable holding the aggregated movements
    DECLARE @BatchSequence as table (
        Batch       nvarchar(100),
        SourceBatch nvarchar(100),
        TargetBatch nvarchar(100),
        Weight      decimal(18,3)
    );

    insert into @BatchSequence
    select ReportingBatch, SourceBatch, TargetBatch, SUM(Weight) as Weight
    from REP.INVENTMOVEMENTS
    where sourceBatch <> ''
    group by ReportingBatch, SourceBatch, TargetBatch;

    with result as (
        -- Anchor: every movement from one batch into a different batch
        select targetBatch as Batch,
               sourceBatch,
               targetBatch,
               1 as level,
               Weight
        from @BatchSequence dataset
        where sourceBatch <> targetBatch
        union all
        -- Recursive step: walk up to the batches that fed the current source batch
        select result.Batch,
               dataset.sourceBatch,
               dataset.targetBatch,
               result.level + 1 as level,
               dataset.Weight
        from @BatchSequence dataset
        inner join result
            on dataset.targetBatch = result.sourceBatch
           and dataset.targetBatch <> dataset.sourceBatch
    )
    select * from result;
This returns 250,000 records in under a minute.
Hope this helps someone.
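A possible variation on the approach above, offered as an untested sketch rather than as part of the original solution: the same staging step with a temporary table instead of a table variable, so that the column used by the recursive join can be indexed and the optimizer has statistics to work with. The names #BatchSequence and IX_BatchSequence_Target are hypothetical, and whether this is faster than the table variable depends on the data and the SQL Server version.

```sql
-- Sketch only: #BatchSequence and IX_BatchSequence_Target are hypothetical names.
CREATE TABLE #BatchSequence (
    Batch       nvarchar(100),
    SourceBatch nvarchar(100),
    TargetBatch nvarchar(100),
    Weight      decimal(18,3)
);

-- Same aggregation as in the table-variable version.
INSERT INTO #BatchSequence (Batch, SourceBatch, TargetBatch, Weight)
SELECT ReportingBatch, SourceBatch, TargetBatch, SUM(Weight)
FROM REP.INVENTMOVEMENTS
WHERE SourceBatch <> ''
GROUP BY ReportingBatch, SourceBatch, TargetBatch;

-- Index the column the recursive join looks up on every iteration.
CREATE CLUSTERED INDEX IX_BatchSequence_Target
    ON #BatchSequence (TargetBatch, SourceBatch);

WITH result AS (
    -- Anchor: every movement from one batch into a different batch.
    SELECT TargetBatch AS Batch, SourceBatch, TargetBatch, 1 AS level, Weight
    FROM #BatchSequence
    WHERE SourceBatch <> TargetBatch
    UNION ALL
    -- Recursive step: batches that fed the current source batch.
    SELECT r.Batch, d.SourceBatch, d.TargetBatch, r.level + 1, d.Weight
    FROM #BatchSequence d
    INNER JOIN result r
        ON d.TargetBatch = r.SourceBatch
    WHERE d.TargetBatch <> d.SourceBatch
)
SELECT Batch, SourceBatch, TargetBatch, level, Weight
FROM result;

DROP TABLE #BatchSequence;
```

To fill a results table instead of returning the rows, the final SELECT could be turned into an INSERT INTO ... SELECT against a table of your own.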