CTE very slow when joined


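I have posted something similar before, but I am now approaching the problem from a different direction, so I have opened a new question. I hope that is OK.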

I have been working with a CTE that sums the charges under a parent charge. The SQL and details can be seen here:

CTE index recommendations on a multi-keyed table

I don't think I am missing anything in the CTE itself, but I run into problems when I use it with a large table (3.5 million rows).

tblChargeShare contains some other information I need, such as InvoiceID, so I placed my CTE in a view, vwChargeShareSubCharges, and joined the view to the table.

The query:

Select t.* from vwChargeShareSubCharges t
inner join 
tblChargeShare  s 
on t.CustomerID = s.CustomerID 
and t.MasterChargeID = s.ChargeID 
Where  s.ChargeID = 1291094

returns results in a few milliseconds.

The query:

Select ChargeID from tblChargeShare Where InvoiceID = 1045854

returns 1 row:

1291094

But the query:

Select t.* from vwChargeShareSubCharges t
inner join 
tblChargeShare  s 
on t.CustomerID = s.CustomerID 
and t.MasterChargeID = s.ChargeID 
Where  InvoiceID = 1045854

takes 2-3 minutes to run.

I saved the execution plans and loaded them into SQL Sentry. The tree for the fast query looks like this:

Fast Query

The plan for the slow query is:

Slow Query

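I have tried reindexing, running the query through the tuning advisor, and various combinations of subqueries. Whenever the join involves anything other than the PK, the query is slow.

I had a similar question here:

SQL Server query times out depending on WHERE clause

That one used a function rather than a CTE to sum the child rows. This version is a rewrite using a CTE, intended to avoid the very problem I am now running into again. I have read the responses to that question but I am none the wiser. I read some material about hints and parameters, but I could not make it work. I had thought that rewriting with a CTE would solve my problem. The query is fast when run against a tblCharge with a few thousand rows.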

Tested on SQL 2008 R2 and SQL 2012.

Edit:

I have condensed the query into a single statement, but the same problem persists:

WITH RCTE AS
(
SELECT  ParentChargeId, s.ChargeID, 1 AS Lvl, ISNULL(TotalAmount, 0) as TotalAmount,  ISNULL(s.TaxAmount, 0) as TaxAmount,  
ISNULL(s.DiscountAmount, 0) as DiscountAmount, s.CustomerID, c.ChargeID as MasterChargeID
from tblCharge c inner join tblChargeShare s
on c.ChargeID = s.ChargeID Where s.ChargeShareStatusID < 3 and ParentChargeID is NULL

UNION ALL

SELECT c.ParentChargeID, c.ChargeID, Lvl+1 AS Lvl, ISNULL(s.TotalAmount, 0),  ISNULL(s.TaxAmount, 0),  ISNULL(s.DiscountAmount, 0) , s.CustomerID 
, rc.MasterChargeID 
from tblCharge c inner join tblChargeShare s
on c.ChargeID = s.ChargeID
INNER JOIN RCTE rc ON c.PArentChargeID = rc.ChargeID and s.CustomerID = rc.CustomerID  Where s.ChargeShareStatusID < 3 
)

Select MasterChargeID as ChargeID, rcte.CustomerID, Sum(rcte.TotalAmount) as TotalCharged, Sum(rcte.TaxAmount) as TotalTax, Sum(rcte.DiscountAmount) as TotalDiscount
from RCTE inner join tblChargeShare s on rcte.ChargeID = s.ChargeID and RCTE.CustomerID = s.CustomerID 
Where InvoiceID = 1045854
Group by MasterChargeID, rcte.CustomerID 
GO

Edit: After playing around some more, I just don't understand this.

This query is instant (2 ms):

Select t.* from
vwChargeShareSubCharges t
Where  t.MasterChargeID = 1291094

While this one takes 3 minutes:

DECLARE @ChargeID int = 1291094

Select t.* from
vwChargeShareSubCharges t
Where  t.MasterChargeID = @ChargeID

Even if I put a long list of numbers into an "IN", the query is still instant:

Where  t.MasterChargeID in (1291090, 1291091, 1291092, 1291093,  1291094, 1291095, 1291096, 1291097, 1291098, 1291099, 129109)

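Edit 2:

I can reproduce this from scratch with the sample data below.

I created some dummy data to replicate the problem. The difference is not as dramatic, because I only added 100,000 rows, but the bad execution plan still occurs (run in SQLCMD mode):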

CREATE TABLE [tblChargeTest](
[ChargeID] [int] IDENTITY(1,1) NOT NULL,
[ParentChargeID] [int] NULL,
[TotalAmount] [money] NULL,
[TaxAmount] [money] NULL,
[DiscountAmount] [money] NULL,
[InvoiceID] [int] NULL,
CONSTRAINT [PK_tblChargeTest] PRIMARY KEY CLUSTERED 
(
[ChargeID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,     ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

Insert into tblChargeTest
(discountAmount, TotalAmount, TaxAmount)
Select ABS(CHECKSUM(NewId())) % 10, ABS(CHECKSUM(NewId())) % 100, ABS(CHECKSUM(NewId())) % 10
GO 100000

Update tblChargeTest
Set ParentChargeID = (ABS(CHECKSUM(NewId())) % 60000) + 20000
Where ChargeID = (ABS(CHECKSUM(NewId())) % 20000)
GO 5000

CREATE VIEW [vwChargeShareSubCharges] AS
WITH RCTE AS
(
SELECT  ParentChargeId, ChargeID, 1 AS Lvl, ISNULL(TotalAmount, 0) as TotalAmount,   ISNULL(TaxAmount, 0) as TaxAmount,  
ISNULL(DiscountAmount, 0) as DiscountAmount,  ChargeID as MasterChargeID
FROM tblChargeTest Where ParentChargeID is NULL

UNION ALL

SELECT rh.ParentChargeID, rh.ChargeID, Lvl+1 AS Lvl, ISNULL(rh.TotalAmount, 0),    ISNULL(rh.TaxAmount, 0),  ISNULL(rh.DiscountAmount, 0) 
, rc.MasterChargeID 
FROM tblChargeTest rh
INNER JOIN RCTE rc ON rh.PArentChargeID = rc.ChargeID --and rh.CustomerID =  rc.CustomerID 
)

Select MasterChargeID,  ParentChargeID, ChargeID, TotalAmount, TaxAmount, DiscountAmount , Lvl
FROM  RCTE r 
GO

Then run these two queries:

--Slow Query:
Declare @ChargeID int = 60900

Select *
from [vwChargeShareSubCharges]
Where MasterChargeID = @ChargeID

--Fast Query:
Select *
from [vwChargeShareSubCharges]
Where MasterChargeID = 60900

Tags: sql sql-server-2008-r2 common-table-expression

3 Answers

16 votes

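The best SQL Server can do for you here is to push the filter on ChargeID down into the anchor part of the recursive CTE inside the view. That allows a seek to find the single row needed to build the hierarchy from. When you provide the value as a constant, SQL Server can make that optimization (using a rule called SelOnIterator, for those who are interested in that sort of thing):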

Pushed predicate with a constant value

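When you use a local variable, it cannot do this, so the predicate on ChargeID gets stuck outside the view (which then builds the full hierarchy, starting from every row with a NULL parent id):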

Stuck Predicate

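One way to get the optimal plan when using a variable is to force the optimizer to compile a fresh plan on every execution. The resulting plan is then tailored to the specific value held by the variable at execution time. This is done by adding the OPTION (RECOMPILE) query hint: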

Declare @ChargeID int = 60900;

-- Produces a fast execution plan, at the cost of a compile on every execution
Select *
from [vwChargeShareSubCharges]
Where MasterChargeID = @ChargeID
OPTION (RECOMPILE);
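
To see the difference for yourself, one option is to run both forms back to back with I/O and timing statistics switched on. This is just a sketch against the test view from the question; the actual numbers will depend on your data and hardware, so none are quoted here.

```sql
-- Sketch: compare the variable form with and without OPTION (RECOMPILE).
-- Uses the vwChargeShareSubCharges test view from the question.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

DECLARE @ChargeID int = 60900;

-- Predicate stays outside the view: the whole hierarchy is built first
SELECT *
FROM vwChargeShareSubCharges
WHERE MasterChargeID = @ChargeID;

-- Plan compiled for the current value: predicate can reach the anchor
SELECT *
FROM vwChargeShareSubCharges
WHERE MasterChargeID = @ChargeID
OPTION (RECOMPILE);

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The Messages tab should then show logical reads and elapsed time for each statement, which makes it easy to confirm whether the hint is having the intended effect.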

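The second option is to change the view into an inline table-valued function. This allows you to specify explicitly where the filtering predicate goes: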

CREATE FUNCTION [dbo].[udfChargeShareSubCharges]
(
    @ChargeID int
)
RETURNS TABLE AS RETURN
(
  WITH RCTE AS
  (
  SELECT  ParentChargeID, ChargeID, 1 AS Lvl, ISNULL(TotalAmount, 0) as TotalAmount,   ISNULL(TaxAmount, 0) as TaxAmount,  
  ISNULL(DiscountAmount, 0) as DiscountAmount,  ChargeID as MasterChargeID
  FROM tblChargeTest 
  Where ParentChargeID is NULL 
  AND ChargeID = @ChargeID -- Filter placed here explicitly

  UNION ALL

  SELECT rh.ParentChargeID, rh.ChargeID, Lvl+1 AS Lvl, ISNULL(rh.TotalAmount, 0),    ISNULL(rh.TaxAmount, 0),  ISNULL(rh.DiscountAmount, 0) 
  , rc.MasterChargeID 
  FROM tblChargeTest rh
  INNER JOIN RCTE rc ON rh.ParentChargeID = rc.ChargeID --and rh.CustomerID =  rc.CustomerID 
  )

  Select MasterChargeID,  ParentChargeID, ChargeID, TotalAmount, TaxAmount, DiscountAmount , Lvl
  FROM  RCTE r 
)

Use it like this:

Declare @ChargeID int = 60900

select *
from dbo.udfChargeShareSubCharges(@ChargeID)
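
For the original problem, where the starting charge comes from another table rather than from a variable, the same function can probably be driven with CROSS APPLY. The following is only a sketch against the test schema: it assumes InvoiceID has been populated in tblChargeTest (the sample script leaves it NULL), and it reuses the invoice number from the question purely for illustration.

```sql
-- Sketch only: assumes tblChargeTest.InvoiceID holds real values,
-- which the sample data script in the question does not set.
DECLARE @InvoiceID int = 1045854;

SELECT
    sc.MasterChargeID,
    SUM(sc.TotalAmount)    AS TotalCharged,
    SUM(sc.TaxAmount)      AS TotalTax,
    SUM(sc.DiscountAmount) AS TotalDiscount
FROM tblChargeTest AS c
-- The function is evaluated once per qualifying outer row,
-- with c.ChargeID supplied to the anchor of the recursive CTE
CROSS APPLY dbo.udfChargeShareSubCharges(c.ChargeID) AS sc
WHERE c.InvoiceID = @InvoiceID
GROUP BY sc.MasterChargeID;
```

Because the function's anchor also requires ParentChargeID to be NULL, charges on the invoice that are themselves children produce no rows here. Check the resulting plan on real data before relying on it.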

The query can also benefit from an index on ParentChargeID:

create index ix_ParentChargeID on tblChargeTest(ParentChargeID)
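
If the recursive part still shows key lookups into the clustered index for the amount columns, a covering variant may help. This is a sketch under that assumption; the index name is made up for illustration, and whether the extra storage is worthwhile depends on your workload.

```sql
-- Sketch: covering variant of the index above (hypothetical name).
-- ParentChargeID is the seek key used by the recursive join;
-- the INCLUDE columns are the ones the view reads for each child row.
-- ChargeID is the clustered key, so it is carried in the index already.
CREATE INDEX ix_ParentChargeID_covering
ON tblChargeTest (ParentChargeID)
INCLUDE (TotalAmount, TaxAmount, DiscountAmount);
```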

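Here is another answer about a similar optimization rule in a similar scenario: Optimizing Execution Plans for Parameterized T-SQL Queries Containing Window Functions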


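13 votes

As another route to a solution, I would recommend a SELECT INTO that puts the CTE's output into a temp table, and then joining from there. In my own experience of joining to a CTE, my query took 5 minutes to return, while simply inserting the data generated by the CTE into a temp table brought it down to just 4 seconds. I was actually joining two CTEs together, but I suspect this applies to any long-running query where a CTE is joined to a large table (especially with outer joins).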


    --temp tables if needed to work with intermediate values
    If object_id('tempdb..#p') is not null
    drop table #p

    ;WITH cte as ( 
    select * from t1
    )

    select * 
    into #p
    from cte

    --then use the temp table as you would normally use the CTE
    select * from #p
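
Applied to the test view from the question, the same idea might look like the sketch below. The temp table and index names are hypothetical, and the approach pays the cost of building the whole hierarchy once, so it mainly makes sense when the result is reused for several lookups or joins.

```sql
-- Sketch: materialize the view from the question once, then filter.
-- #SubCharges and ix_SubCharges_Master are made-up names.
DECLARE @ChargeID int = 60900;

IF OBJECT_ID('tempdb..#SubCharges') IS NOT NULL
    DROP TABLE #SubCharges;

SELECT MasterChargeID, ParentChargeID, ChargeID,
       TotalAmount, TaxAmount, DiscountAmount, Lvl
INTO #SubCharges
FROM vwChargeShareSubCharges;

-- Index the column that later queries filter or join on
CREATE CLUSTERED INDEX ix_SubCharges_Master
ON #SubCharges (MasterChargeID);

SELECT *
FROM #SubCharges
WHERE MasterChargeID = @ChargeID;
```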

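0 votes

Using a CTE can sometimes lead the query compiler to a bad plan: taking the same query and moving it from a CTE to a subquery took me from "not a single row after 5 minutes" to "100 rows in under a second".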
