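I am merging several million rows (1-10 million) into a table that holds more than 20 million rows.
The target table has 16 columns, one of which is the PK column (NormalID).
The problem I found is that the PK ID of some records changes. For example, record "A" with PK id "12345678" will have its id changed to "4567890". That is why I added delete handling to the merge. I have limited the data source to 2 months.
Table script below: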
CREATE TABLE [reporting].[slowmergetbl](
[sEntity] [varchar](50) NOT NULL,
[wYear] [smallint] NOT NULL,
[wPeriod] [smallint] NOT NULL,
[sAccount] [varchar](50) NOT NULL,
[wBracket] [smallint] NOT NULL,
[sCurrency] [varchar](3) NULL,
[dValue] [float] NULL,
[bDirty] [bit] NULL DEFAULT 1,
[dFactValue] [float] NULL,
[wEntityId] [int] NULL,
[wAccountId] [int] NULL,
[wTimeId] [int] NULL,
[wExtDimId1] [int] NOT NULL DEFAULT 0,
[NormalID] [int] NOT NULL,
[ModifiedDate] [datetime] NULL,
[ModifyType] [varchar](10) NULL,
CONSTRAINT [PK_NormalID] PRIMARY KEY CLUSTERED ([NormalID] ASC),
INDEX [IDX_Year] NONCLUSTERED ([wYear] ASC),
INDEX [IDX_Period] NONCLUSTERED ([wPeriod] ASC),
INDEX [IX_ReportQuery] NONCLUSTERED
(
[wYear],
[dFactValue]
)
INCLUDE ([sEntity],[wPeriod],[sAccount],[wBracket],[wAccountId]),
INDEX [IX_ReportQuery2] NONCLUSTERED
(
[sAccount],
[wBracket])
INCLUDE ([sEntity],[wYear],[wPeriod],[dValue],[wEntityId]),
INDEX [IDX_Normal_ModifyType] NONCLUSTERED ([ModifyType] ASC)
);
Merge statement:
merge reporting.slowmergetbl as a
using dbo.slowmergetbl as b on a.normalid = b.normalid
WHEN NOT MATCHED BY SOURCE and a.wyear in (
2024
) and a.wPeriod in (9,10 )
THEN UPDATE SET
a.modifieddate=getdate(),
a.modifytype='DELETED'
when not matched by target
then insert (
[sEntity],
[wYear],
[wPeriod],
[sAccount],
[wBracket],
[sCurrency],
[dValue],
[bDirty],
[dFactValue],
[wEntityId],
[wAccountId],
[wTimeId],
[wExtDimId1],
[NormalID],
[ModifiedDate],
[ModifyType]
)
values (
b.[sEntity],
b.[wYear],
b.[wPeriod],
b.[sAccount],
b.[wBracket],
b.[sCurrency],
b.[dValue],
b.[bDirty],
b.[dFactValue],
b.[wEntityId],
b.[wAccountId],
b.[wTimeId],
b.[wExtDimId1],
b.[NormalID],
getdate(),'INSERT')
when matched
and (
isnull(a.[sAccount],'')<>isnull(b.[sAccount],'')
or isnull(a.[sEntity],'')<>isnull(b.[sEntity],'')
or isnull(a.[wYear],0)<>isnull(b.[wYear],0)
or isnull(a.[wPeriod],0)<>isnull(b.[wPeriod],0)
or isnull(a.[wBracket],0)<>isnull(b.[wBracket],0)
or isnull(a.[wEntityId],0)<>isnull(b.[wEntityId],0)
or isnull(a.[wAccountId],0)<>isnull(b.[wAccountId],0)
or isnull(a.[sCurrency],'')<>isnull(b.[sCurrency],'')
or isnull(cast(a.[dValue] as float),0.00)<>isnull(cast(b.[dValue] as float),0.00)
or isnull(a.[bDirty],0)<>isnull(b.[bDirty],0)
or isnull(cast(a.[dFactValue] as float),0.00)<>isnull(cast(b.[dFactValue] as float),0.00)
or isnull(a.[wTimeId],0)<>isnull(b.[wTimeId],0)
or isnull(a.[wExtDimId1],0)<>isnull(b.[wExtDimId1],0)
)
then update set
a.[sAccount]=b.[sAccount],
a.[sEntity]=b.[sEntity],
a.[wYear]=b.[wYear],
a.[wPeriod]=b.[wPeriod],
a.[wBracket]=b.[wBracket],
a.[wEntityId]=b.[wEntityId],
a.[wAccountId]=b.[wAccountId],
a.[sCurrency]=b.[sCurrency],
a.[dValue]=b.[dValue],
a.[bDirty]=b.[bDirty],
a.[dFactValue]=b.[dFactValue],
a.[wTimeId]=b.[wTimeId],
a.[wExtDimId1]=b.[wExtDimId1],
a.modifieddate=getdate(),
a.modifytype='UPDATE';
It seems to run very slowly, sometimes taking several hours.
Any suggestions on how to improve this and speed up the merge?
Thanks.
Some suggestions:
Check that you have an index on reporting.slowmergetbl.NormalID.
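Sometimes it helps to split the MERGE into 3 separate statements (INSERT/UPDATE/UPDATE). At the very least you will find out which part causes most of the delay, and you can analyze its execution plan in detail.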
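Check whether the logic that decides if an update is needed (the "when matched" part) is too complex. Logic like this sometimes produces nested loops in the execution plan and very poor performance. Consider adding a "hash" column to both tables, computed from all the fields you compare (sAccount, sEntity, ...). It should look like
HASHBYTES('SHA2_256', CONCAT(sAccount,'|',sEntity,'|',...))
and then compare only this column to decide whether a row needs to be updated.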
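As a rough illustration of the last two suggestions, the sketch below splits the MERGE into three statements and compares a stored hash instead of the column-by-column checks. The column name RowHash is hypothetical, dbo.slowmergetbl is assumed to have the same data columns as the target, and style 3 of CONVERT (lossless float-to-string) assumes SQL Server 2016 or later. Treat it as a starting point to test against your own data, not a drop-in replacement:

```sql
-- Sketch only. RowHash is a hypothetical column name; SHA2_256 yields 32 bytes.
ALTER TABLE reporting.slowmergetbl ADD RowHash binary(32) NULL;
ALTER TABLE dbo.slowmergetbl       ADD RowHash binary(32) NULL;
GO  -- batch separator (SSMS/sqlcmd) so the new column is visible below

-- Hash the source rows once per load. CONCAT renders NULL as ''.
-- CONVERT(..., 3) keeps all 17 significant digits of a float.
UPDATE dbo.slowmergetbl
SET RowHash = HASHBYTES('SHA2_256', CONCAT(
        sAccount, '|', sEntity, '|', wYear, '|', wPeriod, '|', wBracket, '|',
        wEntityId, '|', wAccountId, '|', sCurrency, '|',
        CONVERT(varchar(30), dValue, 3), '|', bDirty, '|',
        CONVERT(varchar(30), dFactValue, 3), '|', wTimeId, '|', wExtDimId1));

-- 1) Changed rows: one hash comparison replaces the 13 ISNULL(...) <> ISNULL(...) tests.
--    Target rows that have no hash yet (NULL) are also refreshed.
UPDATE a
SET a.sAccount = b.sAccount, a.sEntity = b.sEntity, a.wYear = b.wYear,
    a.wPeriod = b.wPeriod, a.wBracket = b.wBracket, a.wEntityId = b.wEntityId,
    a.wAccountId = b.wAccountId, a.sCurrency = b.sCurrency, a.dValue = b.dValue,
    a.bDirty = b.bDirty, a.dFactValue = b.dFactValue, a.wTimeId = b.wTimeId,
    a.wExtDimId1 = b.wExtDimId1, a.RowHash = b.RowHash,
    a.ModifiedDate = GETDATE(), a.ModifyType = 'UPDATE'
FROM reporting.slowmergetbl AS a
JOIN dbo.slowmergetbl AS b ON a.NormalID = b.NormalID
WHERE a.RowHash IS NULL OR a.RowHash <> b.RowHash;

-- 2) New rows: present in the source, missing from the target.
INSERT INTO reporting.slowmergetbl
    (sEntity, wYear, wPeriod, sAccount, wBracket, sCurrency, dValue, bDirty,
     dFactValue, wEntityId, wAccountId, wTimeId, wExtDimId1, NormalID,
     ModifiedDate, ModifyType, RowHash)
SELECT b.sEntity, b.wYear, b.wPeriod, b.sAccount, b.wBracket, b.sCurrency,
       b.dValue, b.bDirty, b.dFactValue, b.wEntityId, b.wAccountId, b.wTimeId,
       b.wExtDimId1, b.NormalID, GETDATE(), 'INSERT', b.RowHash
FROM dbo.slowmergetbl AS b
WHERE NOT EXISTS (SELECT 1 FROM reporting.slowmergetbl AS a
                  WHERE a.NormalID = b.NormalID);

-- 3) Soft delete: target rows in the two loaded periods that left the source.
UPDATE a
SET a.ModifiedDate = GETDATE(), a.ModifyType = 'DELETED'
FROM reporting.slowmergetbl AS a
WHERE a.wYear IN (2024) AND a.wPeriod IN (9, 10)
  AND NOT EXISTS (SELECT 1 FROM dbo.slowmergetbl AS b
                  WHERE b.NormalID = a.NormalID);
```

A few caveats on this sketch. Unlike the original ISNULL checks, the hash treats NULL and 0 as different values for the numeric columns, because CONCAT renders NULL as an empty string. The three statements are no longer atomic the way a single MERGE is, so wrap them in a transaction if that matters. Backfilling RowHash on the target once with the same expression avoids marking every matched row as 'UPDATE' on the first run. Running each statement separately also shows which of the three is the slow one.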