Optimizing T-SQL report performance

Question

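I have the table below, and I need to remove opposite rows between two dates based on the PerCode value. More precisely, within the date range we remove rows that share the same PerCode and have equal and opposite values.

The problem is that the user supplies the start date and end date as report parameters, and if I try to remove these rows at run time the query takes too long.

Example:

Start date = 01/01/2018, End date = 31/12/2018

I should remove rows 3 and 4. Do you have any idea how to do this while keeping performance acceptable? (The table has 2 million rows.)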

+----+------------+---------+---------+-----------+
| Id |    Date    | PerCode |  Value  | IsDeleted |
+----+------------+---------+---------+-----------+
|  1 | 01/10/2017 | C1      |    10   |           |
|  2 | 01/01/2018 | C1      |   -10   |           |
|  3 | 15/02/2018 | C2      |    20   |    1      |
|  4 | 10/03/2018 | C2      |   -20   |    1      |
|  5 | 01/12/2018 | C3      |    15   |           |
|  6 | 01/02/2019 | C3      |   -15   |           |
+----+------------+---------+---------+-----------+
tsql sql-server-2012 query-performance
2 Answers

0 votes

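I had a quick look at this; using a table variable let me piece a query together with your test data. However, it may not perform well against more than 2 million rows.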

DECLARE @table TABLE (id INT, [date] DATE, percode CHAR(2), [value] INT, isdeleted BIT);
INSERT INTO @table
SELECT 1, '20171001', 'C1', 10, NULL
UNION ALL
SELECT 2, '20180101', 'C1', -10, NULL
UNION ALL
SELECT 3, '20180215', 'C2', 20, NULL
UNION ALL
SELECT 4, '20180310', 'C2', -20, NULL
UNION ALL
SELECT 5, '20181201', 'C3', 15, NULL
UNION ALL
SELECT 6, '20190201', 'C3', -15, NULL;

DECLARE @date_from DATE = '20180101';
DECLARE @date_to DATE = '20181231';

WITH ordered AS (
    SELECT
        id, 
        percode, 
        [value],
        -- Order by id so the running count is deterministic within each (percode, value) group
        ROW_NUMBER() OVER (PARTITION BY percode, [value] ORDER BY id) AS order_id
    FROM
        @table
    WHERE
        [date] BETWEEN @date_from AND @date_to
        AND ISNULL(isdeleted, 0) != 1),
matches AS (
    SELECT 
        m1.id AS match_1_id,
        m2.id AS match_2_id 
    FROM 
        ordered m1
        INNER JOIN ordered m2 ON m1.percode = m2.percode AND m1.[value] = m2.[value] * -1 AND m1.order_id = m2.order_id)
UPDATE
    t
SET
    isdeleted = 1
FROM
    @table t
    INNER JOIN matches m ON m.match_1_id = t.id OR m.match_2_id = t.id;
SELECT * FROM @table;

Result:

id  date        percode value   isdeleted
1   2017-10-01  C1      10      NULL
2   2018-01-01  C1      -10     NULL
3   2018-02-15  C2      20      1
4   2018-03-10  C2      -20     1
5   2018-12-01  C3      15      NULL
6   2019-02-01  C3      -15     NULL

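How does it work? I broke the task into steps:

  • List all rows within the specified date range that have not already been deleted;
  • Assign each row a running count, grouped by code and value, so the first C1 10 is #1, the second C1 10 is #2, and so on;
  • To find a match, look for a row with the same code, an equal and opposite value, and the same running count;
  • If a match is found, set the isdeleted flag to 1.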
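
To illustrate why the running count matters, here is a minimal sketch on made-up data (the @demo table and its rows are hypothetical, not part of the question). C1 has three +10 rows but only two -10 rows, so only two pairs should be found:

```sql
-- Hypothetical sample data: three +10 rows and two -10 rows for C1.
DECLARE @demo TABLE (id INT, percode CHAR(2), [value] INT);
INSERT INTO @demo (id, percode, [value])
VALUES (1, 'C1', 10), (2, 'C1', 10), (3, 'C1', -10), (4, 'C1', -10), (5, 'C1', 10);

WITH ordered AS (
    SELECT
        id,
        percode,
        [value],
        -- Running count per (percode, value): +10 rows get 1, 2, 3 and -10 rows get 1, 2
        ROW_NUMBER() OVER (PARTITION BY percode, [value] ORDER BY id) AS order_id
    FROM
        @demo)
SELECT
    m1.id AS positive_id,
    m2.id AS negative_id,
    m1.order_id
FROM
    ordered m1
    INNER JOIN ordered m2 ON m1.percode = m2.percode
        AND m1.[value] = -m2.[value]
        AND m1.order_id = m2.order_id
WHERE
    m1.[value] > 0  -- list each pair once, from the positive side
ORDER BY
    m1.order_id;
```

This returns the pairs (1, 3) and (2, 4). Row 5 is the third +10, and since there is no third -10 it stays unpaired. Without the running count, every +10 would join to every -10 and all five rows would be flagged.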

0 votes

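This is my code, but it does not perform well in real life with more than 2 million rows. In the real table, PerCode is a concatenation of 5 columns (date, varchar(13), varchar(2), varchar(1) and varchar(50)), and Value is 4 numeric columns.

I am looking for other ideas.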

--DECLARE @table TABLE (id INT, [date] DATE, percode CHAR(2), [value] INT, isdeleted BIT);
Select * INTO #MasterTable FROM
(
SELECT 1 id, '20171001' [date], 'C1' percode, 10 [value], NULL isdeleted
UNION ALL
SELECT 2, '20180101', 'C1', -10, NULL
UNION ALL
SELECT 3, '20180215', 'C2', 20, NULL
UNION ALL
SELECT 4, '20180310', 'C2', -20, NULL
UNION ALL
SELECT 5, '20181201', 'C3', 15, NULL
UNION ALL
SELECT 6, '20190201', 'C3', -15, NULL
) T ;

DECLARE @date_from DATE = '20180101';
DECLARE @date_to DATE = '20181231';

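-- Collect the ids of rows that belong to a cancelling group.
-- Both numbering passes use the same date filter and ORDER BY Id,
-- so Rn1 and Rn2 refer to the same rows.
SELECT F.Id
INTO #TmpTable
FROM (
    SELECT Id, PerCode, Value,
           ROW_NUMBER() OVER (PARTITION BY PerCode, Value ORDER BY Id) AS Rn2
    FROM #MasterTable
    WHERE [date] BETWEEN @date_from AND @date_to
) F
INNER JOIN (
    -- Groups (PerCode, Rn1) whose in-range values sum to zero
    SELECT PerCode, Rn1
    FROM (
        SELECT PerCode, Value,
               ROW_NUMBER() OVER (PARTITION BY PerCode, Value ORDER BY Id) AS Rn1
        FROM #MasterTable
        WHERE [date] BETWEEN @date_from AND @date_to
    ) A
    GROUP BY PerCode, Rn1
    HAVING SUM(Value) = 0 AND COUNT(*) > 1
) B ON F.PerCode = B.PerCode
   AND F.Rn2 = B.Rn1;
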


update  R
set IsDeleted = 1
from #MasterTable R
inner join #TmpTable P
on R.id = P.id

select * from #MasterTable

drop table #MasterTable ;
drop table #TmpTable;
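
One more direction that might be worth testing is to number the in-range rows only once, keep that result in an indexed temp table, and look up the opposite-sign partner with EXISTS. The following is a sketch under assumptions, not something measured on the real table: #Ledger, #ordered, ix_ordered, abs_value and value_sign are hypothetical names, and a single percode column and a single value column stand in for the five key columns and four numeric columns described above.

```sql
-- Hypothetical stand-in for the real table; names and types are assumptions.
CREATE TABLE #Ledger (
    id INT PRIMARY KEY,
    [date] DATE NOT NULL,
    percode VARCHAR(70) NOT NULL,
    [value] INT NOT NULL,
    isdeleted BIT NULL);

INSERT INTO #Ledger (id, [date], percode, [value])
VALUES (1, '20171001', 'C1', 10), (2, '20180101', 'C1', -10),
       (3, '20180215', 'C2', 20), (4, '20180310', 'C2', -20),
       (5, '20181201', 'C3', 15), (6, '20190201', 'C3', -15);

DECLARE @date_from DATE = '20180101';
DECLARE @date_to DATE = '20181231';

-- Step 1: number the in-range, not-yet-deleted rows once and keep the result.
SELECT
    id,
    percode,
    ABS([value]) AS abs_value,
    SIGN([value]) AS value_sign,
    ROW_NUMBER() OVER (PARTITION BY percode, [value] ORDER BY id) AS rn
INTO #ordered
FROM #Ledger
WHERE [date] BETWEEN @date_from AND @date_to
  AND ISNULL(isdeleted, 0) = 0
  AND [value] <> 0;  -- a zero value has no opposite partner

-- Step 2: index the pairing key so the partner lookup can seek.
CREATE CLUSTERED INDEX ix_ordered ON #ordered (percode, abs_value, rn, value_sign);

-- Step 3: flag a row when a partner exists with the same code, the same
-- absolute value, the same running count and the opposite sign.
UPDATE t
SET isdeleted = 1
FROM #Ledger t
INNER JOIN #ordered o ON o.id = t.id
WHERE EXISTS (
    SELECT 1
    FROM #ordered p
    WHERE p.percode = o.percode
      AND p.abs_value = o.abs_value
      AND p.rn = o.rn
      AND p.value_sign = -o.value_sign);

SELECT id, isdeleted FROM #Ledger ORDER BY id;

DROP TABLE #ordered;
DROP TABLE #Ledger;
```

On the sample data this flags rows 3 and 4 only. On the real table, step 1 would probably partition by the five underlying key columns instead of a concatenated string, and an index leading on the date column would likely help the range filter; both points are assumptions that would need to be checked against the actual execution plan.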