How do I compare changes in text over time?

I'm not sure how to capture changes that happen over time. I have the following:

db<>fiddle

CREATE TABLE term_changes (load_id int, merchant_id int, load_date date, terms varchar(50));

INSERT INTO term_changes (load_id, merchant_id, load_date, terms) VALUES 
  (1, 1, '2023-01-05', 'Roses are red'),
  (2, 2, '2023-01-05', 'Roses are blue'),
  (3, 1, '2023-01-06', 'Roses are red'),
  (4, 2, '2023-01-06', 'Roses are blue'),
  (5, 1, '2023-01-07', 'Roses are violet'),
  (6, 2, '2023-01-07', 'Roses are blue'),
  (7, 1, '2023-01-08', 'Roses are violet'),
  (8, 2, '2023-01-08', 'Roses are yellow');

WITH t1 AS (SELECT load_id, merchant_id, load_date, MD5(terms) AS terms
            FROM term_changes
            ORDER BY merchant_id, load_id),
t2 AS (SELECT load_id, 
              merchant_id, 
              load_date, 
              terms, 
              LAG(load_id, 1) OVER (PARTITION BY merchant_id 
                                    ORDER BY load_id) AS prev_load_id
       FROM t1)
SELECT *
FROM   t2 
       JOIN t1 ON t1.load_id = t2.prev_load_id
                  AND t1.merchant_id = t2.merchant_id
                  AND t1.terms != t2.terms

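which returns:

| load_id | merchant_id | load_date | terms | prev_load_id | load_id | merchant_id | load_date | terms |
|---|---|---|---|---|---|---|---|---|
| 5 | 1 | 2023-01-07 | 84df2c2124ad3fc5c8cdf76ce1d7f3e3 | 3 | 3 | 1 | 2023-01-06 | 6becb043847fefb01e7989034cbdb136 |
| 8 | 2 | 2023-01-08 | 90b64ad67da829652ee622e0695748fc | 6 | 6 | 2 | 2023-01-07 | 0e3eb8ff97b31e8874f7b51c23f242a2 |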

The output I'm after captures the current and changed values by merchant_id:

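| load_id | merchant_id | load_date | terms |
|---|---|---|---|
| 1 | 1 | 2023-01-05 | Roses are red |
| 5 | 1 | 2023-01-07 | Roses are violet |
| 2 | 2 | 2023-01-05 | Roses are blue |
| 8 | 2 | 2023-01-08 | Roses are yellow |
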
  • I'm using load_id because there can be multiple entries per date.

sql mysql window-functions mysql-8.0
3 Answers
0 votes

You can use this query:

WITH t1 AS (
  SELECT 
    load_id, 
    merchant_id, 
    load_date, 
    terms, 
    LAG(terms) OVER (PARTITION BY merchant_id ORDER BY load_date) AS prev_terms
  FROM term_changes
),
t2 AS (
  SELECT 
    load_id, 
    merchant_id, 
    load_date, 
    terms 
  FROM t1 
  WHERE prev_terms IS NULL OR prev_terms != terms
)
SELECT *
FROM t2
ORDER BY merchant_id, load_date;
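
The question notes that a merchant can have several entries on the same date. In that case, ordering the window by load_date alone leaves the "previous" row ambiguous. Below is a minimal sketch of the same idea with load_id as a tie-breaker. It assumes load_id increases in load order, which the question implies but does not state; the alias `with_prev` is illustrative.

```sql
-- Sketch: same change detection, with a deterministic window order.
-- Assumption: within one merchant and date, a higher load_id was loaded later.
SELECT load_id, merchant_id, load_date, terms
FROM (
  SELECT
    load_id,
    merchant_id,
    load_date,
    terms,
    -- Previous terms for the same merchant; load_id breaks ties on load_date
    LAG(terms) OVER (PARTITION BY merchant_id
                     ORDER BY load_date, load_id) AS prev_terms
  FROM term_changes
) AS with_prev
-- Keep the first row per merchant and every row whose terms differ from the previous row
WHERE prev_terms IS NULL OR prev_terms <> terms
ORDER BY merchant_id, load_date, load_id;
```

On the sample data every merchant has one row per date, so this returns the same rows as the query above.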

fiddle


0 votes

It looks like I just needed to change to a LEFT JOIN and move the predicate:

CREATE TABLE term_changes (load_id int, merchant_id int, load_date date, terms varchar(50));

INSERT INTO term_changes (load_id, merchant_id, load_date, terms) VALUES 
  (1, 1, '2023-01-05', 'Roses are red'),
  (2, 2, '2023-01-05', 'Roses are blue'),
  (3, 1, '2023-01-06', 'Roses are red'),
  (4, 2, '2023-01-06', 'Roses are blue'),
  (5, 1, '2023-01-07', 'Roses are violet'),
  (6, 2, '2023-01-07', 'Roses are blue'),
  (7, 1, '2023-01-08', 'Roses are violet'),
  (8, 2, '2023-01-08', 'Roses are yellow'),
  (9, 3, '2023-01-05', 'Roses are green'),
  (10, 3, '2023-01-06', 'Roses are green'),
  (11, 3, '2023-01-07', 'Roses are green'),
  (12, 3, '2023-01-08', 'Roses are green');


WITH t1 AS (SELECT load_id, 
                   merchant_id, 
                   load_date, 
                   terms, 
                   MD5(terms) AS terms_hashed
            FROM term_changes
            ORDER BY merchant_id, load_id),
t2 AS (SELECT load_id, 
              merchant_id, 
              load_date, 
              terms, 
              terms_hashed, 
              LAG(load_id, 1) OVER (PARTITION BY merchant_id 
                                    ORDER BY load_id) AS prev_load_id
       FROM t1)
SELECT t2.load_id,
       t2.merchant_id,
       t2.load_date,
       t2.terms
FROM   t2 
       LEFT JOIN t1 ON t1.load_id = t2.prev_load_id
                       AND t1.merchant_id = t2.merchant_id
WHERE  ( t1.terms_hashed != t2.terms_hashed
         OR t2.prev_load_id IS NULL )
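
| load_id | merchant_id | load_date | terms |
|---|---|---|---|
| 1 | 1 | 2023-01-05 | Roses are red |
| 5 | 1 | 2023-01-07 | Roses are violet |
| 2 | 2 | 2023-01-05 | Roses are blue |
| 8 | 2 | 2023-01-08 | Roses are yellow |
| 9 | 3 | 2023-01-05 | Roses are green |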

fiddle


0 votes

You're making this much more complicated than it needs to be.

fiddle

select load_id, merchant_id, load_date, terms, unchanged
from (
    select load_id, merchant_id, load_date, terms,
        coalesce(terms = lag(terms) over (partition by merchant_id order by load_id), 0) unchanged
    from term_changes
) term_changes_plus
where not unchanged
order by merchant_id, load_id
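
One caveat with comparing via `=`: the table definition allows terms to be NULL, and `terms = lag(terms)` evaluates to NULL both for the first row of a merchant and for a NULL-to-NULL pair, so `coalesce(..., 0)` reports both as changed. The sketch below separates the two cases using MySQL's NULL-safe equality operator `<=>`. The names `with_prev`, `prev_load_id`, `prev_terms`, and the window name `w` are illustrative, not part of the original answer.

```sql
-- Sketch: NULL-safe change detection (MySQL 8.0).
SELECT load_id, merchant_id, load_date, terms
FROM (
    SELECT load_id, merchant_id, load_date, terms,
           LAG(load_id) OVER w AS prev_load_id,  -- NULL only on a merchant's first row
           LAG(terms)   OVER w AS prev_terms
    FROM term_changes
    WINDOW w AS (PARTITION BY merchant_id ORDER BY load_id)
) AS with_prev
WHERE prev_load_id IS NULL          -- first row for the merchant
   OR NOT (terms <=> prev_terms)    -- terms differ, treating NULL as equal to NULL
ORDER BY merchant_id, load_id;
```

On the question's sample data, which has no NULL terms, this returns load_ids 1, 5, 2, 8, matching the desired output.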