Postgres SQL: get the rows where a column value changed between consecutive dates

Problem description

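I want to collect the rows where a ticker value was added or removed compared with the previous date. I managed to get the "added" rows with a LEFT JOIN, but I am struggling to get the removed ones. The current query takes about 10 seconds to run. It selects a specific ETF/index ("index_of_ref") and looks for the components ("ticker") that were added or removed between consecutive dates ("trade_date"):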

SELECT 
    current_day.trade_date,
    current_day.ticker,
    CASE
        WHEN previous_day.ticker IS NULL THEN '+' -- Ticker added
        WHEN current_day.ticker IS NULL THEN '-' -- Ticker deleted
    END AS change_type
FROM
    t_etf_holdings AS current_day
LEFT JOIN
    t_etf_holdings AS previous_day 
    ON current_day.ticker = previous_day.ticker
    AND current_day.index_of_ref = previous_day.index_of_ref
    AND previous_day.trade_date = (
        SELECT MAX(trade_date)
        FROM t_etf_holdings
        WHERE trade_date < current_day.trade_date
        AND index_of_ref = 'ftse100'
--        AND ticker = current_day.ticker
    )
WHERE
    current_day.index_of_ref = 'ftse100'
AND previous_day.ticker IS NULL
--    AND (
--        (current_day.ticker IS NOT NULL AND previous_day.ticker IS NULL)
--        OR 
--      (current_day.ticker IS NULL AND previous_day.ticker IS NOT NULL)
--    )
ORDER BY
    current_day.trade_date DESC, current_day.ticker;

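I believe my LEFT JOIN logic is wrong and cannot return the deleted records. Any help would be appreciated. I was hoping to get any tickers from an INNER JOIN. A FULL OUTER JOIN did not work the way I used it...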
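Because the LEFT JOIN above starts from current_day, every output row exists on the current date, so current_day.ticker is never NULL and the '-' branch cannot fire. A minimal sketch of the deletion half alone, assuming the same t_etf_holdings table: start from the earlier date, find the next available trade date with LEAD, and keep the tickers that are missing on that date. The CTE name dates and the column next_date are illustrative and not part of the original query.

```sql
-- Sketch: deletions only, reported on the first date the ticker is missing.
WITH dates AS (
    -- Pair every trade date of the index with the next available trade date.
    SELECT trade_date,
           LEAD(trade_date) OVER (ORDER BY trade_date) AS next_date
    FROM (SELECT DISTINCT trade_date
          FROM t_etf_holdings
          WHERE index_of_ref = 'ftse100') AS d
)
SELECT
    dates.next_date AS trade_date,
    previous_day.ticker,
    '-' AS change_type
FROM t_etf_holdings AS previous_day
JOIN dates
    ON dates.trade_date = previous_day.trade_date
LEFT JOIN t_etf_holdings AS current_day
    ON current_day.ticker = previous_day.ticker
    AND current_day.index_of_ref = previous_day.index_of_ref
    AND current_day.trade_date = dates.next_date
WHERE previous_day.index_of_ref = 'ftse100'
    AND dates.next_date IS NOT NULL   -- the latest date has nothing to compare against
    AND current_day.ticker IS NULL    -- anti-join: not held on the next date
ORDER BY trade_date DESC, ticker;
```

Against the sample data in this question, this should return GLEN on 2024-01-06 and WPP on 2024-01-05, which are the two '-' rows in the desired output.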

Following @bnk's suggestion, I made a working example: a fiddle with the inserted data. The result is:

trade_date  ticker  change_type
2024-01-05  GLEN    +
2024-01-04  AAF     +
2024-01-04  AAL     +
2024-01-04  WPP     +

What I want:

trade_date  ticker  change_type
2024-01-06  GLEN    -
2024-01-05  GLEN    +
2024-01-05  WPP     -
2024-01-04  AAF     +
2024-01-04  AAL     +
2024-01-04  WPP     +

Here is how to reproduce it on psql12:

Create statement

 CREATE TABLE IF NOT EXISTS public.t_etf_holdings
(
    trade_date date NOT NULL,
    index_of_ref character varying(25) COLLATE pg_catalog."default" NOT NULL,
    ticker character varying(25) COLLATE pg_catalog."default" NOT NULL,
    CONSTRAINT t_etf_holdings_pkey PRIMARY KEY (trade_date, index_of_ref, ticker)
);

Insert statements

INSERT INTO t_etf_holdings (trade_date, index_of_ref, ticker)
VALUES('2024-01-06', 'ftse100', 'AAF');
INSERT INTO t_etf_holdings (trade_date, index_of_ref, ticker)
VALUES('2024-01-06', 'ftse100', 'AAL');
INSERT INTO t_etf_holdings (trade_date, index_of_ref, ticker)
VALUES('2024-01-05', 'ftse100', 'AAF');
INSERT INTO t_etf_holdings (trade_date, index_of_ref, ticker)
VALUES('2024-01-05', 'ftse100', 'AAL');
INSERT INTO t_etf_holdings (trade_date, index_of_ref, ticker)
VALUES('2024-01-05', 'ftse100', 'GLEN');
INSERT INTO t_etf_holdings (trade_date, index_of_ref, ticker)
VALUES('2024-01-04', 'ftse100', 'AAF');
INSERT INTO t_etf_holdings (trade_date, index_of_ref, ticker)
VALUES('2024-01-04', 'ftse100', 'AAL');
INSERT INTO t_etf_holdings (trade_date, index_of_ref, ticker)
VALUES('2024-01-04', 'ftse100', 'WPP');


postgresql duplicates left-join
1 Answer

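You have to use a FULL OUTER JOIN with conditional logic to identify the tickers that differ between consecutive trade dates in the ETF/index dataset and mark them as "added" or "deleted".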

WITH PreviousDay AS (
    SELECT trade_date, index_of_ref, ticker
    FROM t_etf_holdings
    WHERE index_of_ref = 'ftse100'
), CurrentDay AS (
    SELECT trade_date, index_of_ref, ticker
    FROM t_etf_holdings
    WHERE index_of_ref = 'ftse100'
)
SELECT 
    COALESCE(cd.trade_date, pd.trade_date) AS trade_date,
    COALESCE(cd.ticker, pd.ticker) AS ticker,
    CASE
        WHEN pd.ticker IS NULL THEN '+'
        WHEN cd.ticker IS NULL THEN '-'
    END AS change_type
FROM CurrentDay cd
FULL OUTER JOIN PreviousDay pd
ON cd.ticker = pd.ticker
AND cd.index_of_ref = pd.index_of_ref
AND pd.trade_date = (
    SELECT MAX(trade_date)
    FROM t_etf_holdings
    WHERE trade_date < cd.trade_date
    AND index_of_ref = 'ftse100'
)
WHERE cd.trade_date IS NULL OR pd.trade_date IS NULL
ORDER BY trade_date DESC, ticker;

The result would be

trade_date  ticker  change_type
2024-01-06  AAF     -
2024-01-06  AAL     -
2024-01-05  GLEN    +
2024-01-05  GLEN    -
2024-01-04  AAF     +
2024-01-04  AAL     +
2024-01-04  WPP     +
2024-01-04  WPP     -
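
Note that this output differs from what the question asks for: each '-' row is dated on the last day the ticker was held rather than the first day it is missing, and the tickers on the latest date (2024-01-06) show up as deleted only because there is no later date to compare against. One possible adjustment, offered as a sketch rather than a definitive fix, is to project every holding onto the next trade date first and then FULL OUTER JOIN on both date and ticker. The CTE names holdings, dates and carried are illustrative.

```sql
WITH holdings AS (
    SELECT trade_date, ticker
    FROM t_etf_holdings
    WHERE index_of_ref = 'ftse100'
),
dates AS (
    -- Pair every trade date with the next available trade date.
    SELECT trade_date,
           LEAD(trade_date) OVER (ORDER BY trade_date) AS next_date
    FROM (SELECT DISTINCT trade_date FROM holdings) AS d
),
carried AS (
    -- Each holding moved forward onto the following trade date,
    -- i.e. what the index would hold if nothing had changed.
    SELECT d.next_date AS trade_date, h.ticker
    FROM holdings AS h
    JOIN dates AS d USING (trade_date)
    WHERE d.next_date IS NOT NULL   -- the latest date has no successor
)
SELECT
    COALESCE(cur.trade_date, prev.trade_date) AS trade_date,
    COALESCE(cur.ticker, prev.ticker) AS ticker,
    CASE
        WHEN prev.ticker IS NULL THEN '+'   -- held now, not carried over
        ELSE '-'                            -- carried over, not held now
    END AS change_type
FROM holdings AS cur
FULL OUTER JOIN carried AS prev
    ON cur.trade_date = prev.trade_date
    AND cur.ticker = prev.ticker
WHERE cur.ticker IS NULL OR prev.ticker IS NULL
ORDER BY trade_date DESC, ticker;
```

With the sample rows from the question, this should return the six rows listed under "What I want". Every ticker on the earliest date is still reported as '+', since there is no previous date to compare with. Joining on plain equality also avoids the correlated subquery in the join condition, which may help with the roughly 10 second runtime, although that has not been measured here.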