Thanks to everyone who took the time to comment and answer.
-
I have a price history table like this (pseudocode):
table price_history (
product_id,
price,
changed_date
)
It stores the historical prices of some products:
1, 1.0, '2017-12-18'
1, 1.2, '2017-12-20'
1, 0.9, '2018-04-20'
1, 1.1, '2018-07-20'
1, 1.3, '2018-07-22'
2, 10.0, '2017-12-15'
2, 11.0, '2017-12-16'
2, 9.9, '2018-01-02'
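2, 10.3, '2018-04-04'
Now I want the prices of some products over a given period, for example between 2018-01-01 and now.
The naive approach:
SELECT * FROM price_history
WHERE product_id in (1,2) AND changed_date >= '2018-01-01';
does not work, because each product's price between 2018-01-01 and its first price change in the period is not included: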
1, 0.9, '2018-04-20'
1, 1.1, '2018-07-20'
1, 1.3, '2018-07-22'
2, 9.9, '2018-01-02'
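2, 10.3, '2018-04-04'
But knowing the price from the very start of the period is essential.
So, in addition to the price changes within the period, the last change before the period must also be included. The result should look like this: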
1, 1.2, '2017-12-20'
1, 0.9, '2018-04-20'
1, 1.1, '2018-07-20'
1, 1.3, '2018-07-22'
2, 11.0, '2017-12-16'
2, 9.9, '2018-01-02'
2, 10.3, '2018-04-04'
Q: How do I write such a select statement?
Edit:
Test scenario and Ajay Gupta's solution
CREATE TABLE price_history (
product_id integer,
price float,
changed_date timestamp
);
INSERT INTO price_history (product_id,price,changed_date) VALUES
(1, 1.0, '2017-12-18'),
(1, 1.2, '2017-12-20'),
(1, 0.9, '2018-04-20'),
(1, 1.1, '2018-07-20'),
(1, 1.3, '2018-07-22'),
(2, 10.0, '2017-12-15'),
(2, 11.0, '2017-12-16'),
(2, 9.9, '2018-01-02'),
(2, 10.3, '2018-04-04');
The winning select:
with cte1 as
(Select *, lag(changed_date,1,'01-01-1900')
over(partition by product_id order by changed_date)
as FromDate from price_history),
cte2 as (Select product_id, max(FromDate)
as changed_date from cte1
where '2018-01-01'
between FromDate and changed_date group by product_id)
Select p.* from price_history p
join cte2 c on p.product_id = c.product_id
where p.changed_date >= c.changed_date
order by product_id,changed_date;
Result:
product_id | price | changed_date
------------+-------+---------------------
1 | 1.2 | 2017-12-20 00:00:00
1 | 0.9 | 2018-04-20 00:00:00
1 | 1.1 | 2018-07-20 00:00:00
1 | 1.3 | 2018-07-22 00:00:00
2 | 11 | 2017-12-16 00:00:00
2 | 9.9 | 2018-01-02 00:00:00
2 | 10.3 | 2018-04-04 00:00:00
I have to admit, this is beyond my limited (PG-)SQL skills.
Using lag and a CTE:
with cte1 as (
Select *,
lag(changed_date,1,'01-01-1900') over(partition by product_id order by changed_date) as FromDate
from price_history
), cte2 as (
Select product_id, max(FromDate) as changed_date
from cte1
where '2018-01-01' between FromDate and changed_date
group by product_id
)
Select p.*
from price_history p
join cte2 c on p.product_id = c.product_id
where p.changed_date >= c.changed_date;
I guess this is what you are looking for:
SELECT Top 1 * FROM price_history WHERE product_id in (1,2) AND changed_date < '2018-01-01'
UNION ALL
SELECT * FROM price_history WHERE product_id in (1,2) AND changed_date >= '2018-01-01'
You need the first changed date and all other dates after '2018-01-01':
select product_id, price, changed_date
from
(
select product_id, price, changed_date,
row_number() over(partition by product_id order by changed_date) as rn
from price_history
) x
-- rn = 2 is the last row before 2018-01-01 only for this sample data
where x.rn = 2 and product_id in (1,2)
union all
select product_id, price, changed_date from price_history
where product_id in (1,2) and changed_date >= '2018-01-01';
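If you do have the option to change the table structure, another approach is to store both start_date and end_date in the table, so that your records do not depend on the previous/next row and the query becomes easier to write. See Slowly changing dimension - Type 2.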
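To illustrate the start_date/end_date idea, here is a minimal sketch in Python using the standard-library sqlite3 module. The table name price_ranges, the use of SQLite instead of PostgreSQL, and the convention that the current price has end_date NULL are all assumptions made for this example, not part of the original schema.

```python
import sqlite3
from itertools import groupby

# Sample rows from the question: (product_id, price, changed_date).
history = [
    (1, 1.0, "2017-12-18"), (1, 1.2, "2017-12-20"), (1, 0.9, "2018-04-20"),
    (1, 1.1, "2018-07-20"), (1, 1.3, "2018-07-22"),
    (2, 10.0, "2017-12-15"), (2, 11.0, "2017-12-16"),
    (2, 9.9, "2018-01-02"), (2, 10.3, "2018-04-04"),
]

# Derive validity ranges: each price ends when the next price for the same
# product starts; the current price gets end_date = None (NULL).
ranges = []
ordered = sorted(history, key=lambda r: (r[0], r[2]))
for _, group in groupby(ordered, key=lambda r: r[0]):
    rows = list(group)
    for i, (pid, price, start) in enumerate(rows):
        end = rows[i + 1][2] if i + 1 < len(rows) else None
        ranges.append((pid, price, start, end))

conn = sqlite3.connect(":memory:")
# Hypothetical table; dates are ISO strings, which sort chronologically.
conn.execute(
    "CREATE TABLE price_ranges ("
    "product_id INTEGER, price REAL, start_date TEXT, end_date TEXT)"
)
conn.executemany("INSERT INTO price_ranges VALUES (?, ?, ?, ?)", ranges)

# A price matters for the period if it was still valid after the period start.
period_start = "2018-01-01"
result = conn.execute(
    """
    SELECT product_id, price, start_date
    FROM price_ranges
    WHERE product_id IN (1, 2)
      AND (end_date IS NULL OR end_date > ?)
    ORDER BY product_id, start_date
    """,
    (period_start,),
).fetchall()

for row in result:
    print(row)
```

With this layout the query needs no window function or self-join: one range condition per row is enough, at the cost of maintaining end_date whenever a new price is inserted.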
If you want to solve this with the existing structure, in PostgreSQL you can use LIMIT 1 to get the latest record before the start date:
SELECT
*
FROM
price_history
WHERE
product_id in (1,2)
AND changed_date >= '2018-01-01'
UNION ALL
-- this would give you the latest price before '2018-01-01'
-- (parenthesized so that ORDER BY and LIMIT apply to this SELECT only)
(
SELECT
*
FROM
price_history
WHERE
product_id in (1,2)
AND changed_date < '2018-01-01'
ORDER BY
changed_date DESC
LIMIT 1
);
A solution using union is still simpler, but it is not implemented correctly in the other answers. So:
SELECT * FROM price_history
WHERE product_id in (1,2) AND changed_date >= '2018-01-01'
union all
(
select distinct on (product_id)
*
from price_history
where product_id in (1,2) AND changed_date < '2018-01-01'
order by product_id, changed_date desc)
order by product_id, changed_date;
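Note that distinct on is PostgreSQL-specific. As a rough sketch of the same union idea in more portable SQL, the "latest row before the period, per product" part can be written with a correlated subquery. The example below runs it against the test data through Python's standard-library sqlite3 module; using SQLite here is an assumption for the sake of a self-contained demo, and it assumes (product_id, changed_date) pairs are unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE price_history ("
    "product_id INTEGER, price REAL, changed_date TEXT)"
)
conn.executemany(
    "INSERT INTO price_history VALUES (?, ?, ?)",
    [
        (1, 1.0, "2017-12-18"), (1, 1.2, "2017-12-20"),
        (1, 0.9, "2018-04-20"), (1, 1.1, "2018-07-20"),
        (1, 1.3, "2018-07-22"),
        (2, 10.0, "2017-12-15"), (2, 11.0, "2017-12-16"),
        (2, 9.9, "2018-01-02"), (2, 10.3, "2018-04-04"),
    ],
)

query = """
SELECT product_id, price, changed_date
FROM price_history
WHERE product_id IN (1, 2) AND changed_date >= :start
UNION ALL
-- per product, the single latest change before the period start
SELECT p.product_id, p.price, p.changed_date
FROM price_history p
WHERE p.product_id IN (1, 2)
  AND p.changed_date = (
      SELECT MAX(q.changed_date)
      FROM price_history q
      WHERE q.product_id = p.product_id
        AND q.changed_date < :start
  )
ORDER BY product_id, changed_date
"""

rows = conn.execute(query, {"start": "2018-01-01"}).fetchall()
for row in rows:
    print(row)
```

A product with no price change before the period start simply contributes no extra row, because the subquery yields NULL and the equality never matches.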