PostgreSQL query executes slowly

Problem description  Votes: 0  Answers: 2

I have the following PostgreSQL query:

SELECT DISTINCT ON (reference) reference, reference_url 
FROM vehicles v 
WHERE NOT EXISTS 
    (select reference 
     from daily_run_vehicle rv 
     WHERE ((
           handled = False 
           AND retries >= 5 ) 
           OR rv.timestamp::timestamp::date = now()::date)  
     AND v.reference=reference);

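The vehicles table has about 400k records, and the daily_run_vehicle table has about 50 million records.

So I need all the vehicles that have not been added to daily_run_vehicle today, or for which the handled column is False and the retries column is >= 5.

The problem is that the query takes too long to execute.

Is there a better way to write it so that it runs faster?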

sql postgresql query-performance postgresql-performance
2 Answers
0
votes

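I have a theory that it may be related to calling the now() function millions of times. You can verify this by running the following query, which uses a literal date instead: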

SELECT DISTINCT ON (reference) reference, reference_url 
FROM vehicles v 
WHERE NOT EXISTS 
    (select reference 
     from daily_run_vehicle rv 
     WHERE ((
           handled = False 
           AND retries >= 5 ) 
           OR rv.timestamp::timestamp::date = '2019-03-06')  
     AND v.reference=reference);

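If its performance improves, you will need to store today's date in a variable and use that variable in the query, so that now() is only called once. Also, when you use EXISTS, the convention is to write SELECT ... FROM ..., because you do not care about the values themselves, only whether at least one row exists or none does.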
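A minimal sketch of passing today's date in once, under some assumptions: plain SQL in PostgreSQL has no variables, so this uses a prepared statement with a single date parameter instead. The statement name unhandled_vehicles is made up for illustration, and the predicate is otherwise the same as in the question.

```sql
-- Hypothetical statement name; $1 is the date to compare against.
PREPARE unhandled_vehicles(date) AS
SELECT DISTINCT ON (v.reference) v.reference, v.reference_url
FROM vehicles v
WHERE NOT EXISTS
    (SELECT 1
     FROM daily_run_vehicle rv
     WHERE ((rv.handled = False AND rv.retries >= 5)
            OR rv.timestamp::timestamp::date = $1)
       AND rv.reference = v.reference);

-- The date is evaluated once here and passed in as a parameter.
EXECUTE unhandled_vehicles(current_date);
```

Prefixing the query with EXPLAIN (ANALYZE, BUFFERS) for both variants would show whether the timing actually differs, which is the way to confirm or rule out the theory.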


0
votes

Hmm. I am thinking:

SELECT DISTINCT ON (v.reference) v.reference, v.reference_url 
FROM vehicles v 
WHERE NOT EXISTS (select 1 
                  from daily_run_vehicle rv 
                  where rv.reference = v.reference and
                        rv.handled = False and
                        rv.retries >= 5
                 ) and
      NOT EXISTS (select 1 
                  from daily_run_vehicle rv 
                  where rv.reference = v.reference and
                        rv.timestamp >= current_date::timestamp and
                        rv.timestamp < (current_date + interval '1 day')::timestamp
                 )
ORDER BY v.reference;

For this query, you want the following indexes:

  • daily_run_vehicle(reference, handled, retries)
  • daily_run_vehicle(reference, timestamp)
  • vehicles(reference, reference_url)
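
The indexes listed above could be created roughly as follows. This is a sketch, not a tested migration: the index names are hypothetical, and the third index is assumed to belong on vehicles, since reference and reference_url are columns of that table.

```sql
-- Index names are hypothetical; the column lists follow the answer.
CREATE INDEX idx_drv_reference_handled_retries
    ON daily_run_vehicle (reference, handled, retries);

CREATE INDEX idx_drv_reference_timestamp
    ON daily_run_vehicle (reference, "timestamp");

-- Assumption: this index is on vehicles, which holds both columns.
CREATE INDEX idx_vehicles_reference_url
    ON vehicles (reference, reference_url);
```

With about 50 million rows in daily_run_vehicle, CREATE INDEX CONCURRENTLY may be worth considering so that writes to the table are not blocked while the indexes are built.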