PostgreSQL does not prune partitions properly if I replace a numeric literal in the WHERE clause with a simple query that returns 1 row

Question

Fast query

select ...
from table1 t1
join table2 t2 on t2.org_id = t1.org_id
where t1.org_id = 1

Slow query

select ...
from table1 t1
join table2 t2 on t2.org_id = t1.org_id
where t1.org_id = (select org_id from table3 where org_name = 'abc' limit 1)

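The only difference between the two queries is that the literal has been replaced with a subquery. I have tried this on PostgreSQL 12.2 and 11.6 on AWS RDS. table1 and table2 are both partitioned on the org_id column. table3 has a primary key on org_id and a unique index on org_name. The "limit 1" was added to the subquery in the slow query to try to help the optimizer.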
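
For anyone who wants to reproduce this, here is a minimal sketch of a schema matching the description above. The question does not state the partitioning method, the partition names, or any columns other than org_id and org_name, so the hash partitioning, the 4-partition layout, and the payload column are all assumptions made for illustration.

```sql
-- Lookup table: primary key on org_id, unique index on org_name
-- (the UNIQUE constraint creates the unique index).
create table table3 (
    org_id   int primary key,
    org_name text not null unique
);

-- Assumption: hash partitioning on org_id; the real tables may use
-- list or range partitioning and have many more partitions.
create table table1 (
    org_id  int not null,
    payload text            -- hypothetical column
) partition by hash (org_id);

create table table1_p0 partition of table1 for values with (modulus 4, remainder 0);
create table table1_p1 partition of table1 for values with (modulus 4, remainder 1);
create table table1_p2 partition of table1 for values with (modulus 4, remainder 2);
create table table1_p3 partition of table1 for values with (modulus 4, remainder 3);

create table table2 (
    org_id  int not null,
    payload text            -- hypothetical column
) partition by hash (org_id);

create table table2_p0 partition of table2 for values with (modulus 4, remainder 0);
create table table2_p1 partition of table2 for values with (modulus 4, remainder 1);
create table table2_p2 partition of table2 for values with (modulus 4, remainder 2);
create table table2_p3 partition of table2 for values with (modulus 4, remainder 3);
```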

For most orgs the fast query returns in under 10 seconds. For most orgs the slow query takes 30-100 seconds.

I have tried partition counts of 128, 256, 384, 512, 1024, 2048 and 4096, and 384 works best.

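The EXPLAIN ANALYZE plan for the fast query is 15 lines long and uses only 1 partition. The plan for the slow query is 2,388 lines long and lists all 384 partitions; it appears to scan only 1 partition at execution time, but it still considers all of them.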

Answer 1

You could try creating a STABLE SQL function to replace the subquery. This is what I see on PostgreSQL 12.2:

EXPLAIN ANALYZE 
select * 
from table1 t1
join table2 t2 on t2.org_id = t1.org_id
where t1.org_id = 1;
                                                   QUERY PLAN                                                   
----------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..67.87 rows=78 width=44) (actual time=0.017..0.017 rows=0 loops=1)
   ->  Seq Scan on table2 t2  (cost=0.00..41.88 rows=13 width=4) (actual time=0.011..0.012 rows=1 loops=1)
         Filter: (org_id = 1)
   ->  Materialize  (cost=0.00..25.03 rows=6 width=40) (actual time=0.003..0.003 rows=0 loops=1)
         ->  Seq Scan on part1 t1  (cost=0.00..25.00 rows=6 width=40) (actual time=0.001..0.002 rows=0 loops=1)
               Filter: (org_id = 1)
 Planning Time: 0.432 ms
 Execution Time: 0.046 ms
(8 rows)


EXPLAIN ANALYZE 
select * 
from table1 t1
join table2 t2 on t2.org_id = t1.org_id
where t1.org_id = (select org_id from table3 where org_name = 'abc' limit 1);
                                                   QUERY PLAN                                                   
----------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=4.31..176.25 rows=390 width=44) (actual time=0.023..0.023 rows=0 loops=1)
   InitPlan 1 (returns $0)
     ->  Limit  (cost=0.00..4.31 rows=1 width=4) (actual time=0.013..0.013 rows=1 loops=1)
           ->  Seq Scan on table3  (cost=0.00..25.88 rows=6 width=4) (actual time=0.010..0.010 rows=1 loops=1)
                 Filter: (org_name = 'abc'::text)
   ->  Append  (cost=0.00..125.15 rows=30 width=40) (actual time=0.022..0.023 rows=0 loops=1)
         ->  Seq Scan on part1 t1  (cost=0.00..25.00 rows=6 width=40) (actual time=0.002..0.002 rows=0 loops=1)
               Filter: (org_id = $0)
         ->  Seq Scan on part2 t1_1  (cost=0.00..25.00 rows=6 width=40) (never executed)
               Filter: (org_id = $0)
         ->  Seq Scan on part3 t1_2  (cost=0.00..25.00 rows=6 width=40) (never executed)
               Filter: (org_id = $0)
         ->  Seq Scan on part4 t1_3  (cost=0.00..25.00 rows=6 width=40) (never executed)
               Filter: (org_id = $0)
         ->  Seq Scan on part5 t1_4  (cost=0.00..25.00 rows=6 width=40) (never executed)
               Filter: (org_id = $0)
   ->  Materialize  (cost=0.00..41.94 rows=13 width=4) (never executed)
         ->  Seq Scan on table2 t2  (cost=0.00..41.88 rows=13 width=4) (never executed)
               Filter: (org_id = $0)
 Planning Time: 0.397 ms
 Execution Time: 0.129 ms
(21 rows)

create function f_get_org_id() returns int
language sql
stable
as
$$
select org_id from table3 where org_name = 'abc' limit 1
$$
;
CREATE FUNCTION

EXPLAIN ANALYZE 
select * 
from table1 t1
join table2 t2 on t2.org_id = t1.org_id
where t1.org_id = f_get_org_id();
                                                   QUERY PLAN                                                    
-----------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..2309.43 rows=390 width=44) (actual time=0.003..0.003 rows=0 loops=1)
   ->  Append  (cost=0.00..1625.15 rows=30 width=40) (actual time=0.003..0.003 rows=0 loops=1)
         Subplans Removed: 4
         ->  Seq Scan on part1 t1  (cost=0.00..325.00 rows=6 width=40) (actual time=0.002..0.002 rows=0 loops=1)
               Filter: (org_id = f_get_org_id())
   ->  Materialize  (cost=0.00..679.44 rows=13 width=4) (never executed)
         ->  Seq Scan on table2 t2  (cost=0.00..679.38 rows=13 width=4) (never executed)
               Filter: (org_id = f_get_org_id())
 Planning Time: 0.655 ms
 Execution Time: 0.091 ms
(10 rows)
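
The difference between the last two plans is when the pruning happens. With the subquery, the value comes from an InitPlan parameter ($0), so partitions can only be skipped during execution; they stay in the plan and show up as "never executed". A stable function call is evaluated at executor startup, so the other partitions are dropped from the plan and reported as "Subplans Removed: 4".

Since f_get_org_id() hard-codes 'abc', a parameterized variant is probably more practical. The following is a sketch only: the name f_get_org_id_by_name and the parameter p_org_name are made up here, and I have not run this variant, so check the plan for "Subplans Removed" on your own data.

```sql
-- Hypothetical variant of f_get_org_id() that takes the org name as an argument.
create function f_get_org_id_by_name(p_org_name text) returns int
language sql
stable
as
$$
select org_id from table3 where org_name = p_org_name limit 1
$$
;

-- The argument is a constant, so the call should still be evaluated once
-- at executor startup rather than per row.
explain analyze
select *
from table1 t1
join table2 t2 on t2.org_id = t1.org_id
where t1.org_id = f_get_org_id_by_name('abc');
```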
