Slow Postgres query with a long Bitmap Heap Scan "I/O Timings: read" time

Question

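Hi, I have a database with 10 tables. Each table has roughly 0.5-1 billion rows and is partitioned by range and then by hash (10x10=100 partitions). Each table is indexed on the column (id) that the searches below use. The database is hosted on Azure PostgreSQL Single Server.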

A test query shows that most of the time is spent on "I/O Timings: read":

postgres=> EXPLAIN (ANALYZE, BUFFERS) select count(*) from table_id4 where id=654321;
                                                                          QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=7458.09..7458.10 rows=1 width=8) (actual time=21141.393..21141.396 rows=1 loops=1)
   Buffers: shared read=2256
   I/O Timings: read=21096.814
   ->  Append  (cost=41.26..7452.66 rows=2171 width=0) (actual time=197.168..21138.495 rows=2247 loops=1)
         Buffers: shared read=2256
         I/O Timings: read=21096.814
         ->  Bitmap Heap Scan on table_id4_r2_h5  (cost=41.26..7441.80 rows=2171 width=0) (actual time=197.167..21137.471 rows=2247 loops=1)
               Recheck Cond: (id = 244730)
               Heap Blocks: exact=2247
               Buffers: shared read=2256
               I/O Timings: read=21096.814
               ->  Bitmap Index Scan on table_id4_r2_h5_id_idx  (cost=0.00..40.72 rows=2171 width=0) (actual time=117.586..117.586 rows=2247 loops=1)
                     Index Cond: (id = 244730)
                     Buffers: shared read=9
                     I/O Timings: read=116.929
 Planning Time: 2.882 ms
 Execution Time: 21141.449 ms
(17 rows)
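
As a rough sanity check (my own arithmetic on the figures in the plan, not server output), dividing the read time by the number of blocks read gives the average cost of one block read. It works out to roughly 9 ms per block, or on the order of 100 blocks per second, which to me looks like one random disk read per heap block rather than anything CPU-bound:

```sql
-- Figures copied from the plan above:
--   Buffers: shared read=2256, I/O Timings: read=21096.814 (ms)
SELECT round(21096.814 / 2256, 2)         AS avg_ms_per_block_read,
       round(2256 / (21096.814 / 1000.0)) AS blocks_read_per_second;
```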

I ran a batch test, and the query times varied widely within the same loop:

FOR idx IN SELECT (random()*total_IDs)::int AS id from generate_series (1,10)
LOOP ... 
select count(*) from table_id4 where id=idx;  
...
END LOOP;
NOTICE:   id: 321158 count#: 2154,   time: 46.734967s
NOTICE:   id: 487596 count#: 2238,   time: 0.968759s
NOTICE:   id: 548334 count#: 2180,   time: 1.062516s
NOTICE:   id: 404978 count#: 2179,   time: 29.750295s
NOTICE:   id: 370904 count#: 2123,   time: 22.203384s
NOTICE:   id: 228857 count#: 2223,   time: 29.094126s
NOTICE:   id: 327134 count#: 2169,   time: 24.750242s
NOTICE:   id: 372101 count#: 2180,   time: 28.062825s
NOTICE:   id: 341814 count#: 2130,   time: 30.250353s
NOTICE:   id: 248316 count#: 2195,   time: 32.375377s

But if I repeat the queries for the same ids, the times drop to the millisecond range I am hoping for:

psql -c " ...
select count(*) from table_id4 where pt_id=321158;
select count(*) from table_id4 where pt_id=487596;
select count(*) from table_id4 where pt_id=548334;
select count(*) from table_id4 where pt_id=404978;
select count(*) from table_id4 where pt_id=370904;
select count(*) from table_id4 where pt_id=228857;
"
Time: 5267.168 ms (00:05.267)
Time: 171.925 ms
Time: 24.942 ms
Time: 11.387 ms
Time: 6.753 ms
Time: 17.573 ms
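
The speed-up on repeated ids makes me suspect that the second run is served from cache. Here is a minimal sketch of how I could check that. It assumes the standard pg_statio_user_tables statistics view is readable on this Azure server, and uses the partition name from the plan above purely as an example:

```sql
-- Blocks read from outside shared_buffers vs. blocks found in shared_buffers
-- for one partition. Counters are cumulative since the last statistics reset.
SELECT relname,
       heap_blks_read,
       heap_blks_hit,
       idx_blks_read,
       idx_blks_hit,
       round(100.0 * heap_blks_hit
             / nullif(heap_blks_hit + heap_blks_read, 0), 1) AS heap_hit_pct
FROM pg_statio_user_tables
WHERE relname = 'table_id4_r2_h5';
```

As far as I understand, a "read" here only means the block was not in shared_buffers. It may still have come from the operating system cache, so comparing the counters before and after a slow query and a fast one would be more telling than the absolute values.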

The other tables behave similarly when queried. Here are the table details:

postgres=> \d+ table_id4
                          Unlogged table "table_id4"
   Column    |   Type   | Collation | Nullable | Default | Storage | Stats target | Description
-------------+----------+-----------+----------+---------+---------+--------------+-------------
 date        | date     |           | not null |         | plain   |              |
 field1      | real     |           |          |         | plain   |              |
 field2      | real     |           |          |         | plain   |              |
 field3      | smallint |           |          |         | plain   |              |
 id          | integer  |           | not null |         | plain   |              |
Partition key: RANGE (id)
Indexes:
    "table_id4_date_idx" btree (date)
    "table_id4_id_idx" btree (id)
Partitions: table_id4_r1 FOR VALUES FROM (0) TO (1...5), PARTITIONED,
            table_id4_r10 FOR VALUES FROM (...93) TO (MAXVALUE), PARTITIONED,
            table_id4_r2 FOR VALUES FROM (1...) TO (3...), PARTITIONED,
            table_id4_r3 FOR VALUES FROM (3...) TO (4...), PARTITIONED,
            table_id4_r4 FOR VALUES FROM (4...) TO (6...), PARTITIONED,
            table_id4_r5 FOR VALUES FROM (6...) TO (7...), PARTITIONED,
            table_id4_r6 FOR VALUES FROM (7...) TO (9...), PARTITIONED,
            table_id4_r7 FOR VALUES FROM (9...) TO (1...1), PARTITIONED,
            table_id4_r8 FOR VALUES FROM (1...1) TO (...32), PARTITIONED,
            table_id4_r9 FOR VALUES FROM (1...2) TO (...93), PARTITIONED

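I have seen some similar discussions here and here, but I am not very familiar with databases and these tables are very large, so I would like to understand the problem better before running REINDEX, VACUUM, or similar commands, which might take days to complete.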
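
Before committing to a multi-day maintenance run, I could at least look at what the statistics collector says about dead tuples and the last vacuum and analyze times. This is a sketch only: the regular expression assumes the hash sub-partitions of table_id4_r2 are all named like table_id4_r2_h5 from the plan above, which I have inferred rather than verified:

```sql
-- Dead-tuple counts and last (auto)vacuum / (auto)analyze times
-- for the sub-partitions of one range partition.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_vacuum,
       last_autovacuum,
       last_analyze,
       last_autoanalyze
FROM pg_stat_user_tables
WHERE relname ~ '^table_id4_r2_h'
ORDER BY relname;
```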

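[Update]: The resource monitor for the PostgreSQL server on the Azure portal shows MAX usage of [cpu, memory, storage]: 55%, 30%, <5%. So it seems resources are not the problem? Some server parameters: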

CPU: vCore 2
total memory: 4GB
storage: 1TB
shared_buffers: 512MB
work_mem: 4MB  (changed to 256MB, but that did not help)
max_parallel_workers: 10
max_parallel_maintenance_workers: 8
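
With only 4GB of memory and 512MB of shared_buffers on the server, it may be worth comparing those figures with the size of the data a single lookup has to touch. A sketch of the queries I could run for that, assuming the partition and index names from the plan above:

```sql
-- Current buffer cache setting as seen by the session.
SHOW shared_buffers;

-- On-disk size of one leaf partition, its id index, and the whole database.
SELECT pg_size_pretty(pg_relation_size('table_id4_r2_h5'))        AS heap_size,
       pg_size_pretty(pg_relation_size('table_id4_r2_h5_id_idx')) AS id_index_size,
       pg_size_pretty(pg_database_size(current_database()))       AS database_size;
```

If a single partition is already much larger than shared_buffers, that would at least be consistent with first-time lookups for an id going to disk for almost every heap block.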

Would LOGGED and ENABLE TRIGGER ALL help? Any suggestions are appreciated!
