即使我有索引,为什么 Postgres DB 仍要排序?

问题描述 投票:0回答:1

这是表架构:

CREATE TABLE IF NOT EXISTS events_table
(
      id                    UUID NOT NULL,
      company_id            INT8 NOT NULL,
      employee_id           INT8 NOT NULL,
      category_type_id      UUID, 
      event_type            VARCHAR(255) NOT NULL, 
      data_payload          bytea NOT NULL, 
      occurred_at     TIMESTAMP WITH TIME ZONE NOT NULL,
      stored_at       TIMESTAMP WITH TIME ZONE NOT NULL, 

      PRIMARY KEY (id)
);

我们有以下索引:

CREATE INDEX events_idx ON events_table (company_id, employee_id, category_type_id, occurred_at, stored_at);

这是查询:

SELECT 
    table_name.id,
    table_name.company_id,
    table_name.employee_id,
    table_name.category_type_id,
    table_name.type,
    table_name.payload,
    table_name.occurred_at,
    table_name.stored_at
FROM 
    schema_name.table_name
WHERE 
    (table_name.company_id = ? 
    AND table_name.employee_id IN (?) 
    AND table_name.category_type_id = cast ( ? )
    AND table_name.occurred_at <= CAST(? AS TIMESTAMP WITH TIME ZONE))

ORDER BY 
    table_name.occurred_at ASC, 
    table_name.stored_at ASC;

为什么首先要进行DB排序:

 Gather Merge  (cost=116700.30..129758.08 rows=111916 width=122) (actual time=357.017..512.239 rows=295813 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Sort  (cost=115700.28..115840.17 rows=55958 width=122) (actual time=345.202..364.049 rows=98604 loops=3)
         Sort Key: occurred_at, stored_at
         Sort Method: external merge  Disk: 13760kB
         Worker 0:  Sort Method: external merge  Disk: 6280kB
         Worker 1:  Sort Method: external merge  Disk: 6120kB
         ->  Parallel Bitmap Heap Scan on balance_state_event  (cost=6311.95..107650.92 rows=55958 width=122) (actual time=51.027..280.405 rows=98604 loops=3)
               Recheck Cond: ((company_id = 100) AND (category_type_id = '0baee077-c735-4f95-bd26-0c6b4b1eddd4'::uuid) AND (employee_id = ANY ('{17,1,19,2,18,11,15,16,20,861,8,12,4,3,5,13,7,6,14,9,10,1021,1022,1023,1024,1025,1026
,1027,1028,1029,1030,1031,1032,1033,1034,1035,1036,1037,1038,1039,1040,1041,1042,1043,1044,1045,1046,1047,1048,1049,1050,1051,1052,1053,1054,1055,1056,1057,1058,1059,1060,1061,1062,1063,1064,1065,1066,1067,1068,1069,1070,1071,10
72,1073,1074,1075,1076,1077,1078,1079,1080,1081,1082,1083,1084,1085,1086,1087,1088,1089,1090,1091,1092,1093,1094,1095,1096,1097,1098,1099,1100}'::bigint[])) AND (occurred_at <= '2024-05-03 00:00:00+00'::timestamp with time zone)
)
               Rows Removed by Index Recheck: 64790
               Heap Blocks: exact=21346 lossy=17655
               ->  Bitmap Index Scan on events_idx  (cost=0.00..6278.13 rows=134298 width=0) (actual time=49.032..49.032 rows=295813 loops=1)
                     Index Cond: ((company_id = 100) AND (absence_type_id = '0baee077-c735-4f95-bd26-0c6b4b1eddd4'::uuid) AND (employee_id = ANY ('{17,1,19,2,18,11,15,16,20,861,8,12,4,3,5,13,7,6,14,9,10,1021,1022,1023,1024,1025,
1026,1027,1028,1029,1030,1031,1032,1033,1034,1035,1036,1037,1038,1039,1040,1041,1042,1043,1044,1045,1046,1047,1048,1049,1050,1051,1052,1053,1054,1055,1056,1057,1058,1059,1060,1061,1062,1063,1064,1065,1066,1067,1068,1069,1070,107
1,1072,1073,1074,1075,1076,1077,1078,1079,1080,1081,1082,1083,1084,1085,1086,1087,1088,1089,1090,1091,1092,1093,1094,1095,1096,1097,1098,1099,1100}'::bigint[])) AND (occurred_at <= '2024-05-03 00:00:00+00'::timestamp with time z
one))
 Planning Time: 0.445 ms
 Execution Time: 528.740 ms
(16 rows)

首先对balance_state_event进行并行位图堆扫描?

不应该排序,因为索引保持排序顺序。

postgresql indexing
1个回答
0
投票

需要显式排序的原因是索引扫描不一定按排序顺序返回行。原因是

IN
条件。

如果您在

(a, b, c)
上有一个三列索引,则以下查询不需要需要显式排序,因为索引扫描按排序顺序返回行:

SELECT ... WHERE a = 42 AND b = 'something' ORDER BY c

但是这个查询需要显式排序:

SELECT ... WHERE a 0 42 AND b IN ('something', 'other') ORDER BY c
© www.soinside.com 2019 - 2024. All rights reserved.