How can I optimize COUNT queries on a Postgres table with GIN and B-Tree indexes?

Question

I have a Postgres 12 table, foo, that looks like this:

Column	Type	Index type
createdAt	timestamp without time zone	btree
email	text	gin (email gin_trgm_ops)
message	text	gin (message gin_trgm_ops)

I have not created any other indexes on this table. The real table has more than 10 million rows.

My queries are generated dynamically. The main query is:

SELECT COUNT(*) FROM foo

The conditions vary. Sometimes there are none at all, but sometimes the query includes conditions like these:

WHERE 
LOWER(email) LIKE LOWER('%<parameter here>%') 
AND LOWER(message) LIKE LOWER('%<parameter here>%') 
AND createdAt > '<timestamp here>' 
AND createdAt < '<timestamp here>'

All four conditions together is the most complex case. Most queries only compare email and message.

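I have noticed that queries with conditions are always much slower than the unconditional query. How can I improve performance, especially given that email and message are often queried together? Is a multicolumn GIN index a viable solution, and how much improvement would it bring?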

[Edit] Here is the output of explain (analyze, verbose, buffers) from a smaller instance.

explain(analyze, verbose, buffers) select count(*) from public."foo" where lower(message) like lower('%acme.com%') and lower(email) like lower('%acme.com%');
                                                             QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=33139.84..33139.85 rows=1 width=8) (actual time=183.940..185.917 rows=1 loops=1)
   Output: count(*)
   Buffers: shared hit=29989
   ->  Gather  (cost=1000.00..33139.83 rows=1 width=0) (actual time=183.935..185.912 rows=0 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=29989
         ->  Parallel Seq Scan on public."foo"  (cost=0.00..32139.73 rows=1 width=0) (actual time=179.472..179.473 rows=0 loops=3)
               Filter: ((lower("foo".message) ~~ '%acme.com%'::text) AND (lower("foo".email) ~~ '%acme.com%'::text))
               Rows Removed by Filter: 86166
               Buffers: shared hit=29989
               Worker 0: actual time=177.051..177.051 rows=0 loops=1
                 Buffers: shared hit=10517
               Worker 1: actual time=180.233..180.233 rows=0 loops=1
                 Buffers: shared hit=9736
 Planning Time: 0.090 ms
 Execution Time: 185.950 ms
(17 rows)

[Edit 2] Partial output of \d+ public."foo":

%%%%=> \d+ public."foo";
                                                       Table "public.foo"
    Column     |              Type              | Collation | Nullable |      Default      | Storage  | Stats target | Description
---------------+--------------------------------+-----------+----------+-------------------+----------+--------------+-------------
 id            | text                           |           | not null |                   | extended |              |
 createdAt     | timestamp(3) without time zone |           | not null | CURRENT_TIMESTAMP | plain    |              |
 email         | text                           |           | not null |                   | extended |              |
 message       | text                           |           | not null |                   | extended |              |

Indexes:
    "foo_pkey" PRIMARY KEY, btree (id)
    "foo_createdAt_idx" btree ("createdAt")
    "foo_email_idx" gin (email gin_trgm_ops)
    "foo_message_idx" gin (message gin_trgm_ops)
Access method: heap

Answer 1

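PostgreSQL has the case-insensitive ILIKE operator, which trigram indexes support natively. LOWER(.) LIKE LOWER(.), on the other hand, is not natively supported by an index built on the bare column. You could build an expression index to support it, but there is little point, since ILIKE is shorter, faster, and (in my opinion) easier to understand.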
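As a minimal sketch of that suggestion, assuming the table and index definitions shown in the \d+ output in the question, the query could be rewritten with ILIKE so that the existing gin_trgm_ops indexes become candidates for the planner. The two expression indexes at the end illustrate the alternative, relevant only if the LOWER(...) LIKE form has to stay; their names are hypothetical.

```sql
-- Rewritten with ILIKE: case-insensitive matching directly on the indexed
-- columns, so the existing trigram indexes (foo_email_idx, foo_message_idx)
-- can be considered by the planner.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM public."foo"
WHERE email ILIKE '%acme.com%'
  AND message ILIKE '%acme.com%';

-- Alternative, only if the LOWER(...) LIKE LOWER(...) form must be kept:
-- expression indexes on the lowercased columns.
-- The index names below are hypothetical.
CREATE INDEX foo_email_lower_trgm_idx
    ON public."foo" USING gin (lower(email) gin_trgm_ops);
CREATE INDEX foo_message_lower_trgm_idx
    ON public."foo" USING gin (lower(message) gin_trgm_ops);
```

Whether the planner actually picks a bitmap scan over the trigram indexes still depends on its selectivity estimates, so comparing this EXPLAIN output with the Parallel Seq Scan plan in the question is the way to confirm the effect.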
