Query performance problems after enabling compression in TimescaleDB

Problem description

We have been experimenting with TimescaleDB, mainly using compression to address disk-space issues. However, we noticed that query execution time increased after enabling compression.

SELECT
  *
FROM
  actions
WHERE
  task = '5uTTiGqdqLKUFwo6WXTC1V8tM6PekW6gmDGaKrNPPfNz'
  AND kind = 'CALL'
  AND args ->> 'method_name' = 'open'

# Plain PostgreSQL table
Planning Time: 1.057 ms
Execution Time: 0.088 ms

# Hypertable with compression enabled
Planning Time: 33.903 ms
Execution Time: 1860.641 ms

Table schema:

Table "public.actions"
Column     |    Type     | Collation | Nullable | Default
-------------------------------------+-------------+-----------+----------+---------
task       | text        |           | not null |
index      | integer     |           | not null |
from       | text        |           |          |
to         | text        |           |          |
created_at | bigint      |           | not null |
kind       | action_kind |           |          |
args       | jsonb       |           |          |
Indexes:
"actions_created_at_idx" btree (created_at DESC)
"actions_task_args_method_name_idx" btree (task, (args ->> 'method_name'::text)) WHERE kind = 'CALL'::kind
"actions_task_index_idx" UNIQUE CONSTRAINT, btree (task, index, created_at)

Queries used to create the hypertable and set up the compression policy:

SELECT
  create_hypertable(
    'actions',
    'created_at',
    migrate_data => true,
    chunk_time_interval => 604800000000000
  );

ALTER TABLE
  actions
SET
  (timescaledb.compress);

SELECT
  add_compression_policy(
    'actions',
    BIGINT '604800000000000'
  );
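As a sanity check after setup, the TimescaleDB 2.x informational views and stats functions can confirm how compression is actually configured and how much space it saves (a sketch; exact column names vary by version):

```sql
-- Which columns, if any, are used for segmentby/orderby
SELECT *
FROM timescaledb_information.compression_settings
WHERE hypertable_name = 'actions';

-- Which chunks are compressed, and their before/after sizes
SELECT chunk_name,
       before_compression_total_bytes,
       after_compression_total_bytes
FROM chunk_compression_stats('actions');
```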

Query plan:

QUERY PLAN
Append  (cost=0.04..64086.09 rows=12709000 width=14) (actual time=0.470..1859.661 rows=1 loops=1)
  Buffers: shared hit=276533 read=28627
  I/O Timings: shared/local read=153.380
  ->  Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_3_25_chunk  (cost=0.04..12.90 rows=290000 width=10) (actual time=0.149..0.150 rows=0 loops=1)
        Output: _hyper_3_25_chunk.to
        Filter: ((_hyper_3_25_chunk.task = '5uTTiGqdqLKUFwo6WXTC1V8tM6PekW6gmDGaKrNPPfNz'::text) AND (_hyper_3_25_chunk.kind = 'CALL'::kind) AND ((_hyper_3_25_chunk.args ->> 'method_name'::text) = 'open'::text))
        Rows Removed by Filter: 42
        Buffers: shared hit=34
        ->  Seq Scan on _timescaledb_internal.compress_hyper_5_69_chunk  (cost=0.00..12.90 rows=290 width=132) (actual time=0.008..0.009 rows=1 loops=1)
              Output: compress_hyper_5_69_chunk.task, compress_hyper_5_69_chunk.index, compress_hyper_5_69_chunk.from, compress_hyper_5_69_chunk.to, compress_hyper_5_69_chunk.created_at, compress_hyper_5_69_chunk.kind, compress_hyper_5_69_chunk.args, compress_hyper_5_69_chunk._ts_meta_count, compress_hyper_5_69_chunk._ts_meta_sequence_num, compress_hyper_5_69_chunk._ts_meta_min_1, compress_hyper_5_69_chunk._ts_meta_max_1
              Buffers: shared hit=1

  ...

  ->  Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_3_68_chunk  (cost=0.03..7.13 rows=213000 width=18) (actual time=132.736..132.736 rows=0 loops=1)
        Output: _hyper_3_68_chunk.to
        Filter: ((_hyper_3_68_chunk.task = '5uTTiGqdqLKUFwo6WXTC1V8tM6PekW6gmDGaKrNPPfNz'::text) AND (_hyper_3_68_chunk.kind = 'CALL'::kind) AND ((_hyper_3_68_chunk.args ->> 'method_name'::text) = 'open'::text))
        Rows Removed by Filter: 212419
        Buffers: shared hit=24692
        ->  Seq Scan on _timescaledb_internal.compress_hyper_5_112_chunk  (cost=0.00..7.13 rows=213 width=132) (actual time=0.010..0.087 rows=213 loops=1)
              Output: compress_hyper_5_112_chunk.task, compress_hyper_5_112_chunk.index, compress_hyper_5_112_chunk.from, compress_hyper_5_112_chunk.to, compress_hyper_5_112_chunk.created_at, compress_hyper_5_112_chunk.kind, compress_hyper_5_112_chunk.args, compress_hyper_5_112_chunk._ts_meta_count, compress_hyper_5_112_chunk._ts_meta_sequence_num, compress_hyper_5_112_chunk._ts_meta_min_1, compress_hyper_5_112_chunk._ts_meta_max_1
              Buffers: shared hit=5
Query Identifier: -1707176995114799452
Planning:
  Buffers: shared hit=12972 dirtied=1
Planning Time: 33.903 ms
Execution Time: 1860.641 ms

We notice that the execution plan decompresses the chunks. Is that the problem? Does anyone have suggestions on how to improve the query execution time?

postgresql compression database-performance timescaledb
1 Answer

To avoid decompressing all the data just to filter on a few columns, add those columns to the `segmentby` option.

Always remember to configure `segmentby` and `orderby` so that later queries perform better.

ALTER TABLE
  actions
SET
  (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'task, kind'
  );
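Note that changed compression settings only apply to chunks compressed afterwards; chunks that are already compressed must be decompressed first so the policy can recompress them with the new `segmentby`. A hedged sketch using the standard TimescaleDB functions:

```sql
-- Decompress all existing chunks of the hypertable; the second argument
-- makes decompress_chunk() a no-op on chunks that are not compressed
SELECT decompress_chunk(c, true)
FROM show_chunks('actions') AS c;
```

After this, the existing `add_compression_policy` job (or a manual `compress_chunk` call) will recompress the chunks segmented by `task, kind`.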

And since the filter

(_hyper_3_68_chunk.args ->> 'method_name'::text)

always seems to equal `'open'`, a partial index might work even better.
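Such a partial index might look like the sketch below (the index name is illustrative). Keep in mind that on a hypertable it only serves the uncompressed chunks; for compressed chunks the `segmentby` metadata does the filtering.

```sql
-- Hypothetical partial index matching the query's constant filter
CREATE INDEX actions_task_open_idx
  ON actions (task)
  WHERE kind = 'CALL'
    AND (args ->> 'method_name') = 'open';
```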
