当 where 子句不在 ID 上时,PostgreSQL 查询速度变慢

问题描述 投票:0回答:1

我正在使用具有以下逻辑的 postgresql (15.5) 数据库:

我们有测量用电量的“仪表”。 我们有“计算机仪表”,可以计算不同仪表的测量值。例如“compatedMeterA”=“meterA”+“meterB”

我有经典的计量表“维度”(2500 条记录)和计算机计量表,其中包含 id、名称……以及包含每个计量表每小时数据的事实_消耗。 (约2000万条记录)

计算的仪表组合存储在参数表中,其结构如下:

create table param_computed_meters (
    id                integer not null
        constraint param_computed_meters_pk
            primary key,
    id_computed_meter integer
        constraint param_computed_meters_id_computed_meter_fk
            references powerbi.dim_computed_meter
            on delete cascade,
    id_meter          integer,
    factor            double precision );

我创建了一个视图,用于计算计算仪表的消耗量,如下所示:

create view v_computed_consumption
            (ts, id_computed_meter, conso_with_factor, available_meters) as
SELECT fact.ts,
       params.id_computed_meter,
       sum(fact.gross_consumption * compteur.conversion_factor * params.factor) AS conso_with_factor
FROM param_computed_meters params
         JOIN fact_consumption fact ON params.id_meter = fact.id_meter
         JOIN dim_meter compteur ON params.id_meter = compteur.id_meter
         JOIN dim_computed_meter cm ON cm.id_computed_meter = params.id_computed_meter
GROUP BY fact.ts, params.id_site, params.id_computed_meter;

所以这就是奇怪的时候:

当我像这样将我的视图(事实)加入到计算仪表维度时,我获得了出色的性能(<1 sec):

  select * 
  from v_computed_consumption v 
  join dim_computed_meter cm 
     on (v.id_computed_meter = cm.id_computed_meter) 
  where cm.id_computed_meter = 71 
  order by ts desc;

当我过滤计算仪表的名称而不是 id 时,性能急剧下降(大约 1 分钟...):

  select * 
  from v_computed_consumption v 
  join dim_computed_meter cm 
    on (v.id_computed_meter = cm.id_computed_meter) 
  where cm.computed_meter_name = 'General Water D' 
  order by ts desc;

所以我试着让它“不那么愚蠢”,性能再次很棒:

  select * 
  from v_computed_consumption v 
  join dim_computed_meter cm 
      on (v.id_computed_meter = cm.id_computed_meter) 
  where cm.id_computed_meter =
             (select id_computed_meter 
              from dim_computed_meter 
              where computed_meter_name = 'General Water D') 
  order by ts desc;

但是这个解决方案不适用于我在 powerbi 中用于显示数据的星型架构结构。

这是第一个和第二个查询的解释计划。我知道第一个很好地利用了索引,第二个没有,但不知道为什么以及如何修复它。

谢谢,如果您已经走到这一步,任何帮助或提示都会很棒:)

注意:我确实使用 timescaledb 并且我的表是一个超表(您可能会在解释计划中看到它),但我将表转换为“常规”表并遇到了完全相同的问题,所以我很确定它不相关.

先说明计划(表现良好)

Sort  (cost=1305.72..1306.72 rows=400 width=77)
  Sort Key: fact_1.ts DESC
  ->  Nested Loop  (cost=1274.21..1288.43 rows=400 width=77)
        ->  Index Scan using dim_computed_meter_id_computed_meter_index on dim_computed_meter cm  (cost=0.27..2.49 rows=1 width=53)
              Index Cond: (id_computed_meter = 71)
        ->  HashAggregate  (cost=1273.94..1277.94 rows=400 width=24)
"              Group Key: fact_1.ts, params.id_site, params.id_computed_meter"
              ->  Nested Loop  (cost=0.98..1020.71 rows=25323 width=24)
                    ->  Nested Loop  (cost=0.55..5.87 rows=2 width=12)
                          ->  Index Only Scan using dim_computed_meter_id_computed_meter_index on dim_computed_meter cm_1  (cost=0.27..2.49 rows=1 width=8)
                                Index Cond: (id_computed_meter = 71)
                          ->  Index Scan using idx_params_id_computed_meter on param_computed_meters params  (cost=0.28..3.36 rows=2 width=12)
                                Index Cond: (id_computed_meter = 71)
                    ->  Append  (cost=0.42..406.00 rows=10142 width=20)
                          ->  Index Only Scan using _hyper_1_1_chunk_fact_consumption_id_meter_ts_index on _hyper_1_1_chunk fact_1  (cost=0.42..12.73 rows=326 width=20)
                                Index Cond: (id_meter = params.id_meter)
                          ->  Index Only Scan using _hyper_1_2_chunk_fact_consumption_id_meter_ts_index on _hyper_1_2_chunk fact_2  (cost=0.42..29.72 rows=511 width=20)
                                Index Cond: (id_meter = params.id_meter)
                          ->  Index Only Scan using _hyper_1_3_chunk_fact_consumption_id_meter_ts_index on _hyper_1_3_chunk fact_3  (cost=0.42..14.39 rows=515 width=20)
                                Index Cond: (id_meter = params.id_meter)
                          ->  Index Only Scan using _hyper_1_4_chunk_fact_consumption_id_meter_ts_index on _hyper_1_4_chunk fact_4  (cost=0.42..17.87 rows=557 width=20)
                                Index Cond: (id_meter = params.id_meter)
                          ->  Index Only Scan using _hyper_1_5_chunk_fact_consumption_id_meter_ts_index on _hyper_1_5_chunk fact_5  (cost=0.42..18.14 rows=541 width=20)
                                Index Cond: (id_meter = params.id_meter)
                          ->  Index Only Scan using _hyper_1_6_chunk_fact_consumption_id_meter_ts_index on _hyper_1_6_chunk fact_6  (cost=0.42..20.96 rows=545 width=20)
                                Index Cond: (id_meter = params.id_meter)
                          ->  Index Only Scan using _hyper_1_7_chunk_fact_consumption_id_meter_ts_index on _hyper_1_7_chunk fact_7  (cost=0.42..15.46 rows=545 width=20)
                                Index Cond: (id_meter = params.id_meter)
                          ->  Index Only Scan using _hyper_1_8_chunk_fact_consumption_id_meter_ts_index on _hyper_1_8_chunk fact_8  (cost=0.42..14.86 rows=542 width=20)
                                Index Cond: (id_meter = params.id_meter)
                          ->  Index Only Scan using _hyper_1_9_chunk_fact_consumption_id_meter_ts_index on _hyper_1_9_chunk fact_9  (cost=0.42..14.40 rows=547 width=20)
                                Index Cond: (id_meter = params.id_meter)
                          ->  Index Only Scan using _hyper_1_10_chunk_fact_consumption_id_meter_ts_index on _hyper_1_10_chunk fact_10  (cost=0.42..14.43 rows=549 width=20)
                                Index Cond: (id_meter = params.id_meter)
                          ->  Index Only Scan using _hyper_1_11_chunk_fact_consumption_id_meter_ts_index on _hyper_1_11_chunk fact_11  (cost=0.42..14.22 rows=537 width=20)
                                Index Cond: (id_meter = params.id_meter)
                          ->  Index Only Scan using _hyper_1_12_chunk_fact_consumption_id_meter_ts_index on _hyper_1_12_chunk fact_12  (cost=0.42..24.57 rows=531 width=20)
                                Index Cond: (id_meter = params.id_meter)
                          ->  Index Only Scan using _hyper_1_13_chunk_fact_consumption_id_meter_ts_index on _hyper_1_13_chunk fact_13  (cost=0.42..17.59 rows=541 width=20)
                                Index Cond: (id_meter = params.id_meter)
                          ->  Index Only Scan using _hyper_1_14_chunk_fact_consumption_id_meter_ts_index on _hyper_1_14_chunk fact_14  (cost=0.42..14.72 rows=534 width=20)
                                Index Cond: (id_meter = params.id_meter)
                          ->  Index Only Scan using _hyper_1_15_chunk_fact_consumption_id_meter_ts_index on _hyper_1_15_chunk fact_15  (cost=0.42..18.22 rows=514 width=20)
                                Index Cond: (id_meter = params.id_meter)
                          ->  Index Only Scan using _hyper_1_16_chunk_fact_consumption_id_meter_ts_index on _hyper_1_16_chunk fact_16  (cost=0.42..16.59 rows=515 width=20)
                                Index Cond: (id_meter = params.id_meter)
                          ->  Index Only Scan using _hyper_1_17_chunk_fact_consumption_id_meter_ts_index on _hyper_1_17_chunk fact_17  (cost=0.42..22.60 rows=513 width=20)
                                Index Cond: (id_meter = params.id_meter)
                          ->  Index Only Scan using _hyper_1_18_chunk_fact_consumption_id_meter_ts_index on _hyper_1_18_chunk fact_18  (cost=0.42..14.42 rows=517 width=20)
                                Index Cond: (id_meter = params.id_meter)
                          ->  Index Only Scan using _hyper_1_19_chunk_fact_consumption_id_meter_ts_index on _hyper_1_19_chunk fact_19  (cost=0.42..30.09 rows=501 width=20)
                                Index Cond: (id_meter = params.id_meter)
                          ->  Index Only Scan using _hyper_1_20_chunk_fact_consumption_id_meter_ts_index on _hyper_1_20_chunk fact_20  (cost=0.42..8.27 rows=260 width=20)
                                Index Cond: (id_meter = params.id_meter)
                          ->  Seq Scan on _hyper_1_21_chunk fact_21  (cost=0.00..1.02 rows=1 width=20)
                                Filter: (params.id_meter = id_meter)

第二个解释计划(表现不佳):

Sort  (cost=971085.79..971086.29 rows=202 width=77)
  Sort Key: fact_10.ts DESC
  ->  Hash Join  (cost=968813.31..971078.05 rows=202 width=77)
        Hash Cond: (params.id_computed_meter = cm.id_computed_meter)
        ->  Finalize HashAggregate  (cost=968810.80..969810.80 rows=100000 width=24)
"              Group Key: fact_10.ts, params.id_site, params.id_computed_meter"
              ->  Gather  (cost=956810.80..967810.80 rows=100000 width=24)
                    Workers Planned: 1
                    ->  Partial HashAggregate  (cost=955810.80..956810.80 rows=100000 width=24)
"                          Group Key: fact_10.ts, params.id_site, params.id_computed_meter"
                          ->  Hash Join  (cost=128.33..747935.06 rows=20787574 width=24)
                                Hash Cond: (fact_10.id_meter = params.id_meter)
                                ->  Parallel Append  (cost=0.00..335620.98 rows=10896533 width=20)
                                      ->  Parallel Seq Scan on _hyper_1_10_chunk fact_10  (cost=0.00..15472.43 rows=600843 width=20)
                                      ->  Parallel Seq Scan on _hyper_1_9_chunk fact_9  (cost=0.00..15402.08 rows=598408 width=20)
                                      ->  Parallel Seq Scan on _hyper_1_7_chunk fact_7  (cost=0.00..15353.64 rows=595564 width=20)
                                      ->  Parallel Seq Scan on _hyper_1_13_chunk fact_13  (cost=0.00..15346.82 rows=593582 width=20)
                                      ->  Parallel Seq Scan on _hyper_1_11_chunk fact_11  (cost=0.00..15267.98 rows=592698 width=20)
                                      ->  Parallel Seq Scan on _hyper_1_6_chunk fact_6  (cost=0.00..15226.18 rows=590818 width=20)
                                      ->  Parallel Seq Scan on _hyper_1_8_chunk fact_8  (cost=0.00..15206.71 rows=590471 width=20)
                                      ->  Parallel Seq Scan on _hyper_1_12_chunk fact_12  (cost=0.00..15054.48 rows=583948 width=20)
                                      ->  Parallel Seq Scan on _hyper_1_14_chunk fact_14  (cost=0.00..15017.74 rows=582974 width=20)
                                      ->  Parallel Seq Scan on _hyper_1_5_chunk fact_5  (cost=0.00..14962.39 rows=579139 width=20)
                                      ->  Parallel Seq Scan on _hyper_1_4_chunk fact_4  (cost=0.00..14694.30 rows=569030 width=20)
                                      ->  Parallel Seq Scan on _hyper_1_18_chunk fact_18  (cost=0.00..14385.80 rows=558680 width=20)
                                      ->  Parallel Seq Scan on _hyper_1_19_chunk fact_19  (cost=0.00..14321.08 rows=553408 width=20)
                                      ->  Parallel Seq Scan on _hyper_1_17_chunk fact_17  (cost=0.00..14310.16 rows=550516 width=20)
                                      ->  Parallel Seq Scan on _hyper_1_16_chunk fact_16  (cost=0.00..14110.21 rows=547621 width=20)
                                      ->  Parallel Seq Scan on _hyper_1_15_chunk fact_15  (cost=0.00..13918.32 rows=539832 width=20)
                                      ->  Parallel Seq Scan on _hyper_1_3_chunk fact_3  (cost=0.00..13428.45 rows=521645 width=20)
                                      ->  Parallel Seq Scan on _hyper_1_2_chunk fact_2  (cost=0.00..13373.35 rows=518235 width=20)
                                      ->  Parallel Seq Scan on _hyper_1_1_chunk fact_1  (cost=0.00..9161.19 rows=353219 width=20)
                                      ->  Parallel Seq Scan on _hyper_1_20_chunk fact_20  (cost=0.00..7124.01 rows=275901 width=20)
                                      ->  Parallel Seq Scan on _hyper_1_21_chunk fact_21  (cost=0.00..1.01 rows=1 width=20)
                                ->  Hash  (cost=93.44..93.44 rows=2791 width=12)
                                      ->  Hash Join  (cost=40.14..93.44 rows=2791 width=12)
                                            Hash Cond: (params.id_computed_meter = cm_1.id_computed_meter)
                                            ->  Seq Scan on param_computed_meters params  (cost=0.00..45.91 rows=2791 width=12)
                                            ->  Hash  (cost=33.95..33.95 rows=495 width=8)
                                                  ->  Seq Scan on dim_computed_meter cm_1  (cost=0.00..33.95 rows=495 width=8)
        ->  Hash  (cost=2.49..2.49 rows=1 width=53)
              ->  Index Scan using dim_computed_meter_computed_meter_name_index_2 on dim_computed_meter cm  (cost=0.27..2.49 rows=1 width=53)
                    Index Cond: (computed_meter_name = 'General Water D'::text)

尝试设置SET enable_seqscan = OFF并设置enable_nestloop = false; 尝试分析和摆弄 random_page_cost 和 effective_cache_size

尝试从 dba 论坛获得帮助,但尚未成功。

就像上面解释的那样,这完美地总结了问题:

select * 
from v_computed_consumption2 v 
join dim_computed_meter cm 
      on (v.id_computed_meter = cm.id_computed_meter) 
where cm.id_computed_meter = 71 
order by ts desc; -- < 1 sec result
select * from v_computed_consumption2 v 
join dim_computed_meter cm 
      on (v.id_computed_meter = cm.id_computed_meter) 
where cm.id_computed_meter = 
        (select id_computed_meter 
         from dim_computed_meter 
         where computed_meter_name = 'General Water D') 
order by ts desc; -- < 1 sec result
select * from 
v_computed_consumption2 v 
join dim_computed_meter cm 
     on (v.id_computed_meter = cm.id_computed_meter) 
where cm.computed_meter_name = 'General Water D' 
order by ts desc; -- > 60 seconds result
database postgresql performance indexing timescaledb
1个回答
0
投票

请尝试这个

select * 
  from v_computed_consumption v 
  join dim_computed_meter cm 
      on (cm.computed_meter_name = 'General Water D' 
          and v.id_computed_meter = cm.id_computed_meter) 
  order by ts desc;
© www.soinside.com 2019 - 2024. All rights reserved.