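I am using a PostgreSQL (15.5) database with the following logic:
We have "meters" that measure electricity consumption. We also have "computed meters" that combine the measurements of several meters, for example "computedMeterA" = "meterA" + "meterB".
I have the classic "dimension" tables for meters (2,500 records) and for computed meters, containing id, name, ..., and a fact_consumption table holding hourly data for each meter (about 20 million records).
The composition of each computed meter is stored in a parameter table with the following structure: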
create table param_computed_meters (
id integer not null
constraint param_computed_meters_pk
primary key,
id_computed_meter integer
constraint param_computed_meters_id_computed_meter_fk
references powerbi.dim_computed_meter
on delete cascade,
id_meter integer,
factor double precision );
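As an illustration of how the "computedMeterA" = "meterA" + "meterB" example might be stored in this table, here is a minimal sketch. The meter ids 10 and 11 and the row ids 1 and 2 are hypothetical, the computed meter id 71 is borrowed from the queries in this post, and a matching row is assumed to already exist in dim_computed_meter so that the foreign key is satisfied.

```sql
-- Hypothetical rows: computed meter 71 = meter 10 + meter 11.
-- All ids here are made up for illustration, not taken from the real data.
INSERT INTO param_computed_meters (id, id_computed_meter, id_meter, factor)
VALUES (1, 71, 10, 1.0),
       (2, 71, 11, 1.0);
```

Since the view multiplies each measurement by factor before summing, a factor of -1.0 would presumably subtract a meter instead of adding it.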
I created a view that calculates the consumption of the computed meters, as follows:
create view v_computed_consumption
(ts, id_computed_meter, conso_with_factor, available_meters) as
SELECT fact.ts,
params.id_computed_meter,
sum(fact.gross_consumption * compteur.conversion_factor * params.factor) AS conso_with_factor
FROM param_computed_meters params
JOIN fact_consumption fact ON params.id_meter = fact.id_meter
JOIN dim_meter compteur ON params.id_meter = compteur.id_meter
JOIN dim_computed_meter cm ON cm.id_computed_meter = params.id_computed_meter
GROUP BY fact.ts, params.id_site, params.id_computed_meter;
This is where it gets strange:
When I join my view (the fact) to the computed-meter dimension like this, performance is excellent (< 1 sec):
select *
from v_computed_consumption v
join dim_computed_meter cm
on (v.id_computed_meter = cm.id_computed_meter)
where cm.id_computed_meter = 71
order by ts desc;
When I filter on the computed meter's name instead of its id, performance drops drastically (about 1 minute...):
select *
from v_computed_consumption v
join dim_computed_meter cm
on (v.id_computed_meter = cm.id_computed_meter)
where cm.computed_meter_name = 'General Water D'
order by ts desc;
So I tried to make it "less dumb", and performance is great again:
select *
from v_computed_consumption v
join dim_computed_meter cm
on (v.id_computed_meter = cm.id_computed_meter)
where cm.id_computed_meter =
(select id_computed_meter
from dim_computed_meter
where computed_meter_name = 'General Water D')
order by ts desc;
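But this solution does not work with the star schema I use in Power BI to display the data.
Here are the explain plans for the first and second queries. I can see that the first one makes good use of the indexes and the second one does not, but I don't know why or how to fix it.
Thanks if you have made it this far; any help or hint would be great :)
Note: I do use TimescaleDB and my table is a hypertable (you may notice this in the explain plans), but I converted the table to a "regular" table and hit exactly the same problem, so I'm fairly sure it is unrelated.
First explain plan (good performance):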
Sort (cost=1305.72..1306.72 rows=400 width=77)
Sort Key: fact_1.ts DESC
-> Nested Loop (cost=1274.21..1288.43 rows=400 width=77)
-> Index Scan using dim_computed_meter_id_computed_meter_index on dim_computed_meter cm (cost=0.27..2.49 rows=1 width=53)
Index Cond: (id_computed_meter = 71)
-> HashAggregate (cost=1273.94..1277.94 rows=400 width=24)
Group Key: fact_1.ts, params.id_site, params.id_computed_meter
-> Nested Loop (cost=0.98..1020.71 rows=25323 width=24)
-> Nested Loop (cost=0.55..5.87 rows=2 width=12)
-> Index Only Scan using dim_computed_meter_id_computed_meter_index on dim_computed_meter cm_1 (cost=0.27..2.49 rows=1 width=8)
Index Cond: (id_computed_meter = 71)
-> Index Scan using idx_params_id_computed_meter on param_computed_meters params (cost=0.28..3.36 rows=2 width=12)
Index Cond: (id_computed_meter = 71)
-> Append (cost=0.42..406.00 rows=10142 width=20)
-> Index Only Scan using _hyper_1_1_chunk_fact_consumption_id_meter_ts_index on _hyper_1_1_chunk fact_1 (cost=0.42..12.73 rows=326 width=20)
Index Cond: (id_meter = params.id_meter)
-> Index Only Scan using _hyper_1_2_chunk_fact_consumption_id_meter_ts_index on _hyper_1_2_chunk fact_2 (cost=0.42..29.72 rows=511 width=20)
Index Cond: (id_meter = params.id_meter)
-> Index Only Scan using _hyper_1_3_chunk_fact_consumption_id_meter_ts_index on _hyper_1_3_chunk fact_3 (cost=0.42..14.39 rows=515 width=20)
Index Cond: (id_meter = params.id_meter)
-> Index Only Scan using _hyper_1_4_chunk_fact_consumption_id_meter_ts_index on _hyper_1_4_chunk fact_4 (cost=0.42..17.87 rows=557 width=20)
Index Cond: (id_meter = params.id_meter)
-> Index Only Scan using _hyper_1_5_chunk_fact_consumption_id_meter_ts_index on _hyper_1_5_chunk fact_5 (cost=0.42..18.14 rows=541 width=20)
Index Cond: (id_meter = params.id_meter)
-> Index Only Scan using _hyper_1_6_chunk_fact_consumption_id_meter_ts_index on _hyper_1_6_chunk fact_6 (cost=0.42..20.96 rows=545 width=20)
Index Cond: (id_meter = params.id_meter)
-> Index Only Scan using _hyper_1_7_chunk_fact_consumption_id_meter_ts_index on _hyper_1_7_chunk fact_7 (cost=0.42..15.46 rows=545 width=20)
Index Cond: (id_meter = params.id_meter)
-> Index Only Scan using _hyper_1_8_chunk_fact_consumption_id_meter_ts_index on _hyper_1_8_chunk fact_8 (cost=0.42..14.86 rows=542 width=20)
Index Cond: (id_meter = params.id_meter)
-> Index Only Scan using _hyper_1_9_chunk_fact_consumption_id_meter_ts_index on _hyper_1_9_chunk fact_9 (cost=0.42..14.40 rows=547 width=20)
Index Cond: (id_meter = params.id_meter)
-> Index Only Scan using _hyper_1_10_chunk_fact_consumption_id_meter_ts_index on _hyper_1_10_chunk fact_10 (cost=0.42..14.43 rows=549 width=20)
Index Cond: (id_meter = params.id_meter)
-> Index Only Scan using _hyper_1_11_chunk_fact_consumption_id_meter_ts_index on _hyper_1_11_chunk fact_11 (cost=0.42..14.22 rows=537 width=20)
Index Cond: (id_meter = params.id_meter)
-> Index Only Scan using _hyper_1_12_chunk_fact_consumption_id_meter_ts_index on _hyper_1_12_chunk fact_12 (cost=0.42..24.57 rows=531 width=20)
Index Cond: (id_meter = params.id_meter)
-> Index Only Scan using _hyper_1_13_chunk_fact_consumption_id_meter_ts_index on _hyper_1_13_chunk fact_13 (cost=0.42..17.59 rows=541 width=20)
Index Cond: (id_meter = params.id_meter)
-> Index Only Scan using _hyper_1_14_chunk_fact_consumption_id_meter_ts_index on _hyper_1_14_chunk fact_14 (cost=0.42..14.72 rows=534 width=20)
Index Cond: (id_meter = params.id_meter)
-> Index Only Scan using _hyper_1_15_chunk_fact_consumption_id_meter_ts_index on _hyper_1_15_chunk fact_15 (cost=0.42..18.22 rows=514 width=20)
Index Cond: (id_meter = params.id_meter)
-> Index Only Scan using _hyper_1_16_chunk_fact_consumption_id_meter_ts_index on _hyper_1_16_chunk fact_16 (cost=0.42..16.59 rows=515 width=20)
Index Cond: (id_meter = params.id_meter)
-> Index Only Scan using _hyper_1_17_chunk_fact_consumption_id_meter_ts_index on _hyper_1_17_chunk fact_17 (cost=0.42..22.60 rows=513 width=20)
Index Cond: (id_meter = params.id_meter)
-> Index Only Scan using _hyper_1_18_chunk_fact_consumption_id_meter_ts_index on _hyper_1_18_chunk fact_18 (cost=0.42..14.42 rows=517 width=20)
Index Cond: (id_meter = params.id_meter)
-> Index Only Scan using _hyper_1_19_chunk_fact_consumption_id_meter_ts_index on _hyper_1_19_chunk fact_19 (cost=0.42..30.09 rows=501 width=20)
Index Cond: (id_meter = params.id_meter)
-> Index Only Scan using _hyper_1_20_chunk_fact_consumption_id_meter_ts_index on _hyper_1_20_chunk fact_20 (cost=0.42..8.27 rows=260 width=20)
Index Cond: (id_meter = params.id_meter)
-> Seq Scan on _hyper_1_21_chunk fact_21 (cost=0.00..1.02 rows=1 width=20)
Filter: (params.id_meter = id_meter)
Second explain plan (poor performance):
Sort (cost=971085.79..971086.29 rows=202 width=77)
Sort Key: fact_10.ts DESC
-> Hash Join (cost=968813.31..971078.05 rows=202 width=77)
Hash Cond: (params.id_computed_meter = cm.id_computed_meter)
-> Finalize HashAggregate (cost=968810.80..969810.80 rows=100000 width=24)
Group Key: fact_10.ts, params.id_site, params.id_computed_meter
-> Gather (cost=956810.80..967810.80 rows=100000 width=24)
Workers Planned: 1
-> Partial HashAggregate (cost=955810.80..956810.80 rows=100000 width=24)
Group Key: fact_10.ts, params.id_site, params.id_computed_meter
-> Hash Join (cost=128.33..747935.06 rows=20787574 width=24)
Hash Cond: (fact_10.id_meter = params.id_meter)
-> Parallel Append (cost=0.00..335620.98 rows=10896533 width=20)
-> Parallel Seq Scan on _hyper_1_10_chunk fact_10 (cost=0.00..15472.43 rows=600843 width=20)
-> Parallel Seq Scan on _hyper_1_9_chunk fact_9 (cost=0.00..15402.08 rows=598408 width=20)
-> Parallel Seq Scan on _hyper_1_7_chunk fact_7 (cost=0.00..15353.64 rows=595564 width=20)
-> Parallel Seq Scan on _hyper_1_13_chunk fact_13 (cost=0.00..15346.82 rows=593582 width=20)
-> Parallel Seq Scan on _hyper_1_11_chunk fact_11 (cost=0.00..15267.98 rows=592698 width=20)
-> Parallel Seq Scan on _hyper_1_6_chunk fact_6 (cost=0.00..15226.18 rows=590818 width=20)
-> Parallel Seq Scan on _hyper_1_8_chunk fact_8 (cost=0.00..15206.71 rows=590471 width=20)
-> Parallel Seq Scan on _hyper_1_12_chunk fact_12 (cost=0.00..15054.48 rows=583948 width=20)
-> Parallel Seq Scan on _hyper_1_14_chunk fact_14 (cost=0.00..15017.74 rows=582974 width=20)
-> Parallel Seq Scan on _hyper_1_5_chunk fact_5 (cost=0.00..14962.39 rows=579139 width=20)
-> Parallel Seq Scan on _hyper_1_4_chunk fact_4 (cost=0.00..14694.30 rows=569030 width=20)
-> Parallel Seq Scan on _hyper_1_18_chunk fact_18 (cost=0.00..14385.80 rows=558680 width=20)
-> Parallel Seq Scan on _hyper_1_19_chunk fact_19 (cost=0.00..14321.08 rows=553408 width=20)
-> Parallel Seq Scan on _hyper_1_17_chunk fact_17 (cost=0.00..14310.16 rows=550516 width=20)
-> Parallel Seq Scan on _hyper_1_16_chunk fact_16 (cost=0.00..14110.21 rows=547621 width=20)
-> Parallel Seq Scan on _hyper_1_15_chunk fact_15 (cost=0.00..13918.32 rows=539832 width=20)
-> Parallel Seq Scan on _hyper_1_3_chunk fact_3 (cost=0.00..13428.45 rows=521645 width=20)
-> Parallel Seq Scan on _hyper_1_2_chunk fact_2 (cost=0.00..13373.35 rows=518235 width=20)
-> Parallel Seq Scan on _hyper_1_1_chunk fact_1 (cost=0.00..9161.19 rows=353219 width=20)
-> Parallel Seq Scan on _hyper_1_20_chunk fact_20 (cost=0.00..7124.01 rows=275901 width=20)
-> Parallel Seq Scan on _hyper_1_21_chunk fact_21 (cost=0.00..1.01 rows=1 width=20)
-> Hash (cost=93.44..93.44 rows=2791 width=12)
-> Hash Join (cost=40.14..93.44 rows=2791 width=12)
Hash Cond: (params.id_computed_meter = cm_1.id_computed_meter)
-> Seq Scan on param_computed_meters params (cost=0.00..45.91 rows=2791 width=12)
-> Hash (cost=33.95..33.95 rows=495 width=8)
-> Seq Scan on dim_computed_meter cm_1 (cost=0.00..33.95 rows=495 width=8)
-> Hash (cost=2.49..2.49 rows=1 width=53)
-> Index Scan using dim_computed_meter_computed_meter_name_index_2 on dim_computed_meter cm (cost=0.27..2.49 rows=1 width=53)
Index Cond: (computed_meter_name = 'General Water D'::text)
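Both plans above show planner estimates only (cost and rows), not what actually happened at run time. A minimal sketch of how the slow query could be re-run to capture actual row counts, timings and buffer usage, using the view and filter from this post:

```sql
-- EXPLAIN ANALYZE really executes the query, so expect it to take
-- about as long as the slow query itself (roughly a minute here).
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM v_computed_consumption v
JOIN dim_computed_meter cm
  ON v.id_computed_meter = cm.id_computed_meter
WHERE cm.computed_meter_name = 'General Water D'
ORDER BY ts DESC;
```

Comparing the estimated and actual row counts at each node usually shows where the planner's assumptions diverge from reality.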
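I tried SET enable_seqscan = OFF and SET enable_nestloop = false. I also tried running ANALYZE and tweaking random_page_cost and effective_cache_size.
I tried getting help on a DBA forum, without success so far.
As explained above, this sums up the problem: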
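For reference, experiments of this kind can be kept local to one session, roughly as sketched below. The numeric values are assumptions for illustration only, not the values that were actually tried.

```sql
-- Session-local planner experiments; nothing here persists after RESET
-- or after the session ends.
SET enable_seqscan = off;
SET random_page_cost = 1.1;        -- assumed value, often used for SSD storage
SET effective_cache_size = '8GB';  -- assumed value, depends on the server's RAM

-- Refresh planner statistics for the tables used by the view.
ANALYZE fact_consumption;
ANALYZE param_computed_meters;
ANALYZE dim_computed_meter;
ANALYZE dim_meter;

-- ... re-run the slow query under EXPLAIN here ...

RESET enable_seqscan;
RESET random_page_cost;
RESET effective_cache_size;
```

Note that the fast plan shown earlier is built from nested loops, so setting enable_nestloop to off steers the planner away from that plan shape rather than toward it.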
select *
from v_computed_consumption2 v
join dim_computed_meter cm
on (v.id_computed_meter = cm.id_computed_meter)
where cm.id_computed_meter = 71
order by ts desc; -- < 1 sec result
select * from v_computed_consumption2 v
join dim_computed_meter cm
on (v.id_computed_meter = cm.id_computed_meter)
where cm.id_computed_meter =
(select id_computed_meter
from dim_computed_meter
where computed_meter_name = 'General Water D')
order by ts desc; -- < 1 sec result
select * from
v_computed_consumption2 v
join dim_computed_meter cm
on (v.id_computed_meter = cm.id_computed_meter)
where cm.computed_meter_name = 'General Water D'
order by ts desc; -- > 60 seconds result
Please try this:
select *
from v_computed_consumption v
join dim_computed_meter cm
on (cm.computed_meter_name = 'General Water D'
and v.id_computed_meter = cm.id_computed_meter)
order by ts desc;
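If moving the filter into the ON clause does not change the plan (for an inner join, PostgreSQL generally treats ON and WHERE conditions alike), another option to experiment with is a LATERAL subquery. One likely explanation, consistent with the two plans in the question, is that the planner can push a constant comparison on a GROUP BY column into the view, but cannot push a join condition against another table into the aggregation. The sketch below inlines the view's aggregation so that the computed meter id arrives as a parameter for each matching dimension row. It assumes the view definition shown in the question, including the id_site column that appears in its GROUP BY. It has not been run against this schema, and because it bypasses the view it may not fit the Power BI star schema either.

```sql
-- Sketch only: the aggregation from v_computed_consumption, rewritten as a
-- LATERAL subquery parameterized by the dimension row selected by name.
SELECT cm.*, c.ts, c.conso_with_factor
FROM dim_computed_meter cm
CROSS JOIN LATERAL (
    SELECT fact.ts,
           sum(fact.gross_consumption * compteur.conversion_factor * params.factor)
               AS conso_with_factor
    FROM param_computed_meters params
    JOIN fact_consumption fact ON params.id_meter = fact.id_meter
    JOIN dim_meter compteur ON params.id_meter = compteur.id_meter
    WHERE params.id_computed_meter = cm.id_computed_meter
    GROUP BY fact.ts, params.id_site
) c
WHERE cm.computed_meter_name = 'General Water D'
ORDER BY c.ts DESC;
```

Running this under EXPLAIN should show whether param_computed_meters and the fact table chunks are reached through index scans again, as in the fast plan.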