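I've noticed that when I add a filter such as the visit date, or an ORDER BY on ib.id, the query execution time increases significantly. It takes about 2 minutes to run. Here is the code: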
SELECT *
FROM (
SELECT ib.id,
ben.uid,
ib.pid,
ben.first_name,
ben.middle_name,
ben.last_name,
iv.visit_date,
ben.id AS beneficiary_id,
ben.mobile_number,
ben.age,
iv.is_pregnant,
itr.tested_date,
ben.hiv_status_id AS hiv_status,
ben.hiv_type_id AS hiv_type,
ben.date_of_birth,
itr.result_status,
ib.recent_visit_id AS visit_id,
ib.beneficiary_status,
ben.gender_id,
ib.is_active AS ictc_ben_is_active,
ib.is_deleted AS ictc_ben_is_deleted,
ib.deleted_reason AS ictc_ben_deleted_reason,
ib.deleted_reason_comment AS ictc_ben_deleted_reason_comment,
ib.facility_id AS registered_facility_id,
ibs.name AS beneficiary_status_desc,
hs.name AS hiv_status_desc,
ib.infant_code as infant_code
FROM soch.ictc_beneficiary ib
JOIN soch.beneficiary ben ON ib.beneficiary_id = ben.id
LEFT JOIN soch.ictc_test_result itr ON ib.current_test_result_id = itr.id
LEFT JOIN soch.ictc_visit iv ON ib.recent_visit_id = iv.id
LEFT JOIN soch.master_hiv_status hs ON ben.hiv_status_id = hs.id
LEFT JOIN soch.master_ictc_beneficiary_status ibs ON ib.beneficiary_status = ibs.id
where ib.facility_id = 13649 and ben.category_id <>1
and ben.is_active = true --and (ben.is_delete = false or ben.is_delete = null)
and (ben.benf_search_str like '%%' or ib.pid like '%%')
) AS ordered_data where ordered_data.visit_date >= '2024-04-15'
limit 10;
Here is the EXPLAIN ANALYZE output for the query:
Node Type    Entity                          Cost             Rows  Time       Condition
Limit        [NULL]                          2.26 - 2784.90   10    57874.052  [NULL]
Nested Loop  [NULL]                          2.26 - 33393.90  10    57874.037  [NULL]
Nested Loop  [NULL]                          2.26 - 33287.67  10    57873.995  [NULL]
Nested Loop  [NULL]                          2.26 - 33278.41  10    57873.941  [NULL]
Nested Loop  [NULL]                          1.69 - 32947.84  10    57686.039  [NULL]
Nested Loop  [NULL]                          1.13 - 32613.39  10    57599.678  [NULL]
Index Scan   ictc_beneficiary                0.56 - 9424.30   7068  14378.517  (facility_id = 13649)
Index Scan   ictc_visit                      0.56 - 2.76      0     6.114      (id = ib.recent_visit_id)
Index Scan   beneficiary                     0.56 - 2.78      1     8.631      (id = ib.beneficiary_id)
Index Scan   ictc_test_result                0.56 - 2.75      1     18.786     (id = ib.current_test_result_id)
Materialize  [NULL]                          0.00 - 1.07      1     0.003      [NULL]
Seq Scan     master_hiv_status               0.00 - 1.05      1     0.014      [NULL]
Materialize  [NULL]                          0.00 - 1.88      3     0.002      [NULL]
Seq Scan     master_ictc_beneficiary_status  0.00 - 1.59      3     0.007      [NULL]
I have indexes on both facility_id and visit_date. Any ideas? I've been struggling with this for many days.
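When you need to improve query speed, the DDL and the indexes matter.
Take the table soch.beneficiary as an example: is there an index that can be used for this expression from the WHERE clause?
ben.is_active = true
If no such index exists, every matching record of soch.beneficiary has to be joined into the result first, and the filter can only be applied afterwards (because only then is the value of is_active known). Such an index should contain both the is_active and id fields!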
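As a minimal sketch of such an index, assuming PostgreSQL (the plan output suggests it), that is_active is a boolean column, and that no comparable index exists yet. The index names are made up for illustration. The partial index is an alternative I am adding here, not something taken from the question, and it only pays off if a sizeable share of rows is inactive:

```sql
-- Hypothetical index name: covers the join key and the filter column together.
CREATE INDEX idx_beneficiary_id_is_active
    ON soch.beneficiary (id, is_active);

-- Alternative sketch: a partial index that only contains the active rows.
-- Worth trying instead of (not in addition to) the index above.
CREATE INDEX idx_beneficiary_active_id
    ON soch.beneficiary (id)
    WHERE is_active = true;
```

After creating one of them, re-run the EXPLAIN to check whether the planner actually uses it; if it does not, drop it again.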
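Another tip, for better readability: you can remove the expression from the WHERE clause and add it to the ON clause. Because this is an inner JOIN, the result stays the same. For the table soch.beneficiary this gives: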
SELECT
...
FROM soch.ictc_beneficiary ib
JOIN soch.beneficiary ben ON ib.beneficiary_id = ben.id AND ben.is_active = true
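Next up is:
ben.benf_search_str like '%%'
I think
ben.benf_search_str IS NOT NULL
would be faster? The pattern '%%' matches every non-NULL string, so both conditions select the same rows. (Again, an index on benf_search_str might improve this.)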
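To make that concrete, here is a trimmed-down sketch of the query with the search filter rewritten. It assumes benf_search_str and pid are text columns (they are used with LIKE in the question), and the column list is shortened purely for illustration. The joins and the remaining columns would stay as in the original:

```sql
-- Sketch only: shortened column list, other LEFT JOINs omitted.
SELECT ib.id,
       ben.uid,
       ib.pid
FROM soch.ictc_beneficiary ib
JOIN soch.beneficiary ben
  ON ib.beneficiary_id = ben.id
 AND ben.is_active = true
WHERE ib.facility_id = 13649
  AND ben.category_id <> 1
  -- LIKE '%%' is true for every non-NULL value, so IS NOT NULL keeps the same rows
  AND (ben.benf_search_str IS NOT NULL OR ib.pid IS NOT NULL)
LIMIT 10;
```

Whether this is measurably faster is something to check with EXPLAIN rather than assume. It mainly avoids running a pattern match on every row.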
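Tuning is a step-by-step process and needs to be done carefully. By working through it one step at a time you will learn how to improve overall query speed, which also helps with the next SQL statement that needs improving.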
Finally, about the output (apparently from DBeaver): when you enter the query as:
EXPLAIN (analyze,verbose, buffers, settings) SELECT .....
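you should not copy the result as a picture. In the result view you should see "Grid" and "Text"; with "Text" selected you can copy and paste the plan as text, which is far more readable than a picture.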