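I have a slow query. I am fairly sure the bottleneck is the sequential scans in the plan, so I would like to build appropriate indexes and/or rearrange the query to improve on that.
Here is my query (and here is a fiddle with a schema and test data):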
SELECT conversations.id, max(messages.timestamp) as latest_message FROM
conversations JOIN messages on conversations.id = messages.cid
WHERE conversations.userid=1
GROUP BY conversations.id ORDER BY latest_message;
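I have built indexes on all the columns involved, as well as composite indexes on cid and timestamp in both column orders, but to no avail. The sequential scans remain: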
Sort  (cost=200.60..200.65 rows=20 width=12)
  Sort Key: (max(messages."timestamp"))
  ->  HashAggregate  (cost=199.97..200.17 rows=20 width=12)
        Group Key: conversations.id
        ->  Hash Join  (cost=11.50..197.97 rows=400 width=12)
              Hash Cond: (messages.cid = conversations.id)
              ->  Seq Scan on messages  (cost=0.00..160.00 rows=10000 width=12)
              ->  Hash  (cost=11.25..11.25 rows=20 width=4)
                    ->  Seq Scan on conversations  (cost=0.00..11.25 rows=20 width=4)
                          Filter: (userid = 10)
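How can I improve this query, and/or which indexes can I build to get rid of these sequential scans?
For this version of the query, I would suggest: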
SELECT c.id,
(SELECT max(m.timestamp)
FROM messages m
WHERE c.id = m.cid
) as latest_message
FROM conversations c
WHERE c.userid = 1
ORDER BY latest_message;
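You want indexes on conversations(userid, id) and messages(cid, timestamp). (The conversations table is joined on its id column, so that is the column to pair with userid.)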
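As a rough sketch, the two indexes could be created and checked like this. The index names are made up for illustration, and I am assuming conversations.id is the column matched against messages.cid, as in the queries above:

```sql
-- Index names are hypothetical; use whatever fits your naming scheme.

-- Finds the conversations of one user; including id means c.id
-- can be read from the index as well.
CREATE INDEX conversations_userid_id_idx ON conversations (userid, id);

-- For a given cid, max("timestamp") sits at one end of this index,
-- so the subquery does not need to read every message of the conversation.
CREATE INDEX messages_cid_timestamp_idx ON messages (cid, "timestamp");

-- Refresh planner statistics, then look at the plan that is actually chosen.
ANALYZE conversations;
ANALYZE messages;

EXPLAIN (ANALYZE, BUFFERS)
SELECT c.id,
       (SELECT max(m."timestamp")
        FROM messages m
        WHERE m.cid = c.id
       ) AS latest_message
FROM conversations c
WHERE c.userid = 1
ORDER BY latest_message;
```

Keep in mind that on a small table the planner may still choose a sequential scan, because reading a few pages directly can be cheaper than going through an index. The plan is worth re-checking against realistic data volumes.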