PostgreSQL 15.2 (Debian 15.2-1.pgdg110+1) on aarch64-unknown-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
I created a JSONB column through a sequelize migration like this:
queryInterface.addColumn('Entities', 'summary', {
  type: Sequelize.DataTypes.JSONB,
})
My rows generally have this structure:
{
  summary: {
    foobar: 123.456
  }
}
I need a query that compares a parameter against a property of the JSONB summary:
where: {
  summary: {
    foobar: {
      [Sequelize.Op.lt]: parseFloat(maxFoobar),
    },
  },
}
The query generated by Sequelize:
SELECT "id", "name", "type", "geometry", "summary", "createdAt", "updatedAt"
FROM "Entities" AS "Entity"
WHERE CAST(("Entity"."summary"#>>'{foobar}') AS DOUBLE PRECISION) < 64
ORDER BY "Entity"."createdAt" DESC
LIMIT 100
OFFSET 0;
For 150,000 rows, this takes 300-400 ms.
How can I build an index on that JSONB key for this query?
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=29685.31..29696.98 rows=100 width=1475) (actual time=81.465..85.319 rows=100 loops=1)
   Output: id, name, type, geometry, summary, "createdAt", "updatedAt"
   Buffers: shared hit=91259 read=17301
   ->  Gather Merge  (cost=29685.31..34772.56 rows=43602 width=1475) (actual time=81.461..85.310 rows=100 loops=1)
         Output: id, name, type, geometry, summary, "createdAt", "updatedAt"
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=91259 read=17301
         ->  Sort  (cost=28685.29..28739.79 rows=21801 width=1475) (actual time=76.879..76.889 rows=50 loops=3)
               Output: id, name, type, geometry, summary, "createdAt", "updatedAt"
               Sort Key: "Entity"."createdAt" DESC
               Sort Method: top-N heapsort  Memory: 309kB
               Buffers: shared hit=91259 read=17301
               Worker 0:  actual time=75.047..75.062 rows=65 loops=1
                 Sort Method: top-N heapsort  Memory: 353kB
                 Buffers: shared hit=29140 read=5589
               Worker 1:  actual time=75.044..75.055 rows=63 loops=1
                 Sort Method: top-N heapsort  Memory: 351kB
                 Buffers: shared hit=29074 read=5532
               ->  Parallel Seq Scan on public."Entities" "Entity"  (cost=0.00..27852.07 rows=21801 width=1475) (actual time=0.184..68.223 rows=20705 loops=3)
                     Output: id, name, type, geometry, summary, "createdAt", "updatedAt"
                     Filter: ((("Entity".summary #>> '{foobar}'::text[]))::double precision < '64'::double precision)
                     Rows Removed by Filter: 31618
                     Buffers: shared hit=91145 read=17301
                     Worker 0:  actual time=0.240..66.849 rows=20034 loops=1
                       Buffers: shared hit=29083 read=5589
                     Worker 1:  actual time=0.081..66.737 rows=19791 loops=1
                       Buffers: shared hit=29017 read=5532
 Planning:
   Buffers: shared hit=40
 Planning Time: 0.815 ms
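An index on "createdAt" is certainly a good choice. Postgres will quickly find 100 rows that satisfy the query. Checking a nested value of a JSON document always adds cost (especially when the values are big).
This tailored partial index reduces that cost:
CREATE INDEX ON "Entities" ("createdAt" DESC)
WHERE (summary->>'foobar')::float8 < 64;
Adding a (redundant) generated column, plus an index based on that column, will help more:
ALTER TABLE "Entities"
ADD COLUMN foobar float8 GENERATED ALWAYS AS ((summary->>'foobar')::float8) STORED;
CREATE INDEX ON "Entities" ("createdAt" DESC)
WHERE foobar < 64;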
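Since the column was originally added through a sequelize migration, the generated column and partial index could be applied the same way. The following is only a minimal sketch under some assumptions: the index name `entities_created_at_low_foobar_idx` is hypothetical (the statements above let Postgres pick a name), and the raw SQL is sent through `queryInterface.sequelize.query`, because the generated-column syntax is easiest to express as plain SQL.

```javascript
// Sketch of a sequelize migration that applies the generated column and the
// partial index from the answer. The index name is a hypothetical choice,
// made explicit so that `down` can drop it again.
const INDEX_NAME = 'entities_created_at_low_foobar_idx';

const upStatements = [
  // Redundant generated column that extracts the nested JSONB value once, at write time.
  `ALTER TABLE "Entities"
     ADD COLUMN foobar float8
     GENERATED ALWAYS AS ((summary->>'foobar')::float8) STORED`,
  // Partial index ordered like the query's ORDER BY, limited to rows with foobar < 64.
  `CREATE INDEX ${INDEX_NAME}
     ON "Entities" ("createdAt" DESC)
     WHERE foobar < 64`,
];

const downStatements = [
  `DROP INDEX IF EXISTS ${INDEX_NAME}`,
  `ALTER TABLE "Entities" DROP COLUMN IF EXISTS foobar`,
];

const migration = {
  // Runs the statements one after another, in order.
  async up(queryInterface) {
    for (const sql of upStatements) {
      await queryInterface.sequelize.query(sql);
    }
  },
  // Reverses `up`: drop the index first, then the column.
  async down(queryInterface) {
    for (const sql of downStatements) {
      await queryInterface.sequelize.query(sql);
    }
  },
};

// In a real migration file this object would be exported:
// module.exports = migration;
```

For this index to be considered, the query presumably has to filter on the new column (for example `foobar < 64`) rather than on the JSONB expression, and with a bound no larger than 64. The same caveat likely applies to the first partial index: its predicate is written with `->>`, while Sequelize emits `#>>'{foobar}'`, so the predicate may need to use the exact expression that appears in the generated query.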