I am trying to query a table with 46M rows in total, but I am still not sure which indexes to use for this data. My table looks like this:
CREATE TABLE IF NOT EXISTS "data" (
"deviceId" UUID,
"eventDate" DATE,
"eventHour" SMALLINT,
"eventTime" TIME(0) WITHOUT TIME ZONE,
"point" GEOMETRY(POINT, 4326),
"city_code" INTEGER,
"county_code" INTEGER,
"district_code" INTEGER,
"duration" INTEGER,
"deviceType" TEXT,
"cell" H3INDEX,
"yas" SMALLINT,
"ses" SMALLINT,
PRIMARY KEY ("deviceId", "eventDate", "eventTime")
);
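I do not need to filter rows by "deviceId". I mostly want to filter by "cell", the H3 cell that contains the point. Cell values are essentially fixed-length text, for example 8c2d1c68a2d07ff.
Basically, I need to group rows (points) by cell and filter them by "yas", "ses" and "eventTime" (on an hourly basis and between date ranges). "yas" and "ses" are category columns; their integer values are limited to 1-10, each representing a different category. I have tried the indexes below, but counting over 46M rows takes more than 1 second, and the row count will grow to 10 times that: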
CREATE INDEX IF NOT EXISTS "data_cell" ON "data" ("cell") WITH (FILLFACTOR = 100);
CREATE INDEX IF NOT EXISTS "data_ses_cell" ON "data" ("ses", "cell") WITH (FILLFACTOR = 100);
CREATE INDEX IF NOT EXISTS "data_yas_cell" ON "data" ("yas", "cell") WITH (FILLFACTOR = 100);
CREATE INDEX IF NOT EXISTS "data_date_cell" ON "data" ("eventDate", "cell") WITH (FILLFACTOR = 100);
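CREATE INDEX IF NOT EXISTS "data_date_cell" ON "data" ("eventTime", "cell") WITH (FILLFACTOR = 100); -- name already used above, so IF NOT EXISTS skips this statement with a notice
CREATE INDEX IF NOT EXISTS "data_date_cell" ON "data" ("eventHour", "cell") WITH (FILLFACTOR = 100); -- same name again, also skipped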
CREATE INDEX IF NOT EXISTS "data_date_time_cell" ON "data" ("eventHour", "eventDate", "cell") WITH (FILLFACTOR = 100);
Here is a sample query and the query planner output:
EXPLAIN ANALYZE
WITH ref AS (
SELECT refcell, h3_cell_to_children(refcell, 12) node
FROM (
SELECT h3_polygon_to_cells(ST_SetSRID(
ST_MakeBox2D(
ST_Point(28.93155097961426, 40.97555652808213),
ST_Point(29.058237075805668, 41.029513890837386)
), 4326
), 8) refcell
) as cells
), filtered AS (
SELECT cell, count(*)
FROM data
WHERE
"cell" IN (SELECT node FROM ref) AND
"eventDate" BETWEEN '2023-01-01' AND '2023-02-01' AND
"ses" = ANY(ARRAY[0]) AND
"yas" = ANY(ARRAY[0,1,2])
GROUP BY cell
)
SELECT refcell, sum(count)
FROM (
SELECT refcell, node, count
FROM ref, filtered
WHERE cell = ref.node
) as t
GROUP BY refcell;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=500785.48..500787.98 rows=200 width=40) (actual time=1322.053..1322.064 rows=60 loops=1)
Group Key: ref.refcell
Batches: 1 Memory Usage: 48kB
CTE ref
-> ProjectSet (cost=0.00..5022.77 rows=1000000 width=16) (actual time=173.051..184.126 rows=187278 loops=1)
-> ProjectSet (cost=0.00..5.27 rows=1000 width=8) (actual time=173.043..173.108 rows=78 loops=1)
-> Result (cost=0.00..0.01 rows=1 width=0) (actual time=172.681..172.682 rows=1 loops=1)
-> Hash Join (cost=159082.09..181762.72 rows=62800000 width=16) (actual time=1297.852..1321.680 rows=3705 loops=1)
Hash Cond: (ref.node = filtered.cell)
-> CTE Scan on ref (cost=0.00..20000.00 rows=1000000 width=16) (actual time=173.053..186.193 rows=187278 loops=1)
-> Hash (cost=158925.09..158925.09 rows=12560 width=16) (actual time=1124.736..1124.737 rows=3705 loops=1)
Buckets: 16384 Batches: 1 Memory Usage: 302kB
-> Subquery Scan on filtered (cost=158673.89..158925.09 rows=12560 width=16) (actual time=1123.696..1124.310 rows=3705 loops=1)
-> HashAggregate (cost=158673.89..158799.49 rows=12560 width=16) (actual time=1123.694..1124.116 rows=3705 loops=1)
Group Key: data.cell
Batches: 1 Memory Usage: 913kB
-> Nested Loop (cost=22500.56..156950.74 rows=344630 width=8) (actual time=91.799..1111.033 rows=91725 loops=1)
-> HashAggregate (cost=22500.00..22502.00 rows=200 width=8) (actual time=91.473..137.551 rows=187278 loops=1)
Group Key: ref_1.node
Batches: 5 Memory Usage: 11073kB Disk Usage: 3472kB
-> CTE Scan on ref ref_1 (cost=0.00..20000.00 rows=1000000 width=8) (actual time=0.001..42.754 rows=187278 loops=1)
-> Index Scan using data_ses_cell on data (cost=0.56..671.69 rows=55 width=8) (actual time=0.001..0.005 rows=0 loops=187278)
Index Cond: ((ses = ANY ('{0}'::integer[])) AND (cell = ref_1.node))
Filter: (("eventDate" >= '2023-01-01'::date) AND ("eventDate" <= '2023-02-01'::date) AND (yas = ANY ('{0,1,2}'::integer[])))
Rows Removed by Filter: 1
Planning Time: 0.273 ms
JIT:
Functions: 41
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 2.609 ms, Inlining 7.226 ms, Optimization 107.961 ms, Emission 62.515 ms, Total 180.311 ms
Execution Time: 1325.916 ms
(31 rows)
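Server specs: 64GB RAM, Ryzen 5 3600, and a 512GB NVMe SSD.
I need this kind of query to run in 500 ms at most. Is that possible?
Should I stay with PostgreSQL at all for this many rows? Or, if PostgreSQL can handle that many data points, what am I doing wrong?
Thanks!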
Edit: I changed the structure a little; this is the final state.
New indexes:
CREATE INDEX IF NOT EXISTS "data_ses_cell" ON "data" ("cell", "ses", "yas") WITH (FILLFACTOR = 100);
CREATE INDEX IF NOT EXISTS "data_date_time_cell" ON "data" USING BRIN ("eventTime") WITH (FILLFACTOR = 100);
EXPLAIN (ANALYZE, BUFFERS) WITH ref AS (
SELECT refcell, h3_cell_to_children(refcell, 12) node
FROM (
SELECT h3_polygon_to_cells(ST_SetSRID(
ST_MakeBox2D(
ST_Point(28.87567520141602, 40.95903013727966),
ST_Point(29.12904739379883, 41.06692773019345)
), 4326
), 9) refcell
) as cells
), filtered AS (
SELECT cell, count(*)
FROM data
WHERE "cell" IN (SELECT node FROM ref) AND "eventTime"::DATE BETWEEN '2023-01-01' AND '2023-01-01' AND "ses" = ANY(ARRAY[0]) AND "yas" = ANY(ARRAY[1,2,3])
GROUP BY cell
)
SELECT refcell, sum(count)
FROM (
SELECT refcell, node, count
FROM ref, filtered
WHERE cell = ref.node
) as t
GROUP BY refcell;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=144023.36..144025.86 rows=200 width=40) (actual time=4356.905..4356.967 rows=431 loops=1)
Group Key: ref.refcell
Batches: 1 Memory Usage: 173kB
Buffers: shared hit=8999147 read=304826 dirtied=32 written=11724, temp read=6321 written=8197
I/O Timings: shared/local read=476.680 write=29.652, temp read=7.378 write=24.451
CTE ref
-> ProjectSet (cost=0.00..5022.77 rows=1000000 width=16) (actual time=1.766..48.559 rows=750827 loops=1)
-> ProjectSet (cost=0.00..5.27 rows=1000 width=8) (actual time=1.764..2.680 rows=2189 loops=1)
-> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.001 rows=1 loops=1)
-> Hash Join (cost=72494.97..95175.59 rows=8765000 width=16) (actual time=4245.857..4356.724 rows=858 loops=1)
Hash Cond: (ref.node = filtered.cell)
Buffers: shared hit=8999147 read=304826 dirtied=32 written=11724, temp read=6321 written=8197
I/O Timings: shared/local read=476.680 write=29.652, temp read=7.378 write=24.451
-> CTE Scan on ref (cost=0.00..20000.00 rows=1000000 width=16) (actual time=1.767..54.292 rows=750827 loops=1)
Buffers: temp read=2383 written=1
I/O Timings: temp read=2.751 write=0.028
-> Hash (cost=72473.06..72473.06 rows=1753 width=16) (actual time=4244.035..4244.037 rows=858 loops=1)
Buckets: 2048 Batches: 1 Memory Usage: 57kB
Buffers: shared hit=8999147 read=304826 dirtied=32 written=11724, temp read=3938 written=8196
I/O Timings: shared/local read=476.680 write=29.652, temp read=4.627 write=24.423
-> Subquery Scan on filtered (cost=72424.85..72473.06 rows=1753 width=16) (actual time=4243.319..4243.933 rows=858 loops=1)
Buffers: shared hit=8999147 read=304826 dirtied=32 written=11724, temp read=3938 written=8196
I/O Timings: shared/local read=476.680 write=29.652, temp read=4.627 write=24.423
-> GroupAggregate (cost=72424.85..72455.53 rows=1753 width=16) (actual time=4243.318..4243.873 rows=858 loops=1)
Group Key: data.cell
Buffers: shared hit=8999147 read=304826 dirtied=32 written=11724, temp read=3938 written=8196
I/O Timings: shared/local read=476.680 write=29.652, temp read=4.627 write=24.423
-> Sort (cost=72424.85..72429.23 rows=1753 width=8) (actual time=4243.313..4243.460 rows=4456 loops=1)
Sort Key: data.cell
Sort Method: quicksort Memory: 193kB
Buffers: shared hit=8999147 read=304826 dirtied=32 written=11724, temp read=3938 written=8196
I/O Timings: shared/local read=476.680 write=29.652, temp read=4.627 write=24.423
-> Nested Loop (cost=22500.56..72330.40 rows=1753 width=8) (actual time=330.563..4242.929 rows=4456 loops=1)
Buffers: shared hit=8999147 read=304826 dirtied=32 written=11724, temp read=3938 written=8196
I/O Timings: shared/local read=476.680 write=29.652, temp read=4.627 write=24.423
-> HashAggregate (cost=22500.00..22502.00 rows=200 width=8) (actual time=326.702..602.036 rows=750827 loops=1)
Group Key: ref_1.node
Batches: 21 Memory Usage: 11073kB Disk Usage: 19600kB
Buffers: temp read=3938 written=8196
I/O Timings: temp read=4.627 write=24.423
-> CTE Scan on ref ref_1 (cost=0.00..20000.00 rows=1000000 width=8) (actual time=0.000..183.769 rows=750827 loops=1)
Buffers: temp written=2382
I/O Timings: temp write=9.866
-> Index Scan using data_ses_cell on data (cost=0.56..249.13 rows=1 width=8) (actual time=0.004..0.004 rows=0 loops=750827)
Index Cond: ((cell = ref_1.node) AND (ses = ANY ('{0}'::integer[])) AND (yas = ANY ('{1,2,3}'::integer[])))
Filter: ((("eventTime")::date >= '2023-01-01'::date) AND (("eventTime")::date <= '2023-01-01'::date))
Rows Removed by Filter: 0
Buffers: shared hit=8999147 read=304826 dirtied=32 written=11724
I/O Timings: shared/local read=476.680 write=29.652
Planning:
Buffers: shared hit=29
Planning Time: 0.305 ms
Execution Time: 4361.263 ms
(53 rows)
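There are two general approaches here: use a different plan shape, or make the existing plan faster.
Looking at your latest plan, you are probing the index "data_ses_cell" 750,827 times, and that accounts for most of your execution time. The planner, however, thinks it will probe it only 200 times. If it knew how often it really had to probe the index, it might decide to read the rows from "data" wholesale into a hash table and use a hash join, rather than probing the index one row at a time inside a nested loop.
Unfortunately, I do not know of any way to fix the estimate on that bottom HashAggregate. 200 is the default the planner uses when it has no idea how many groups will come out of an aggregate but still has to pick some number.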
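You can force it to use a hash join by setting enable_nestloop = off. For that to be efficient, though, you probably need a different index. You want it to apply all the conditions other than "cell" quickly, because "cell" ends up driving the hash table and so is no longer a condition for the index scan. So you would need an index like ("ses", "yas", "eventTime", "cell"). Including "cell" as the last column should enable an index-only scan.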
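If you get it to use this plan with enable_nestloop = off and the new index, and it is faster, then you have a workable but not very desirable answer, and you know what to investigate (the 200-row estimate) to turn it into a desirable one.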
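A minimal sketch of that experiment, assuming the table and query from the edit in the question. The index name "data_ses_yas_time_cell" is hypothetical, and SET LOCAL is used here only so that the planner override disappears when the transaction ends; it is meant as a diagnostic, not as a production setting.

```sql
-- Hypothetical index name; the column order follows the suggestion above.
CREATE INDEX IF NOT EXISTS "data_ses_yas_time_cell"
    ON "data" ("ses", "yas", "eventTime", "cell");

BEGIN;
-- Discourage nested loops for this transaction only.
SET LOCAL enable_nestloop = off;

-- Same query as in the question's edit, unchanged apart from layout.
EXPLAIN (ANALYZE, BUFFERS)
WITH ref AS (
    SELECT refcell, h3_cell_to_children(refcell, 12) node
    FROM (
        SELECT h3_polygon_to_cells(ST_SetSRID(
            ST_MakeBox2D(
                ST_Point(28.87567520141602, 40.95903013727966),
                ST_Point(29.12904739379883, 41.06692773019345)
            ), 4326
        ), 9) refcell
    ) AS cells
), filtered AS (
    SELECT cell, count(*)
    FROM data
    WHERE "cell" IN (SELECT node FROM ref)
      AND "eventTime"::DATE BETWEEN '2023-01-01' AND '2023-01-01'
      AND "ses" = ANY(ARRAY[0])
      AND "yas" = ANY(ARRAY[1,2,3])
    GROUP BY cell
)
SELECT refcell, sum(count)
FROM (
    SELECT refcell, node, count
    FROM ref, filtered
    WHERE cell = ref.node
) AS t
GROUP BY refcell;

-- Ends the transaction and with it the SET LOCAL override.
ROLLBACK;
```

Comparing the resulting plan with the one in the question should show whether the nested loop was replaced by a hash join and whether the new index is read with an index-only scan.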
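The culprit index scan has to visit (8999147+304826)/750827 = 12.4 buffers per loop. That is a lot, and it is not clear why. One possibility is that the index contains many entries that fail the "eventTime" condition, so they are rejected only after some buffers have been visited to check them. But that does not really make sense, because in that case you would expect "Rows Removed by Filter" to be much larger, around 8 rather than 0 (one iteration of an index scan that returns 0 or 1 rows would normally be expected to touch only about 4 buffers: 3 to descend the index and 1 to consult the table). The only other explanation I can think of is that the index is full of pointers to obsolete tuples. Those pointers have to be followed to the table page (causing a buffer access), but the rejected rows are not counted as "Rows Removed by Filter", because they are removed by the visibility check and not by the filter.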
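As a quick check of that arithmetic, the figures can be copied from the Buffers line of the index scan node in the plan above (shared hit plus shared read, divided by the loop count):

```sql
-- (shared hit + shared read) / loops for the index scan on data_ses_cell
SELECT round((8999147 + 304826) / 750827.0, 1) AS buffers_per_loop;  -- 12.4
```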
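If the "eventTime" condition is removing a lot of tuples that are not being counted in "Rows Removed by Filter" (for reasons I do not understand), then the solution is to add "eventTime" to the existing index as its last column, which lets those rows be rejected without consulting the table. If, on the other hand, the index holds a lot of obsolete tuples, the solution is to make sure no long-lived transactions are being held open and then VACUUM the table immediately before running the query again.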
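Both checks can be sketched with standard PostgreSQL catalog views. The index name "data_cell_ses_yas_time" is hypothetical, and the dead-tuple figure comes from the statistics collector, so treat it as an estimate.

```sql
-- 1. Look for long-open transactions, which can stop VACUUM from removing dead tuples.
SELECT pid, state, xact_start, left(query, 60) AS query_start
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_start
LIMIT 10;

-- 2. Estimated live and dead tuples for the table, plus the last vacuum times.
SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'data';

-- 3. Vacuum (and refresh statistics), then re-run the EXPLAIN right away.
VACUUM (VERBOSE, ANALYZE) "data";

-- Alternative for the first hypothesis: the existing column list
-- with "eventTime" appended as the last column (hypothetical index name).
CREATE INDEX IF NOT EXISTS "data_cell_ses_yas_time"
    ON "data" ("cell", "ses", "yas", "eventTime");
```

If the buffers-per-loop figure drops sharply after the VACUUM, obsolete tuples were the likelier cause; if it drops only with the extended index, the "eventTime" condition was.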