我有一个 mongodb 查询,并且已经创建了所有必需的索引。但查询仍然扫描集合中的所有文档。
以下是查询:
[
{
"$group": {
"driverStatuses": {
"$addToSet": "$driverStatus"
},
"batchAddedAt": {
"$first": "$batchAddedAt"
},
"cityId": {
"$first": "$operationRegion"
},
"pickLat": {
"$first": "$pickLat"
},
"pickLng": {
"$first": "$pickLng"
},
"_id": "$batchId"
}
},
{
"$match": {
"batchAddedAt": {
"$gte": 1705319256
}
}
},
{
"$match": {
"driverStatuses": {
"$nin": [
"COMPLETED",
"CANCELLED",
"START_PICKUP",
"ACCEPTED",
"WAITING"
]
}
}
},
{
"$sort": {
"_id": 1
}
}
]
它的日志是:
{
"t": {
"$date": "2024-01-16T11:47:41.418+00:00"
},
"s": "I",
"c": "COMMAND",
"id": 51803,
"ctx": "conn486616",
"msg": "Slow query",
"attr": {
"type": "command",
"ns": "dms.batchAssignmentActivity",
"command": {
"aggregate": "batchAssignmentActivity",
"pipeline": [
{
"$group": {
"driverStatuses": {
"$addToSet": "$driverStatus"
},
"batchAddedAt": {
"$first": "$batchAddedAt"
},
"cityId": {
"$first": "$operationRegion"
},
"pickLat": {
"$first": "$pickLat"
},
"pickLng": {
"$first": "$pickLng"
},
"_id": "$batchId"
}
},
{
"$match": {
"batchAddedAt": {
"$gte": 1705319256
}
}
},
{
"$match": {
"driverStatuses": {
"$nin": [
"COMPLETED",
"CANCELLED",
"START_PICKUP",
"ACCEPTED",
"WAITING"
]
}
}
},
{
"$sort": {
"_id": 1
}
}
],
"cursor": {},
"lsid": {
"id": {
"$uuid": "a1ecf292-f2a0-4ce9-8520-a66f86fd3a80"
}
},
"$clusterTime": {
"clusterTime": {
"$timestamp": {
"t": 1705405656,
"i": 32
}
},
"signature": {
"hash": {
"$binary": {
"base64": "JyBotKPMVXqaekfqPtz6E4RwE20=",
"subType": "0"
}
},
"keyId": 7278270623686590000
}
},
"$db": "dms",
"$readPreference": {
"mode": "primary"
}
},
"planSummary": "COLLSCAN",
"keysExamined": 0,
"docsExamined": 276938,
"hasSortStage": true,
"cursorExhausted": true,
"numYields": 367,
"nreturned": 15,
"queryHash": "09C9423E",
"queryFramework": "sbe",
"reslen": 2963,
"locks": {
"FeatureCompatibilityVersion": {
"acquireCount": {
"r": 369
}
},
"Global": {
"acquireCount": {
"r": 369
}
},
"Mutex": {
"acquireCount": {
"r": 2
}
}
},
"readConcern": {
"level": "local",
"provenance": "implicitDefault"
},
"writeConcern": {
"w": 1,
"wtimeout": 0,
"provenance": "customDefault"
},
"storage": {
"data": {
"bytesRead": 203982595,
"timeReadingMicros": 54403
}
},
"remote": "10.20.194.26:60606",
"protocol": "op_msg",
"durationMillis": 4453
}
}
我真的不知道原因。为什么这里不使用索引?
在屏幕截图中分享索引详细信息。
我已经创建了索引,但仍在扫描集合的所有文档。预计查询应使用索引。
对于聚合,索引仅用于管道的第一阶段(例如
$match
、$sort
)。对于所讨论的聚合,第一阶段是 $group
阶段,它是阻塞阶段。这意味着在管道可以继续之前必须知道所有文档。这导致不使用索引的查询速度很慢。
在不了解文档结构和查询目标的情况下,很难提供解决方案。因此,请根据您的需求调整以下示例。
docs 的一个重要指南是,如果索引是
$match
、$sort
,也许还有 $group
,则可以在管道的第一阶段使用索引。
在您的示例中,您可以在开始时应用
$match
,然后按 batchId
对文档进行分组(为了简洁起见,我省略了几个属性,但这应该会给您一个想法):
[
{
$match: {
batchAddedAt: { $gte: 1705319256 },
},
},
{
$group: {
_id: "$batchId",
driverStatuses: {
$addToSet: "$driverStatus",
},
batchAddedAt: {
$first: "$batchAddedAt",
},
cityId: { $first: "$operationRegion" },
},
},
{
$match: {
driverStatuses: {
$nin: [
"COMPLETED",
"CANCELLED",
"START_PICKUP",
"ACCEPTED",
"WAITING",
],
},
},
},
{
$sort: { _id: 1 }
}
]
这可以通过指数
batchAddedAt_1
来支持。由于开头的$match
,减少了通过管道推送的文档量; $sort
和 $group
也受到指数的支持。请根据需要调整示例。