I have mongo documents like the ones below, and I want to run an aggregation that finds the count of error codes for each unique contact.
{
  "_id": {
    "$oid": ""
  },
  "campId": "61baef7817cd8ff66518", // camp1
  "contactId": "61aa6fbf77490b0007714273", // contact1
  "title": "Happy Holidays!",
  "communicationType": "EMAIL",
  "contactedOnTime": {
    "$numberLong": "1695182032" // AT TIME1
  },
  "communicationValidationError": "EMAIL_ADDRESS_NOT_PRESENT"
}
{
  "_id": {
    "$oid": ""
  },
  "campId": "61baef7817cd8ff66518", // camp1
  "contactId": "61aa6fbf77490b0007714273", // contact1
  "title": "Happy Holidays!",
  "communicationType": "EMAIL",
  "contactedOnTime": {
    "$numberLong": "1695182074" // AT TIME2
  },
  "communicationValidationError": "EMAIL_ADDRESS_NOT_PRESENT"
}
{
  "_id": {
    "$oid": ""
  },
  "campId": "61baef7817cd8ff66518", // camp1
  "contactId": "61aa6fbf77490b0007714274", // contact2
  "title": "Happy Holidays!",
  "communicationType": "EMAIL",
  "contactedOnTime": {
    "$numberLong": "1695182059"
  },
  "communicationValidationError": "EMAIL_BOUNCED"
}
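I tried the query below, but it does not eliminate duplicate contacts for the camp, and it shows a count of 2 instead of 1. I need to pick the latest communicationValidationError for each unique contact and get the total counts per communicationValidationError for the campId.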
db.myCollection.aggregate([
  { $project: { _id: 0, campId: 1, contactId: 1, communicationValidationError: 1 } },
  { $sort: { "contactedOnTime.numberLong": -1 } },
  { $match: { campId: '61baef7817cd8ff66518' } },
  { $group: { _id: { communicationValidationError: '$communicationValidationError' }, totalErrors: { $sum: 1 } } }
]).pretty()
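You need 2 $group stages.
I have also optimized your query as follows:
$match - Filters the documents. The $match stage should come first so that it can use an index (if one applies). The $project stage is unnecessary and can be dropped.
$group - Groups by communicationValidationError and campId, and gets the latest contactedOnTime via the $max operator.
$group - Groups by _id.communicationValidationError and performs the count.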
db.collection.aggregate([
  {
    $match: {
      campId: "61baef7817cd8ff66518"
    }
  },
  {
    $group: {
      _id: {
        communicationValidationError: "$communicationValidationError",
        campId: "$campId"
      },
      latestContactedOnTime: {
        $max: "$contactedOnTime"
      }
    }
  },
  {
    $group: {
      _id: {
        communicationValidationError: "$_id.communicationValidationError"
      },
      totalErrors: {
        $sum: 1
      }
    }
  }
])
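Note that grouping by communicationValidationError and campId produces one group per error code, so several contacts that share the same error would be counted once. If each contact should be counted once under its most recent error, a possible variant is sketched below. It assumes contactedOnTime is stored as a NumberLong (the $numberLong wrapper is only how Extended JSON displays it) and can therefore be sorted directly. The names campId, pipeline, and latestError are introduced here for illustration only.

```javascript
// Sketch: count each contact once, using that contact's most recent error.
// Assumption: contactedOnTime is a NumberLong, so it sorts as a plain number.
const campId = "61baef7817cd8ff66518";

const pipeline = [
  // 1. Keep only the documents of one camp (can use an index on campId).
  { $match: { campId: campId } },
  // 2. Newest first, so $first below sees each contact's latest document.
  { $sort: { contactedOnTime: -1 } },
  // 3. One group per contact; keep the error from the latest document.
  {
    $group: {
      _id: "$contactId",
      latestError: { $first: "$communicationValidationError" }
    }
  },
  // 4. Count contacts per latest error code.
  {
    $group: {
      _id: { communicationValidationError: "$latestError" },
      totalErrors: { $sum: 1 }
    }
  }
];

// In the shell: db.myCollection.aggregate(pipeline)
```

With the three sample documents from the question, contact1 resolves to EMAIL_ADDRESS_NOT_PRESENT and contact2 to EMAIL_BOUNCED, so each error code should come back with totalErrors: 1.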