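Is there a MongoDB aggregation query equivalent to the following SQL statement, using a single MongoDB aggregation? (Both operands of the JOIN are subqueries that themselves contain joins.)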
SELECT
T1.C1, T2.C2
FROM
(SELECT
T3.C1 C1, T4.C2 C2
FROM
T3
JOIN
T4 ON T3.C0 = T4.C0) T1
JOIN
(SELECT
T5.C1 C1, T6.C2 C2
FROM
T5
JOIN
T6 ON T5.C0 = T6.C0) T2 ON T1.C1 = T2.C1
AND T1.C2 = T2.C2;
Short answer: no. $lookup requires "from" to be a collection.
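As WD pointed out, SQL patterns do not carry over to the NoSQL world, but a database is a database: if the data is there, you can always get at it. The logic would be inverted. You would do SELECT T3.C1, T6.C2 FROM T3 and complex lookups/unwinds from T4, T5 and T6.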
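The inverted logic described above might look roughly like the sketch below. It is not a join of two subqueries: it starts from T3 and restructures the query so that every "from" names a real collection, with the T5/T6 join run as a correlated sub-pipeline for each T3/T4 row. The collection names T3 to T6 come from the SQL; the array field names t4, t6 and T2 and the variable name pipeline are made up for illustration, and the sketch has not been tuned for performance.

```javascript
// Hypothetical single-aggregation sketch; field names t4, t6, T2 are illustrative.
const pipeline = [
  // T3 JOIN T4 ON T3.C0 = T4.C0  (what the SQL calls T1)
  { $lookup: { from: "T4", localField: "C0", foreignField: "C0", as: "t4" } },
  { $unwind: "$t4" }, // also drops T3 docs with no T4 match
  { $project: { _id: 0, C1: "$C1", C2: "$t4.C2" } },

  // T5 JOIN T6 ON T5.C0 = T6.C0  (what the SQL calls T2),
  // restricted to rows where T2.C1 = T1.C1 AND T2.C2 = T1.C2
  { $lookup: {
      from: "T5",
      let: { c1: "$C1", c2: "$C2" },
      pipeline: [
        { $match: { $expr: { $eq: ["$C1", "$$c1"] } } },
        { $lookup: { from: "T6", localField: "C0", foreignField: "C0", as: "t6" } },
        { $unwind: "$t6" },
        { $match: { $expr: { $eq: ["$t6.C2", "$$c2"] } } },
        { $project: { _id: 0, C1: "$C1", C2: "$t6.C2" } }
      ],
      as: "T2"
  } },
  { $unwind: "$T2" }, // one output doc per matching (T1, T2) pair

  // SELECT T1.C1, T2.C2
  { $project: { C1: "$C1", C2: "$T2.C2" } }
];
```

Against a live database this would be run as db.T3.aggregate(pipeline). Because the inner pipeline is evaluated once per T3/T4 row, it is likely to be slow on large collections without indexes on C0 and C1.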
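Given the OP's snippet, it is fair to assume the question is more academic than practical. From that perspective the answer is still "no". Do not design/denormalize your data to the point where you find yourself needing to run such queries.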
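This does not exactly answer the OP's question about subqueries, but it shows how, if absolutely necessary, "temporary" collections can be used in place of subqueries. Note that in the original SQL, only tables T3, T4, T5, and T6 actually exist; T1 and T2 are subquery constructs. Also, the original SQL is not especially useful because it only shows the columns that matched; we learn nothing about which rows they came from, so we carry along old_id just for show.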
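It is quite likely that the SQL subqueries are implicitly doing what we show explicitly here: creating temporary tables T1 and T2 rather than keeping them in memory.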
db.XT3.drop();
var r = [
{C0: "A", C1: "R0-1", C2: "R0-2"}
,{C0: "A", C1: "R1-1", C2: "R1-2"}
,{C0: "B", C1: "R0-1", C2: "R0-2"}
,{C0: "C", C1: "R0-1", C2: "R0-2"}
];
var nn = 0; r.forEach(function(d) { d['_id'] = nn++ }); // auto incr ID
db.XT3.insertMany(r);
db.XT4.drop();
var r = [
{C0: "A", C1: "R0-1", C2: "R0-2"}
,{C0: "A", C1: "R2-1", C2: "R2-2"}
,{C0: "B", C1: "R0-1", C2: "R0-2"}
,{C0: "no_match", C1: "R0-1", C2: "R0-2"}
];
r.forEach(function(d) { d['_id'] = nn++ }); // auto incr ID
db.XT4.insertMany(r);
db.XT5.drop();
var r = [
{C0: "X", C1: "nope", C2: "nope"}
,{C0: "X", C1: "R1-1", C2: "R1-2"}
,{C0: "Y", C1: "R0-1", C2: "R0-2"}
,{C0: "Z", C1: "R0-1", C2: "R0-2"}
];
r.forEach(function(d) { d['_id'] = nn++ }); // auto incr ID
db.XT5.insertMany(r);
db.XT6.drop();
var r = [
{C0: "X", C1: "R0-1", C2: "R0-2"}
,{C0: "Y", C1: "R0-1", C2: "R0-2"}
,{C0: "oiwejfioj", C1: "R0-1", C2: "R0-2"}
];
r.forEach(function(d) { d['_id'] = nn++ }); // auto incr ID
db.XT6.insertMany(r);
db.TMP1.drop();
db.TMP2.drop();
c=db.XT3.aggregate([
{$lookup: {"from": "XT4",
let: { id: "$C0" },
pipeline: [
{$match: {$expr: {$eq: [ "$C0", "$$id" ]} }}
,{$project: {_id:false,C0:false}} // no need to carry them
],
as: "T_3_4"
}}
// This ALSO acts to filter out items from XT3 with NO match to XT4:
,{$unwind: '$T_3_4'}
// $unwind will dupe _id of inbound XT3 record; must get rid of it
// in prep for $out; keep it around for debugging/tracking as 'old_id':
,{$project: {_id:false, old_id:'$_id', C1:'$C1', C2:'$T_3_4.C2'}}
,{$out: "TMP1"}
]);
db.XT5.aggregate([
{$lookup: {"from": "XT6",
let: { id: "$C0" },
pipeline: [
{$match: {$expr: {$eq: [ "$C0", "$$id" ]} }}
,{$project: {_id:false,C0:false}} // no need to carry them
],
as: "T_5_6"
}}
,{$unwind: '$T_5_6'}
,{$project: {_id:false, old_id:'$_id', C1:'$C1', C2:'$T_5_6.C2'}}
,{$out: "TMP2"}
]);
c = db.TMP1.aggregate([
{$lookup: {"from": "TMP2",
let: { c1: "$C1", c2: "$C2" },
// ON T1.C1 = T2.C1 AND T1.C2 = T2.C2;
pipeline: [
{$match: {$expr: {$and: [
{$eq: [ "$$c1", "$C1" ]},
{$eq: [ "$$c2", "$C2" ]}
]} }}
,{$project: {_id:false}}
],
as: "X"
}}
// Filter out non-matches; comment out this stage to double check
// docs where T1.C1 = T2.C1 AND T1.C2 = T2.C2 is NOT satisfied:
,{$match: {$expr: {$ne:[0, {$size:'$X'}]} }}
]);
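// Print each resulting document (emit() is not defined in the shell):
c.forEach(function(d) { printjson(d); });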
This yields:
{
_id: ObjectId("655e1e5a4e762532ed9ba7aa"),
old_id: 0,
C1: 'R0-1',
C2: 'R0-2',
X: [
{
old_id: 10,
C1: 'R0-1',
C2: 'R0-2'
}
]
}
{
_id: ObjectId("655e1e5a4e762532ed9ba7ac"),
old_id: 1,
C1: 'R1-1',
C2: 'R0-2',
X: [
{
old_id: 9,
C1: 'R1-1',
C2: 'R0-2'
}
]
}
{
_id: ObjectId("655e1e5a4e762532ed9ba7ae"),
old_id: 2,
C1: 'R0-1',
C2: 'R0-2',
X: [
{
old_id: 10,
C1: 'R0-1',
C2: 'R0-2'
}
]
}
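As a cross-check on the output above, here is a minimal sketch that replays the same three joins in plain JavaScript, with no MongoDB involved. The helper names withIds and joinOnC0 are made up for this illustration; the sample rows and the running _id counter are copied from the inserts in the answer. The ObjectId values that $out generates are not reproduced, only old_id, C1, C2 and X.

```javascript
// In-memory replay of the TMP1 / TMP2 / final join, for checking the result only.
let nn = 0; // same running counter as in the shell script
const withIds = (rows) => rows.map((d) => ({ _id: nn++, ...d }));

const XT3 = withIds([
  { C0: "A", C1: "R0-1", C2: "R0-2" },
  { C0: "A", C1: "R1-1", C2: "R1-2" },
  { C0: "B", C1: "R0-1", C2: "R0-2" },
  { C0: "C", C1: "R0-1", C2: "R0-2" },
]);
const XT4 = withIds([
  { C0: "A", C1: "R0-1", C2: "R0-2" },
  { C0: "A", C1: "R2-1", C2: "R2-2" },
  { C0: "B", C1: "R0-1", C2: "R0-2" },
  { C0: "no_match", C1: "R0-1", C2: "R0-2" },
]);
const XT5 = withIds([
  { C0: "X", C1: "nope", C2: "nope" },
  { C0: "X", C1: "R1-1", C2: "R1-2" },
  { C0: "Y", C1: "R0-1", C2: "R0-2" },
  { C0: "Z", C1: "R0-1", C2: "R0-2" },
]);
const XT6 = withIds([
  { C0: "X", C1: "R0-1", C2: "R0-2" },
  { C0: "Y", C1: "R0-1", C2: "R0-2" },
  { C0: "oiwejfioj", C1: "R0-1", C2: "R0-2" },
]);

// Inner join on C0 keeping left.C1 and right.C2
// (mirrors the $lookup + $unwind + $project stages).
function joinOnC0(left, right) {
  const out = [];
  for (const l of left) {
    for (const r of right) {
      if (l.C0 === r.C0) out.push({ old_id: l._id, C1: l.C1, C2: r.C2 });
    }
  }
  return out;
}

const TMP1 = joinOnC0(XT3, XT4); // stands in for the T1 subquery
const TMP2 = joinOnC0(XT5, XT6); // stands in for the T2 subquery

// ON T1.C1 = T2.C1 AND T1.C2 = T2.C2, then drop rows with an empty X.
const result = TMP1
  .map((t1) => ({
    ...t1,
    X: TMP2.filter((t2) => t1.C1 === t2.C1 && t1.C2 === t2.C2),
  }))
  .filter((d) => d.X.length > 0);

console.log(JSON.stringify(result, null, 2));
```

The three surviving rows pair old_id 0 with 10, 1 with 9, and 2 with 10, which agrees with the shell output.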