I'm looking at an old O'Reilly book, Transact-SQL Cookbook, which has a section describing how to perform set comparisons in raw SQL.
CREATE TABLE branch_book_list (
    branch_name CHAR(10),
    book_ISBN CHAR(13),
    book_name CHAR(40),
    PRIMARY KEY (branch_name, book_ISBN)
)
Table:
branch_name book_ISBN book_name
----------- ------------- ----------------------------------------
Branch A 1-56592-401-0 Transact-SQL Programming
Branch A 1-56592-578-5 Oracle SQL*Plus: The Definitive Guide
Branch A 1-56592-756-7 Transact-SQL Cookbook
Branch B 1-56592-401-0 Transact-SQL Programming
Branch B 1-56592-756-7 Transact-SQL Cookbook
Branch B 1-56592-948-9 Oracle SQL*Loader: The Definitive Guide
Finding rows not in the other set
First, we write a query to find all books that Branch A holds but Branch B does not. With SQL Server, this can be done with a subquery, as follows:
SELECT bbl1.*
FROM branch_book_list bbl1
WHERE branch_name = 'Branch A'
AND NOT EXISTS (
SELECT bbl2.*
FROM branch_book_list bbl2
WHERE branch_name = 'Branch B'
AND bbl1.book_ISBN = bbl2.book_ISBN
AND bbl1.book_name = bbl2.book_name)
Output:
branch_name book_ISBN book_name
----------- ------------- ----------------------------------------
Branch A 1-56592-578-5 Oracle SQL*Plus: The Definitive Guide
Now the question is: what would the equivalent query look like in SQLAlchemy, using the ORM?
Your model would look like this:
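from sqlalchemy import Column, PrimaryKeyConstraint, Unicode

class BranchBookList(Base):
    __tablename__ = 'branch_book_list'
    branch_name = Column(Unicode)
    book_ISBN = Column(Unicode)
    book_name = Column(Unicode)
    # Composite primary key, matching the CREATE TABLE statement.
    __table_args__ = (PrimaryKeyConstraint(branch_name, book_ISBN),)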
(Session and Base are assumed to be defined elsewhere.) Then a query that gives the same result as the suggested one:
>>> from sqlalchemy import and_
>>> from module_with_bbl import BranchBookList, Session
>>> bbl1 = Session.query(BranchBookList).filter(BranchBookList.branch_name == 'Branch A').subquery()
>>> bbl2 = Session.query(BranchBookList).filter(BranchBookList.branch_name == 'Branch B').subquery()
>>> query = Session.query(bbl1).outerjoin(
...     bbl2,
...     and_(bbl1.c.book_name == bbl2.c.book_name, bbl1.c.book_ISBN == bbl2.c.book_ISBN),
... ).filter(bbl2.c.book_ISBN == None)
>>> print(query)
SELECT anon_1.branch_name AS anon_1_branch_name, anon_1."book_ISBN" AS "anon_1_book_ISBN", anon_1.book_name AS anon_1_book_name
FROM (SELECT branch_book_list.branch_name AS branch_name, branch_book_list."book_ISBN" AS "book_ISBN", branch_book_list.book_name AS book_name
FROM branch_book_list
WHERE branch_book_list.branch_name = :branch_name_1) AS anon_1 LEFT OUTER JOIN (SELECT branch_book_list.branch_name AS branch_name, branch_book_list."book_ISBN" AS "book_ISBN", branch_book_list.book_name AS book_name
FROM branch_book_list
WHERE branch_book_list.branch_name = :branch_name_2) AS anon_2 ON anon_1.book_name = anon_2.book_name AND anon_1."book_ISBN" = anon_2."book_ISBN"
WHERE anon_2."book_ISBN" IS NULL
>>> query.all()
[(u'Branch A', u'1-56592-578-5', u'Oracle SQL*Plus: The Definitive Guide')]
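I'm almost 100% sure my version is better, because the subquery in the suggested query will be executed for every row in branch_book_list. So if the table has thousands of rows, that means thousands of extra subqueries, whereas my query only has two.
Sorry for my English :)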
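For comparison, here is a sketch of a more literal translation of the book's NOT EXISTS query, using aliased entities and a correlated exists(). It assumes SQLAlchemy 1.4 or later and an in-memory SQLite database for the demo; the model marks its key columns with primary_key=True instead of PrimaryKeyConstraint, and the ROWS list and variable names are only illustrative. Which form is faster depends on the database's query planner, so it is worth checking the execution plan rather than assuming.

```python
import sqlalchemy as sa
from sqlalchemy.orm import Session, aliased, declarative_base

Base = declarative_base()


class BranchBookList(Base):
    __tablename__ = "branch_book_list"
    branch_name = sa.Column(sa.Unicode(10), primary_key=True)
    book_ISBN = sa.Column(sa.Unicode(13), primary_key=True)
    book_name = sa.Column(sa.Unicode(40))


# Assumption: in-memory SQLite stands in for the real database.
engine = sa.create_engine("sqlite://")
Base.metadata.create_all(engine)

# Sample data copied from the table in the question.
ROWS = [
    ("Branch A", "1-56592-401-0", "Transact-SQL Programming"),
    ("Branch A", "1-56592-578-5", "Oracle SQL*Plus: The Definitive Guide"),
    ("Branch A", "1-56592-756-7", "Transact-SQL Cookbook"),
    ("Branch B", "1-56592-401-0", "Transact-SQL Programming"),
    ("Branch B", "1-56592-756-7", "Transact-SQL Cookbook"),
    ("Branch B", "1-56592-948-9", "Oracle SQL*Loader: The Definitive Guide"),
]

# Two aliases of the same entity, like bbl1 and bbl2 in the raw SQL.
bbl1 = aliased(BranchBookList)
bbl2 = aliased(BranchBookList)

# Correlated subquery: "Branch B holds the same book as the outer row".
in_branch_b = (
    sa.select(bbl2.book_ISBN)
    .where(
        bbl2.branch_name == "Branch B",
        bbl2.book_ISBN == bbl1.book_ISBN,
        bbl2.book_name == bbl1.book_name,
    )
    .exists()
)

# Branch A rows for which no matching Branch B row exists.
stmt = sa.select(bbl1).where(bbl1.branch_name == "Branch A", ~in_branch_b)

with Session(engine) as session:
    session.add_all(
        [BranchBookList(branch_name=b, book_ISBN=i, book_name=n) for b, i, n in ROWS]
    )
    session.commit()
    only_in_a = [
        (r.branch_name, r.book_ISBN, r.book_name)
        for r in session.execute(stmt).scalars()
    ]

print(only_in_a)
```

Printing stmt shows the emitted SQL, which should contain a NOT (EXISTS (...)) clause correlated on the outer alias.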
Assuming your RDBMS supports them, you can use SQL set operations (in this case EXCEPT) to find the difference between two sets.
import sqlalchemy as sa

# `tbl` (the branch_book_list Table object) and `engine` are assumed to be defined elsewhere.

# Select the Branch A set.
q_a = sa.select(tbl.c.book_ISBN, tbl.c.book_name).where(tbl.c.branch_name == 'Branch A')
# Select the Branch B set.
q_b = sa.select(tbl.c.book_ISBN, tbl.c.book_name).where(tbl.c.branch_name == 'Branch B')
# Select the difference between the two sets.
q = q_a.except_(q_b)

with engine.connect() as conn:
    rows = conn.execute(q)
    for row in rows.mappings():
        print(row)
Emitted SQL:
SELECT tbl."book_ISBN", tbl.book_name
FROM tbl
WHERE tbl.branch_name = 'Branch A' EXCEPT SELECT tbl."book_ISBN", tbl.book_name
FROM tbl
WHERE tbl.branch_name = 'Branch B'
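The query above does not output the branch name, which may be fine, since by definition the results come from the "Branch A" query. If the branch name is needed: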
q = q_a.except_(q_b).subquery()
q = sa.select(
    sa.literal('Branch A').label('branch_name'), q.c.book_ISBN, q.c.book_name
)
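To try the EXCEPT approach end to end, here is a self-contained sketch. It assumes SQLAlchemy 1.4 or later and uses an in-memory SQLite database in place of SQL Server; the Table definition, the branch_books helper and the sample rows are my own scaffolding built from the data in the question.

```python
import sqlalchemy as sa

metadata = sa.MetaData()
tbl = sa.Table(
    "branch_book_list",
    metadata,
    sa.Column("branch_name", sa.String(10), primary_key=True),
    sa.Column("book_ISBN", sa.String(13), primary_key=True),
    sa.Column("book_name", sa.String(40)),
)

# Assumption: in-memory SQLite stands in for the real database.
engine = sa.create_engine("sqlite://")
metadata.create_all(engine)

# Sample data copied from the table in the question.
sample_rows = [
    {"branch_name": b, "book_ISBN": i, "book_name": n}
    for b, i, n in [
        ("Branch A", "1-56592-401-0", "Transact-SQL Programming"),
        ("Branch A", "1-56592-578-5", "Oracle SQL*Plus: The Definitive Guide"),
        ("Branch A", "1-56592-756-7", "Transact-SQL Cookbook"),
        ("Branch B", "1-56592-401-0", "Transact-SQL Programming"),
        ("Branch B", "1-56592-756-7", "Transact-SQL Cookbook"),
        ("Branch B", "1-56592-948-9", "Oracle SQL*Loader: The Definitive Guide"),
    ]
]


def branch_books(branch):
    """Hypothetical helper: the (ISBN, name) set held by one branch."""
    return sa.select(tbl.c.book_ISBN, tbl.c.book_name).where(
        tbl.c.branch_name == branch
    )


# Branch A minus Branch B, wrapped as a subquery so the branch name
# can be added back as a literal column.
diff = branch_books("Branch A").except_(branch_books("Branch B")).subquery()
stmt = sa.select(
    sa.literal("Branch A").label("branch_name"),
    diff.c.book_ISBN,
    diff.c.book_name,
)

with engine.begin() as conn:
    conn.execute(sa.insert(tbl), sample_rows)
    result = [tuple(row) for row in conn.execute(stmt)]

print(result)
```

Comparing only the ISBN and name columns matters here: if branch_name were part of the compared rows, no Branch A row would ever equal a Branch B row and EXCEPT would remove nothing.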
For the symmetric difference, see the example here (PostgreSQL, but it should work with MSSQL as well).
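This is not the linked example, just one possible sketch of a symmetric difference built from two EXCEPT queries: (A minus B) combined with (B minus A). It assumes SQLAlchemy 1.4 or later and in-memory SQLite; the books and only_in helpers are hypothetical names. Each EXCEPT is wrapped in a subquery before the UNION ALL, which avoids relying on how a given backend handles nested compound selects.

```python
import sqlalchemy as sa

metadata = sa.MetaData()
tbl = sa.Table(
    "branch_book_list",
    metadata,
    sa.Column("branch_name", sa.String(10), primary_key=True),
    sa.Column("book_ISBN", sa.String(13), primary_key=True),
    sa.Column("book_name", sa.String(40)),
)

# Assumption: in-memory SQLite stands in for the real database.
engine = sa.create_engine("sqlite://")
metadata.create_all(engine)

# Sample data copied from the table in the question.
sample_rows = [
    {"branch_name": b, "book_ISBN": i, "book_name": n}
    for b, i, n in [
        ("Branch A", "1-56592-401-0", "Transact-SQL Programming"),
        ("Branch A", "1-56592-578-5", "Oracle SQL*Plus: The Definitive Guide"),
        ("Branch A", "1-56592-756-7", "Transact-SQL Cookbook"),
        ("Branch B", "1-56592-401-0", "Transact-SQL Programming"),
        ("Branch B", "1-56592-756-7", "Transact-SQL Cookbook"),
        ("Branch B", "1-56592-948-9", "Oracle SQL*Loader: The Definitive Guide"),
    ]
]


def books(branch):
    """Hypothetical helper: the (ISBN, name) set held by one branch."""
    return sa.select(tbl.c.book_ISBN, tbl.c.book_name).where(
        tbl.c.branch_name == branch
    )


def only_in(first, second):
    """Hypothetical helper: books held by `first` but not by `second`."""
    diff = books(first).except_(books(second)).subquery()
    return sa.select(
        sa.literal(first).label("branch_name"),
        diff.c.book_ISBN,
        diff.c.book_name,
    )


# Symmetric difference: the two one-sided differences cannot overlap,
# so UNION ALL is enough.
stmt = sa.union_all(
    only_in("Branch A", "Branch B"),
    only_in("Branch B", "Branch A"),
)

with engine.begin() as conn:
    conn.execute(sa.insert(tbl), sample_rows)
    # Sort in Python so the result order does not depend on the backend.
    sym_diff = sorted(tuple(row) for row in conn.execute(stmt))

for row in sym_diff:
    print(row)
```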