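Apparently, when a query combines ORDER BY + GROUP BY + GROUP_CONCAT() + the COUNT(*) OVER() window function, the ordering is somehow applied incorrectly in MySQL 8 (checked on 8.0.33 through 8.0.35). See the test case below (note that it is synthetic and oversimplified for clarity; in a real case the grouping obviously would not run on a single table).
Question 1: Why is the ordering in the first query not applied as I expect (i.e. ascending by the `sort` column)?
Question 2: What is the recommended fix (or workaround)? Duplicating the ordering specification inside COUNT(*) OVER() does not seem like a very elegant or robust solution.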
CREATE TABLE IF NOT EXISTS users (
    id INT AUTO_INCREMENT PRIMARY KEY,
    username VARCHAR(255),
    email VARCHAR(255),
    sort INT
);
INSERT INTO users (username, email, sort) VALUES ('user1', '[email protected]', 50);
INSERT INTO users (username, email, sort) VALUES ('user2', '[email protected]', 30);
INSERT INTO users (username, email, sort) VALUES ('user3', '[email protected]', 20);
INSERT INTO users (username, email, sort) VALUES ('user4', '[email protected]', 90);
INSERT INTO users (username, email, sort) VALUES ('user5', '[email protected]', 40);
INSERT INTO users (username, email, sort) VALUES ('user6', '[email protected]', 70);
Query:
SELECT
    sort,
    username,
    GROUP_CONCAT(email) AS email_concat,
    COUNT(*) OVER () AS total_count
FROM users
GROUP BY id
ORDER BY sort;
Result (unexpected, not ordered by `sort`):
| sort | username | email_concat | total_count |
| ---- | -------- | ----------------- | ----------- |
| 50 | user1 | [email protected] | 6 |
| 30 | user2 | [email protected] | 6 |
| 20 | user3 | [email protected] | 6 |
| 90 | user4 | [email protected] | 6 |
| 40 | user5 | [email protected] | 6 |
| 70 | user6 | [email protected] | 6 |
Execution plan:
-> Window aggregate with buffering: count(0) OVER ()
    -> Table scan on <temporary> (cost=2.5..2.5 rows=0)
        -> Temporary table (cost=0..0 rows=0)
            -> Group aggregate: group_concat(users.email separator ',')
                -> Sort: users.id
                    -> Stream results (cost=0.85 rows=6)
                        -> Sort: users.sort (cost=0.85 rows=6)
                            -> Table scan on users (cost=0.85 rows=6)
However, if we add ORDER BY to the window function (which, as I understand it, should be a no-op/redundant in this case), the ordering is applied as expected:
SELECT
    sort,
    username,
    GROUP_CONCAT(email) AS email_concat,
    COUNT(*) OVER (ORDER BY sort) AS total_count
FROM users
GROUP BY id
ORDER BY sort;
Result (as desired, ordered by `sort`):
| sort | username | email_concat | total_count |
| ---- | -------- | ----------------- | ----------- |
| 20 | user3 | [email protected] | 1 |
| 30 | user2 | [email protected] | 2 |
| 40 | user5 | [email protected] | 3 |
| 50 | user1 | [email protected] | 4 |
| 70 | user6 | [email protected] | 5 |
| 90 | user4 | [email protected] | 6 |
Execution plan:
-> Sort: users.sort
    -> Table scan on <temporary> (cost=2.5..2.5 rows=0)
        -> Temporary table (cost=0..0 rows=0)
            -> Window aggregate with buffering: count(0) OVER (ORDER BY users.sort )
                -> Sort: users.sort
                    -> Stream results
                        -> Group aggregate: group_concat(users.email separator ',')
                            -> Sort: users.id
                                -> Stream results (cost=0.85 rows=6)
                                    -> Table scan on users (cost=0.85 rows=6)
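This looks like a bug triggered by `GROUP_CONCAT()`, since removing it produces the correct ordering. Because you are grouping by the primary key, the aggregation is redundant: each group contains exactly one row (aggregation would be useful if you joined with another table in a many-to-one relationship).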
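To make that observation concrete, here is a sketch of the question's query with `GROUP_CONCAT(email)` swapped for the plain `email` column. Selecting non-aggregated columns is allowed here because `id` is the primary key, so the other columns are functionally dependent on it. Going by the behaviour described above, this should come back ordered by `sort`, but treat that as an expectation rather than something confirmed on every 8.0.x release.

```sql
-- Same query as in the question, minus the GROUP_CONCAT() aggregate.
-- Assumes the users table and sample rows from the question.
SELECT
    sort,
    username,
    email,                            -- plain column instead of GROUP_CONCAT(email)
    COUNT(*) OVER () AS total_count   -- still counts all grouped rows
FROM users
GROUP BY id
ORDER BY sort;
```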
The workaround is to put the grouped query in a subquery and apply `ORDER BY sort` in the outer query.
SELECT *
FROM (
    SELECT
        sort,
        username,
        GROUP_CONCAT(email) AS email_concat,
        COUNT(*) OVER () AS total_count
    FROM users
    GROUP BY id
) AS x
ORDER BY sort;
If you want a running count rather than the total, you need to use `OVER (ORDER BY id)`. For some reason, this also works around the bug.
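A sketch of that running-count variant, assuming the same `users` table as in the question. The alias `running_count` is my own naming and not part of the original query. Note that the count accumulates in `id` order while the final result is ordered by `sort`, so the counter values will not necessarily appear in increasing order in the output.

```sql
-- Running count by id, with the result set ordered by sort.
-- running_count is an illustrative alias, not from the question.
SELECT
    sort,
    username,
    GROUP_CONCAT(email) AS email_concat,
    COUNT(*) OVER (ORDER BY id) AS running_count  -- cumulative count in id order
FROM users
GROUP BY id
ORDER BY sort;
```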