MySQL: wrong sort order when a query uses ORDER BY + GROUP BY + GROUP_CONCAT() + a COUNT(*) OVER() window function


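Apparently, when a query combines ORDER BY + GROUP BY + GROUP_CONCAT() + a COUNT(*) OVER() window function, the sort is applied incorrectly in MySQL 8 (checked on 8.0.33 through 8.0.35). See the test case below. Note that it is synthetic and oversimplified for clarity; in a real case the grouping would obviously not run on a single table.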

Question 1: Why is the sort in the first query not applied the way I expect (that is, ascending by the sort column)?

Question 2: What is the recommended fix (or workaround)? Copying the sort specification into COUNT(*) OVER() does not seem like a very elegant or robust solution.

Schema and test data

CREATE TABLE IF NOT EXISTS users (
    id INT AUTO_INCREMENT PRIMARY KEY,
    username VARCHAR(255),
    email VARCHAR(255),
    sort INT
);

INSERT INTO users (username, email, sort) VALUES ('user1', '[email protected]', 50);
INSERT INTO users (username, email, sort) VALUES ('user2', '[email protected]', 30);
INSERT INTO users (username, email, sort) VALUES ('user3', '[email protected]', 20);
INSERT INTO users (username, email, sort) VALUES ('user4', '[email protected]', 90);
INSERT INTO users (username, email, sort) VALUES ('user5', '[email protected]', 40);
INSERT INTO users (username, email, sort) VALUES ('user6', '[email protected]', 70);

Unexpected result

Query:

SELECT
    sort,
    username,
    GROUP_CONCAT(email) AS email_concat,
    COUNT(*) OVER () AS total_count
FROM users
GROUP BY id
ORDER BY sort;

Result (unexpected, not ordered by sort):

| sort | username | email_concat      | total_count |
| ---- | -------- | ----------------- | ----------- |
| 50   | user1    | [email protected] | 6           |
| 30   | user2    | [email protected] | 6           |
| 20   | user3    | [email protected] | 6           |
| 90   | user4    | [email protected] | 6           |
| 40   | user5    | [email protected] | 6           |
| 70   | user6    | [email protected] | 6           |

Execution plan:

 -> Window aggregate with buffering: count(0) OVER () 
    -> Table scan on <temporary>  (cost=2.5..2.5 rows=0)
        -> Temporary table  (cost=0..0 rows=0)
            -> Group aggregate: group_concat(users.email separator ',')
                -> Sort: users.id
                    -> Stream results  (cost=0.85 rows=6)
                        -> Sort: users.sort  (cost=0.85 rows=6)
                            -> Table scan on users  (cost=0.85 rows=6)

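DB Fiddle 1 (incorrect result)

Expected result

However, if we add ORDER BY to the window function (which, as I understand it, is a no-op/redundant in this case), the sort is applied as expected: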

SELECT
    sort,
    username,
    GROUP_CONCAT(email) AS email_concat,
    COUNT(*) OVER (ORDER BY sort) AS total_count
FROM users
GROUP BY id
ORDER BY sort;

Result (as desired, ordered by sort):

| sort | username | email_concat      | total_count |
| ---- | -------- | ----------------- | ----------- |
| 20   | user3    | [email protected] | 1           |
| 30   | user2    | [email protected] | 2           |
| 40   | user5    | [email protected] | 3           |
| 50   | user1    | [email protected] | 4           |
| 70   | user6    | [email protected] | 5           |
| 90   | user4    | [email protected] | 6           |

Execution plan:

 -> Sort: users.sort
    -> Table scan on <temporary>  (cost=2.5..2.5 rows=0)
        -> Temporary table  (cost=0..0 rows=0)
            -> Window aggregate with buffering: count(0) OVER (ORDER BY users.sort ) 
                -> Sort: users.sort
                    -> Stream results
                        -> Group aggregate: group_concat(users.email separator ',')
                            -> Sort: users.id
                                -> Stream results  (cost=0.85 rows=6)
                                    -> Table scan on users  (cost=0.85 rows=6)

DB Fiddle 2 (correct result)

mysql sql-order-by window-functions group-concat mysql-8.0
1 Answer

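This looks like a bug triggered by the use of GROUP_CONCAT(), because removing it produces the correct ordering. Since you are grouping by the primary key, the aggregation is redundant: every group contains exactly one row. (The aggregation would be useful if you joined with another table in a many-to-one relationship.)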
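As a minimal sketch of that point, assuming the `users` table from the question: because `id` is the primary key, the simplified query can be written with no aggregation at all. Dropping `GROUP BY` together with `GROUP_CONCAT()` is a simplification on my part, and the `email_concat` alias is kept only so the column names match the original query.

```sql
-- Assumes the users table and test data from the question.
-- One row per id, so neither GROUP BY nor GROUP_CONCAT() is needed here.
SELECT
    sort,
    username,
    email AS email_concat,            -- plain column instead of GROUP_CONCAT(email)
    COUNT(*) OVER () AS total_count   -- empty OVER (): counts every row of the result
FROM users
ORDER BY sort;
```

This only applies to the simplified test case. Once a join produces several rows per `id`, the aggregation is needed again.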

The workaround is to put the grouped query in a subquery and apply ORDER BY sort in the main query:

SELECT *
FROM (
  SELECT
      sort,
      username,
      GROUP_CONCAT(email) AS email_concat,
      COUNT(*) OVER () AS total_count
  FROM users
  GROUP BY id) AS x
ORDER BY sort;
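A variant of the same idea, offered as an untested sketch rather than a confirmed fix: keep only the grouping in the derived table and compute the window count in the outer query. The outer `COUNT(*) OVER ()` counts the rows of the derived table, which has one row per group, so it should return the same total as before. The alias `g` is arbitrary, and I have not verified this form on 8.0.33 through 8.0.35.

```sql
-- Assumes the users table from the question; "g" is an arbitrary alias.
SELECT
    g.sort,
    g.username,
    g.email_concat,
    COUNT(*) OVER () AS total_count   -- counts rows of g, i.e. the number of groups
FROM (
    -- Grouping and aggregation only; no window function at this level.
    SELECT
        sort,
        username,
        GROUP_CONCAT(email) AS email_concat
    FROM users
    GROUP BY id
) AS g
ORDER BY g.sort;
```

Separating the aggregation from the window function and the final sort keeps each query level doing one job, which may also make the execution plan easier to read.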

If you wanted a running count rather than the total, you would need to use OVER (ORDER BY id). Somehow that also works around the bug.
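This is also why adding ORDER BY inside OVER() is not a no-op, as the 1 to 6 values in the question's second result show. When a window has an ORDER BY and no frame clause, the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so COUNT(*) becomes a running count. If you want to keep the window ORDER BY but still get the total, a sketch with an explicit full-partition frame could look like the following. Whether this form avoids the sorting bug on 8.0.33 through 8.0.35 is an assumption I have not tested.

```sql
-- Assumes the users table from the question.
-- The explicit frame spans the whole partition, so COUNT(*) returns the total
-- on every row even though the window has an ORDER BY.
SELECT
    sort,
    username,
    GROUP_CONCAT(email) AS email_concat,
    COUNT(*) OVER (
        ORDER BY sort
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS total_count
FROM users
GROUP BY id
ORDER BY sort;
```

This still repeats the sort specification inside the window, which the question calls inelegant, so the subquery form above is probably the cleaner choice.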
