Where did I go wrong with my indexing? Or is it something else?

Question

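Please help me optimize a single query. Here is the background: I have a table named "temp" into which statistics rows are inserted very frequently throughout the day (roughly 30 to 50 million rows per day, arriving in an irregular time pattern). Once a day, after midnight, my job selects those 30 to 50 million rows, groups them, computes some aggregates, and inserts the result into a "calculated" table.

Because there are so many rows, I decided it would be best to select the data one hour at a time, so I basically run 24 SELECT queries. The problem is that the query for a single hour of data is very slow: it takes about 90 seconds.

First, some basic facts. I am using MariaDB and the engine is InnoDB. The structure of the "temp" table is: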

CREATE TABLE temp (
    id                  char(36)              NOT NULL default uuid() PRIMARY KEY,
    device              tinyint    unsigned   NOT NULL,
    country_code        varchar(2)            NOT NULL,
    canvas_id           bigint     unsigned   NOT NULL,
    paid_configured     int        unsigned   NOT NULL,
    paid_count          int        unsigned   NOT NULL,
    non_paid_configured int        unsigned   NOT NULL,
    non_paid_count      int        unsigned   NOT NULL,
    timestamp           timestamp             NOT NULL default current_timestamp() 

) engine = InnoDB;

I have one index:

create index temp_composite_index on temp (
    timestamp,
    canvas_id,
    device,
    country_code
)

The query I am trying to optimize is:

    SELECT  canvas_id AS canvas_id,
            device AS device,
            country_code AS country_code,
            SUM(paid_configured) AS paid_configured_sum,
            SUM(paid_count) AS paid_count_sum,
            SUM(non_paid_configured) AS non_paid_configured_sum,
            SUM(non_paid_count) AS non_paid_count_sum
    FROM temp
    WHERE timestamp BETWEEN '2023-12-02 12:00:00' AND '2023-12-02 12:59:59' 
    GROUP BY canvas_id, device, country_code;

The EXPLAIN output is:

{
    "query_block": {
        "select_id": 1,
        "filesort": {
            "sort_key": "temp.canvas_id, temp.device, temp.country_code",
            "temporary_table": {
                "table": {
                    "table_name": "temp",
                    "access_type": "range",
                    "possible_keys": [
                        "temp_composite_index"
                    ],
                    "key": "temp_composite_index",
                    "key_length": "4",
                    "used_key_parts": [
                        "timestamp"
                    ],
                    "rows": 2609006,
                    "filtered": 100,
                    "index_condition": "temp.timestamp between '2023-12-10 12:00:00.000000' and '2023-12-10 12:59:59.000000'"
                }
            }
        }
    }
}

Additional data:

{
    "rows_total": 30000000,
    "rows_between_timestamps": 1249369,
    "unique_combinations": {
        "canvas_id": 20,
        "device": 2,
        "country_code": 4
    }
}

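I have tried many index combinations and column orders, and I have also changed the order of the columns in the WHERE and GROUP BY clauses, but nothing seems to make a difference. If you need any other information, feel free to ask. Thanks!

Edit:

I don't know why it was decided to use UUID instead of BIGINT AUTO_INCREMENT.
Yes, I always use the current timestamp when inserting.
The MariaDB version is 10.6.
It fits the pattern.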

Answer 1

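Some observations:

Taking 90 seconds to summarize a couple of million rows from an active table is not outrageously slow. No clever SQL sorcery will make it dramatically faster.
Your result set looks quite small: about 160 rows per hour, more or less (20 × 2 × 4, going by the counts you posted).
You are doing historical reporting on a table that only receives INSERT operations for recent timestamps.
If you can change it, a `BIGINT AUTO_INCREMENT` primary key would be more efficient than a `DEFAULT UUID()` primary key.
Using `BETWEEN` for date-range filters is unwise because the end of the range is inclusive. Use this instead, and note the `<` comparison operator: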
```
WHERE timestamp >= '2023-12-02 10:00' AND timestamp < '2023-12-02 11:00'  
```

My suggestions:

Do the whole day's report in one query rather than running a separate query for each hour. Something like this:

 SELECT  HOUR(timestamp) AS hour,
         canvas_id AS canvas_id,
         device AS device,
         country_code AS country_code,
         SUM(paid_configured) AS paid_configured_sum,
         SUM(paid_count) AS paid_count_sum,
         SUM(non_paid_configured) AS non_paid_configured_sum,
         SUM(non_paid_count) AS non_paid_count_sum
  FROM temp
 WHERE timestamp >= '2023-12-02'
   AND timestamp < '2023-12-02' + INTERVAL 1 DAY 
 GROUP BY HOUR(timestamp), canvas_id, device, country_code;

With the index you already have, this does about the same amount of work as running 24 separate time-bounded queries.
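Since the aggregated rows end up in a separate table anyway, the daily query can feed the insert directly. The following is only a sketch: the target table name `calculated` and all of its column names are assumptions made for illustration, because the question does not show that table's definition.

```sql
-- Hypothetical target table and column names; adjust them to the real schema.
INSERT INTO calculated
        (report_date, report_hour, canvas_id, device, country_code,
         paid_configured_sum, paid_count_sum,
         non_paid_configured_sum, non_paid_count_sum)
SELECT  DATE(timestamp),
        HOUR(timestamp),
        canvas_id,
        device,
        country_code,
        SUM(paid_configured),
        SUM(paid_count),
        SUM(non_paid_configured),
        SUM(non_paid_count)
  FROM temp
 WHERE timestamp >= '2023-12-02'                    -- start of day, inclusive
   AND timestamp <  '2023-12-02' + INTERVAL 1 DAY   -- start of next day, exclusive
 GROUP BY DATE(timestamp), HOUR(timestamp), canvas_id, device, country_code;
```

Keeping the hour as its own column preserves the hourly granularity that the 24 separate queries produced.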

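Because you are doing historical reporting on an INSERT-only table, you can set the InnoDB transaction isolation to a permissive level to reduce contention between the INSERT operations and the report. Issue the SQL command below immediately before the report query; without the SESSION or GLOBAL keyword it applies only to the next transaction. (Do not do this in a database with more complex transaction patterns unless you have studied what it does and have convinced your stakeholders that it will not produce bogus results.)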
```
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
```
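You could also consider a so-called "covering index", which lets the entire query be satisfied by scanning the index sequentially. It costs extra SSD/HDD space for the table, but that trade-off may be worth it if you badly need this report query to run fast.

    ALTER TABLE temp
        DROP INDEX temp_composite_index,
        ADD INDEX temp_composite_index (
            timestamp, canvas_id, device, country_code,
            paid_configured, paid_count, non_paid_configured, non_paid_count
        );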
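One way to check whether the wider index really covers the report is to run the query under EXPLAIN after rebuilding the index. In MariaDB's tabular EXPLAIN output, "Using index" among the entries in the Extra column indicates that the rows are read from the index alone, without visiting the table. A minimal sketch, assuming the eight-column index has already been created:

```sql
-- Run after the covering index is in place, then inspect the Extra column.
EXPLAIN
SELECT  HOUR(timestamp) AS hour,
        canvas_id,
        device,
        country_code,
        SUM(paid_configured),
        SUM(paid_count),
        SUM(non_paid_configured),
        SUM(non_paid_count)
  FROM temp
 WHERE timestamp >= '2023-12-02'
   AND timestamp <  '2023-12-02' + INTERVAL 1 DAY
 GROUP BY HOUR(timestamp), canvas_id, device, country_code;
```

If "Using index" does not appear, the query probably references a column that is missing from the index.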
