I have a query (simplified to show the problem, with 7 similar joins removed):
SELECT
`s`.`id`, `s`.`mobile_number`, MAX(`s`.`row_number`), `s`.`campaign_name`, `s`.`createdate`, `s`.`moddate`,
`se1`.`column_value` AS `first_name`,
`se2`.`column_value` AS `last_name`
FROM `kcms_shopper` `s`
LEFT JOIN `kcms_shopper_extend` `se1`
ON `s`.`mobile_number` = `se1`.`mobile_number`
AND `s`.`campaign_name` = `se1`.`campaign_name`
AND `s`.`row_number` = `se1`.`row_number`
LEFT JOIN `kcms_shopper_extend` `se2`
ON `s`.`mobile_number` = `se2`.`mobile_number`
AND `s`.`campaign_name` = `se2`.`campaign_name`
AND `s`.`row_number` = `se1`.`row_number`
WHERE `s`.`row_number` = (
SELECT MAX(`row_number`)
FROM `kcms_shopper_extend` sx
WHERE `s`.`mobile_number` = `sx`.`mobile_number`
AND `s`.`campaign_name` = `sx`.`campaign_name`
)
AND `se1`.`column_name` = "first_name"
AND `se2`.`column_name` = "last_name"
GROUP BY `s`.`mobile_number`, `s`.`row_number`
ORDER BY `s`.`mobile_number` ASC
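The goal is to fetch data from the shopper table and join it against shopper_extend several times.
Each shopper can have multiple rows (if they enter a campaign more than once with the same mobile number), and each campaign can have its own custom-configured set of captured columns, hence the join table.
The structure of shopper is as follows: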
CREATE TABLE `kcms_shopper` (
`id` int(11) NOT NULL,
`mobile_number` varchar(16) NOT NULL,
`campaign_name` varchar(64) NOT NULL,
`row_number` int(11) NOT NULL,
`createdate` datetime NOT NULL DEFAULT current_timestamp(),
`moddate` datetime NOT NULL DEFAULT current_timestamp() ON UPDATE current_timestamp()
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
ALTER TABLE `kcms_shopper`
ADD PRIMARY KEY (`id`),
ADD KEY `ix__mobile_number` (`mobile_number`) USING BTREE,
ADD KEY `ix__campaign_name` (`campaign_name`);
ALTER TABLE `kcms_shopper`
MODIFY `id` int(11) NOT NULL AUTO_INCREMENT;
The structure of shopper_extend is as follows:
CREATE TABLE `kcms_shopper_extend` (
`id` int(11) NOT NULL,
`shopper_id` int(11) NOT NULL,
`mobile_number` varchar(16) NOT NULL,
`campaign_name` varchar(64) NOT NULL,
`row_number` int(11) NOT NULL,
`column_name` varchar(64) NOT NULL,
`column_value` varchar(4096) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
ALTER TABLE `kcms_shopper_extend`
ADD PRIMARY KEY (`id`),
ADD KEY `ix__column_name` (`column_name`) USING BTREE,
ADD KEY `ix__mobile_number` (`mobile_number`),
ADD KEY `ix__campaign_name` (`campaign_name`);
ALTER TABLE `kcms_shopper_extend`
MODIFY `id` int(11) NOT NULL AUTO_INCREMENT;
Please help me retrieve:
1. The last entry of a user (row_number)
2. For a specific campaign
3. Using a specific mobile number
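The query above does not produce an error, but I believe it is wrong because it never finishes. It hangs my MySQL for at least 10 minutes (at the time of writing this question, the query still has not completed).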
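I see that you have no indexes defined in your tables. Indexes are important for optimizing joins. You can verify whether a query uses indexes by analyzing it with EXPLAIN.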
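As a minimal sketch of that check, EXPLAIN can be prefixed to any SELECT. The mobile number below is a made-up value used only for illustration.

```sql
-- Prefix a SELECT with EXPLAIN to see the access plan instead of the rows.
-- '15550100' is a hypothetical value, not taken from the question.
EXPLAIN
SELECT `id`, `campaign_name`, `row_number`
FROM `kcms_shopper`
WHERE `mobile_number` = '15550100';
```

In the report, a row with type ALL and key NULL means that table is read with a full scan; a named index in the key column means that index is used.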
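I tested your query with EXPLAIN and found that it performs the joins "the hard way", as indicated by "Using join buffer (hash join)".
The subquery is also a "DEPENDENT SUBQUERY", which means it has to be executed many times, once for each distinct value in the outer query. That is very expensive for performance.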
+----+--------------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------+
| 1 | PRIMARY | se1 | NULL | ALL | NULL | NULL | NULL | NULL | 4 | 25.00 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | s | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 33.33 | Using where; Using join buffer (hash join) |
| 1 | PRIMARY | se2 | NULL | ALL | NULL | NULL | NULL | NULL | 4 | 25.00 | Using where; Using join buffer (hash join) |
| 2 | DEPENDENT SUBQUERY | sx | NULL | ALL | NULL | NULL | NULL | NULL | 4 | 25.00 | Using where |
+----+--------------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------+
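To illustrate what makes a subquery "dependent": the original refers to the outer alias s inside the subquery, so it cannot be computed once. A commonly used alternative, shown here only as a sketch and not as the approach taken in this answer, computes every maximum once in a derived table and joins to it. The alias latest and the column max_row_number are names invented for this example, and the maxima are taken from kcms_shopper rather than kcms_shopper_extend as an assumption.

```sql
-- The derived table does not reference s, so it can be evaluated once.
SELECT s.`id`, s.`mobile_number`, s.`campaign_name`, s.`row_number`
FROM `kcms_shopper` AS s
JOIN (
  SELECT `mobile_number`,
         `campaign_name`,
         MAX(`row_number`) AS `max_row_number`
  FROM `kcms_shopper`
  GROUP BY `mobile_number`, `campaign_name`
) AS latest
  ON  latest.`mobile_number`  = s.`mobile_number`
  AND latest.`campaign_name`  = s.`campaign_name`
  AND latest.`max_row_number` = s.`row_number`;
```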
Then I created some indexes that I thought would help:
ALTER TABLE kcms_shopper
ADD PRIMARY KEY (id),
ADD INDEX bk1 (`row_number`),
ADD INDEX bk2 (`mobile_number`, `campaign_name`, `row_number`);
ALTER TABLE kcms_shopper_extend
ADD PRIMARY KEY (id),
ADD INDEX bk3 (mobile_number, campaign_name, `row_number`, column_name);
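If your tables already have PRIMARY KEY (id), as in the DDL shown in the question, the ADD PRIMARY KEY clauses above will fail with a "multiple primary key" error. A variant that assumes the primary keys are already in place and adds only the secondary indexes might look like this:

```sql
-- Assumes PRIMARY KEY (id) already exists on both tables.
ALTER TABLE `kcms_shopper`
  ADD INDEX `bk1` (`row_number`),
  ADD INDEX `bk2` (`mobile_number`, `campaign_name`, `row_number`);

ALTER TABLE `kcms_shopper_extend`
  ADD INDEX `bk3` (`mobile_number`, `campaign_name`, `row_number`, `column_name`);

-- List the indexes to confirm they were created.
SHOW INDEX FROM `kcms_shopper`;
SHOW INDEX FROM `kcms_shopper_extend`;
```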
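I refactored your subquery into another OUTER JOIN. I have used this method to implement the greatest-row-per-group type of query pattern. It allows the join to be optimized with indexes, just like any other join.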
SELECT
`s`.`id`,
`s`.`mobile_number`,
MAX(`s`.`row_number`),
`s`.`campaign_name`,
`s`.`createdate`,
`s`.`moddate`,
`se1`.`column_value` AS `first_name`,
`se2`.`column_value` AS `last_name`
FROM `kcms_shopper` `s`
LEFT JOIN `kcms_shopper_extend` `se1`
ON `s`.`mobile_number` = `se1`.`mobile_number`
AND `s`.`campaign_name` = `se1`.`campaign_name`
AND `s`.`row_number` = `se1`.`row_number`
LEFT JOIN `kcms_shopper_extend` `se2`
ON `s`.`mobile_number` = `se2`.`mobile_number`
AND `s`.`campaign_name` = `se2`.`campaign_name`
AND `s`.`row_number` = `se1`.`row_number`
LEFT JOIN `kcms_shopper` `sx`
ON `s`.`mobile_number` = `sx`.`mobile_number`
AND `s`.`mobile_number` = `sx`.`mobile_number`
AND `s`.`row_number` < `sx`.`row_number`
WHERE `sx`.`row_number` IS NULL
AND `se1`.`column_name` = "first_name"
AND `se2`.`column_name` = "last_name"
GROUP BY `s`.`mobile_number`, `s`.`row_number`
ORDER BY `s`.`mobile_number` ASC;
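The EXPLAIN analysis shows that it uses indexes for all the joins. It still has to do a table scan on the first table, but the other tables are all resolved by index lookups (indicated by type: ref in the EXPLAIN report).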
+----+-------------+-------+------------+------+---------------+------+---------+-----------------------------------------------+------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+-----------------------------------------------+------+----------+----------------------------------------------+
| 1 | SIMPLE | se1 | NULL | ALL | bk3 | NULL | NULL | NULL | 4 | 25.00 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | s | NULL | ref | bk1,bk2 | bk1 | 4 | test.se1.row_number | 1 | 25.00 | Using where |
| 1 | SIMPLE | se2 | NULL | ref | bk3 | bk3 | 324 | test.se1.mobile_number,test.se1.campaign_name | 1 | 25.00 | Using index condition |
| 1 | SIMPLE | sx | NULL | ref | bk1,bk2 | bk2 | 66 | test.se1.mobile_number | 1 | 25.00 | Using where; Not exists; Using index |
+----+-------------+-------+------------+------+---------------+------+---------+-----------------------------------------------+------+----------+----------------------------------------------+
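Note that the query as written joins se2 on se1's row_number and repeats the mobile_number condition for sx instead of comparing campaign_name. Both look like copy-and-paste slips carried over from the question, and the EXPLAIN report above reflects the query as written. Below is a sketch of what was presumably intended, also restricted to one campaign and one mobile number as the question asks. The two literal values are hypothetical. It assumes that (mobile_number, campaign_name, row_number) identifies a single row in kcms_shopper and that each shopper row has at most one first_name and one last_name entry; under those assumptions the GROUP BY and MAX() are no longer needed.

```sql
SELECT
  s.`id`,
  s.`mobile_number`,
  s.`row_number`,
  s.`campaign_name`,
  s.`createdate`,
  s.`moddate`,
  se1.`column_value` AS `first_name`,
  se2.`column_value` AS `last_name`
FROM `kcms_shopper` AS s
LEFT JOIN `kcms_shopper_extend` AS se1
  ON  se1.`mobile_number` = s.`mobile_number`
  AND se1.`campaign_name` = s.`campaign_name`
  AND se1.`row_number`    = s.`row_number`
  AND se1.`column_name`   = 'first_name'
LEFT JOIN `kcms_shopper_extend` AS se2
  ON  se2.`mobile_number` = s.`mobile_number`
  AND se2.`campaign_name` = s.`campaign_name`
  AND se2.`row_number`    = s.`row_number`    -- se2 here, not se1
  AND se2.`column_name`   = 'last_name'
LEFT JOIN `kcms_shopper` AS sx
  ON  sx.`mobile_number` = s.`mobile_number`
  AND sx.`campaign_name` = s.`campaign_name`  -- campaign_name, not mobile_number twice
  AND sx.`row_number`    > s.`row_number`
WHERE sx.`id` IS NULL                         -- no later row exists, so s is the latest
  AND s.`mobile_number` = '15550100'          -- hypothetical mobile number
  AND s.`campaign_name` = 'summer_promo';     -- hypothetical campaign name
```

Putting the column_name tests in the ON clauses rather than in WHERE keeps the LEFT JOIN behavior, so a shopper with no first_name or last_name entry still appears, with NULL in that column.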
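You might like my presentation How to Design Indexes, Really, or the video of me presenting it.
I also wrote a chapter on indexing in my book SQL Antipatterns, Volume 1: Avoiding the Pitfalls of Database Programming.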