SQL query with multiple joins on the same table


I have a query (simplified to show the problem, with 7 similar joins removed):

SELECT 
    `s`.`id`, `s`.`mobile_number`, MAX(`s`.`row_number`), `s`.`campaign_name`, `s`.`createdate`, `s`.`moddate`, 
    `se1`.`column_value` AS `first_name`, 
    `se2`.`column_value` AS `last_name`
FROM `kcms_shopper` `s`
LEFT JOIN `kcms_shopper_extend` `se1` 
    ON `s`.`mobile_number` = `se1`.`mobile_number` 
    AND `s`.`campaign_name` = `se1`.`campaign_name`
    AND `s`.`row_number` = `se1`.`row_number`
LEFT JOIN `kcms_shopper_extend` `se2` 
    ON `s`.`mobile_number` = `se2`.`mobile_number` 
    AND `s`.`campaign_name` = `se2`.`campaign_name`
    AND `s`.`row_number` = `se1`.`row_number`
WHERE `s`.`row_number` = (
    SELECT MAX(`row_number`) 
    FROM `kcms_shopper_extend` sx 
    WHERE `s`.`mobile_number` = `sx`.`mobile_number`
    AND `s`.`campaign_name` = `sx`.`campaign_name`
)
AND `se1`.`column_name` = "first_name"
AND `se2`.`column_name` = "last_name"
GROUP BY `s`.`mobile_number`, `s`.`row_number`
ORDER BY `s`.`mobile_number` ASC

The goal is to fetch data from the shopper table and join to shopper_extend multiple times.

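Each shopper can have multiple rows (if they entered a campaign more than once with the same mobile number), and each campaign can have its own custom-configured set of captured columns, hence the join table.

The structure of shopper is as follows: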

CREATE TABLE `kcms_shopper` (
    `id` int(11) NOT NULL,
    `mobile_number` varchar(16) NOT NULL,
    `campaign_name` varchar(64) NOT NULL,
    `row_number` int(11) NOT NULL,
    `createdate` datetime NOT NULL DEFAULT current_timestamp(),
    `moddate` datetime NOT NULL DEFAULT current_timestamp() ON UPDATE current_timestamp()
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

ALTER TABLE `kcms_shopper`
    ADD PRIMARY KEY (`id`),
    ADD KEY `ix__mobile_number` (`mobile_number`) USING BTREE,
    ADD KEY `ix__campaign_name` (`campaign_name`);

ALTER TABLE `kcms_shopper`
    MODIFY `id` int(11) NOT NULL AUTO_INCREMENT;

The structure of shopper_extend is as follows:

CREATE TABLE `kcms_shopper_extend` (
    `id` int(11) NOT NULL,
    `shopper_id` int(11) NOT NULL,
    `mobile_number` varchar(16) NOT NULL,
    `campaign_name` varchar(64) NOT NULL,
    `row_number` int(11) NOT NULL,
    `column_name` varchar(64) NOT NULL,
    `column_value` varchar(4096) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

ALTER TABLE `kcms_shopper_extend`
    ADD PRIMARY KEY (`id`),
    ADD KEY `ix__column_name` (`column_name`) USING BTREE,
    ADD KEY `ix__mobile_number` (`mobile_number`),
    ADD KEY `ix__campaign_name` (`campaign_name`);

ALTER TABLE `kcms_shopper_extend`
    MODIFY `id` int(11) NOT NULL AUTO_INCREMENT;    

Please help me retrieve:

1. The last entry of a user (row_number)
2. For a specific campaign
3. Using a specific mobile number

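The query above has not produced an error, but I believe it is wrong because it never completes. It hangs my MySQL for at least 10 minutes (at the time of writing this question, the query still had not finished).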

mysql query-optimization
1 Answer

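I see that your tables have no indexes defined. Indexes are important for optimizing joins. You can verify whether a query uses indexes by analyzing it with EXPLAIN.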
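As a minimal sketch of that check, you can list the indexes a table currently has with SHOW INDEX and prefix a query with EXPLAIN to see how each table would be accessed. The shortened SELECT here is only an illustration, not the full query from the question:

```sql
-- List the indexes currently defined on each table.
SHOW INDEX FROM kcms_shopper;
SHOW INDEX FROM kcms_shopper_extend;

-- Prefix a SELECT with EXPLAIN to see the planned access method per table.
-- type = ALL means a full table scan; type = ref or eq_ref means an index lookup.
EXPLAIN
SELECT s.id, se.column_value
FROM kcms_shopper s
LEFT JOIN kcms_shopper_extend se
    ON s.mobile_number = se.mobile_number
    AND s.campaign_name = se.campaign_name
    AND s.`row_number` = se.`row_number`;
```

The key and possible_keys columns of the EXPLAIN output show which index, if any, the optimizer chose for each table.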

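I tested your query with EXPLAIN and saw that it does the joins "the hard way," which is indicated by "Using join buffer (hash join)" in the Extra column.

The subquery is also a "dependent subquery," which means it has to be executed many times, once for each distinct value in the outer query. This is very expensive for performance.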

+----+--------------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------+
| id | select_type        | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                        |
+----+--------------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------+
|  1 | PRIMARY            | se1   | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    4 |    25.00 | Using where; Using temporary; Using filesort |
|  1 | PRIMARY            | s     | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    3 |    33.33 | Using where; Using join buffer (hash join)   |
|  1 | PRIMARY            | se2   | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    4 |    25.00 | Using where; Using join buffer (hash join)   |
|  2 | DEPENDENT SUBQUERY | sx    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    4 |    25.00 | Using where                                  |
+----+--------------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------+

Then I created some indexes that I thought would help:

ALTER TABLE kcms_shopper
    ADD PRIMARY KEY (id),
    ADD INDEX bk1 (`row_number`),
    ADD INDEX bk2 (`mobile_number`, `campaign_name`, `row_number`);

ALTER TABLE kcms_shopper_extend
    ADD PRIMARY KEY (id),
    ADD INDEX bk3 (mobile_number, campaign_name, `row_number`, column_name);

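I refactored your subquery into another OUTER JOIN. I have used this method to implement the greatest-row-per-group type of query pattern. It allows the join to be optimized with an index, just like any other join.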

SELECT
    `s`.`id`,
    `s`.`mobile_number`,
    MAX(`s`.`row_number`),
    `s`.`campaign_name`,
    `s`.`createdate`,
    `s`.`moddate`,
    `se1`.`column_value` AS `first_name`,
    `se2`.`column_value` AS `last_name`
FROM `kcms_shopper` `s`
LEFT JOIN `kcms_shopper_extend` `se1`
    ON `s`.`mobile_number` = `se1`.`mobile_number`
    AND `s`.`campaign_name` = `se1`.`campaign_name`
    AND `s`.`row_number` = `se1`.`row_number`
LEFT JOIN `kcms_shopper_extend` `se2`
    ON `s`.`mobile_number` = `se2`.`mobile_number`
    AND `s`.`campaign_name` = `se2`.`campaign_name`
    AND `s`.`row_number` = `se1`.`row_number`
LEFT JOIN `kcms_shopper` `sx`
    ON `s`.`mobile_number` = `sx`.`mobile_number`
    AND `s`.`mobile_number` = `sx`.`mobile_number`
    AND `s`.`row_number` < `sx`.`row_number`
WHERE `sx`.`row_number` IS NULL
AND `se1`.`column_name` = "first_name"
AND `se2`.`column_name` = "last_name"
GROUP BY `s`.`mobile_number`, `s`.`row_number`
ORDER BY `s`.`mobile_number` ASC;
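Two conditions in the query above look like typos carried over from the question: the se2 join compares against `se1`.`row_number`, and the sx join tests mobile_number twice instead of also matching campaign_name. The following is a sketch with those two conditions adjusted, assuming that was the intent. It also drops MAX() and GROUP BY, on the assumption that each shopper row has at most one shopper_extend row per column_name, since the sx join already keeps only the highest row_number. The EXPLAIN output in this answer was produced for the query as posted, not for this variant.

```sql
SELECT
    s.id,
    s.mobile_number,
    s.`row_number`,
    s.campaign_name,
    s.createdate,
    s.moddate,
    se1.column_value AS first_name,
    se2.column_value AS last_name
FROM kcms_shopper s
LEFT JOIN kcms_shopper_extend se1
    ON s.mobile_number = se1.mobile_number
    AND s.campaign_name = se1.campaign_name
    AND s.`row_number` = se1.`row_number`
LEFT JOIN kcms_shopper_extend se2
    ON s.mobile_number = se2.mobile_number
    AND s.campaign_name = se2.campaign_name
    AND s.`row_number` = se2.`row_number`   -- was se1 in the posted query
LEFT JOIN kcms_shopper sx
    ON s.mobile_number = sx.mobile_number
    AND s.campaign_name = sx.campaign_name  -- was a repeated mobile_number test
    AND s.`row_number` < sx.`row_number`
-- No sx row with a higher row_number exists, so s is the latest entry.
WHERE sx.`row_number` IS NULL
AND se1.column_name = 'first_name'
AND se2.column_name = 'last_name'
ORDER BY s.mobile_number ASC;
```

Note that the column_name tests in the WHERE clause discard shoppers that lack either value, so the two shopper_extend joins behave like inner joins here, as they do in the posted query.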

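The EXPLAIN analysis shows that it uses an index for all the joins. It still has to do a table scan on the first table, but the other tables are resolved by index lookups (indicated by type: ref in the EXPLAIN report).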

+----+-------------+-------+------------+------+---------------+------+---------+-----------------------------------------------+------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref                                           | rows | filtered | Extra                                        |
+----+-------------+-------+------------+------+---------------+------+---------+-----------------------------------------------+------+----------+----------------------------------------------+
|  1 | SIMPLE      | se1   | NULL       | ALL  | bk3           | NULL | NULL    | NULL                                          |    4 |    25.00 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | s     | NULL       | ref  | bk1,bk2       | bk1  | 4       | test.se1.row_number                           |    1 |    25.00 | Using where                                  |
|  1 | SIMPLE      | se2   | NULL       | ref  | bk3           | bk3  | 324     | test.se1.mobile_number,test.se1.campaign_name |    1 |    25.00 | Using index condition                        |
|  1 | SIMPLE      | sx    | NULL       | ref  | bk1,bk2       | bk2  | 66      | test.se1.mobile_number                        |    1 |    25.00 | Using where; Not exists; Using index         |
+----+-------------+-------+------------+------+---------------+------+---------+-----------------------------------------------+------+----------+----------------------------------------------+
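The question also asks for the last entry for one specific campaign and one specific mobile number. In that case the per-group logic may not be needed at all. The following is a sketch, not something tested against the asker's data: the phone number and campaign name are hypothetical placeholders, and it assumes the bk2 and bk3 indexes defined earlier in this answer exist. The column_name tests are placed in the ON clauses so that a missing value yields NULL instead of removing the row.

```sql
SELECT
    s.id,
    s.mobile_number,
    s.campaign_name,
    s.`row_number`,
    s.createdate,
    s.moddate,
    se1.column_value AS first_name,
    se2.column_value AS last_name
FROM kcms_shopper s
LEFT JOIN kcms_shopper_extend se1
    ON s.mobile_number = se1.mobile_number
    AND s.campaign_name = se1.campaign_name
    AND s.`row_number` = se1.`row_number`
    AND se1.column_name = 'first_name'
LEFT JOIN kcms_shopper_extend se2
    ON s.mobile_number = se2.mobile_number
    AND s.campaign_name = se2.campaign_name
    AND s.`row_number` = se2.`row_number`
    AND se2.column_name = 'last_name'
-- Hypothetical example values; substitute the real number and campaign.
WHERE s.mobile_number = '15551230000'
AND s.campaign_name = 'example_campaign'
-- The highest row_number is the shopper's most recent entry.
ORDER BY s.`row_number` DESC
LIMIT 1;
```

With equality tests on mobile_number and campaign_name, the optimizer has the option of using bk2 on kcms_shopper instead of scanning the table. Running EXPLAIN on your own data is the way to confirm which index it picks.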

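You might like my presentation How to Design Indexes, Really, or me presenting it.

I also wrote a chapter about indexing in my book SQL Antipatterns, Volume 1: Avoiding the Pitfalls of Database Programming.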
