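Suppose I have three tables: shipments, customers, and stores. The shipments table has two indexes: customer_id (INT, references the customers table) and date (DATETIME). The customers table has one index: store_id (INT, references the stores table).
If I filter shipments by date, I can see that the date index is used: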
EXPLAIN extended SELECT * FROM shipments
WHERE date >= '2020-04-01' AND date <= '2020-05-01';
+----+-------------+-----------+-------+---------------+------+---------+------+--------+----------+-------------+
| id | select_type | table     | type  | possible_keys | key  | key_len | ref  | rows   | filtered | Extra       |
+----+-------------+-----------+-------+---------------+------+---------+------+--------+----------+-------------+
| 1  | SIMPLE      | shipments | range | date          | date | 9       | NULL | 250796 | 100.00   | Using where |
+----+-------------+-----------+-------+---------------+------+---------+------+--------+----------+-------------+
However, the output of the next two queries confuses me, because it is almost identical:
EXPLAIN extended SELECT shipments.* FROM shipments
LEFT JOIN customers ON shipments.customer_id = customers.id
WHERE customers.store_id = 100 AND
shipments.date >= '2020-04-01 00:0:00.0' AND shipments.date <= '2020-05-01 00:0:00.0';
+----+-------------+-----------+-------+-------------------+-------------+---------+--------------+------+----------+--------------------------+
| id | select_type | table     | type  | possible_keys     | key         | key_len | ref          | rows | filtered | Extra                    |
+----+-------------+-----------+-------+-------------------+-------------+---------+--------------+------+----------+--------------------------+
| 1  | SIMPLE      | customers | ref   | PRIMARY, store_id | store_id    | 5       | const        | 38   | 100.00   | Using where; Using index |
| 1  | SIMPLE      | shipments | ref   | customer_id, date | customer_id | 5       | customers.id | 663  | 100.00   | Using where              |
+----+-------------+-----------+-------+-------------------+-------------+---------+--------------+------+----------+--------------------------+
EXPLAIN extended SELECT shipments.* FROM shipments
LEFT JOIN customers ON shipments.customer_id = customers.id
WHERE customers.store_id = 100;
+----+-------------+-----------+-------+-------------------+-------------+---------+--------------+------+----------+--------------------------+
| id | select_type | table     | type  | possible_keys     | key         | key_len | ref          | rows | filtered | Extra                    |
+----+-------------+-----------+-------+-------------------+-------------+---------+--------------+------+----------+--------------------------+
| 1  | SIMPLE      | customers | ref   | PRIMARY, store_id | store_id    | 5       | const        | 38   | 100.00   | Using where; Using index |
| 1  | SIMPLE      | shipments | ref   | customer_id       | customer_id | 5       | customers.id | 663  | 100.00   | Using where              |
+----+-------------+-----------+-------+-------------------+-------------+---------+--------------+------+----------+--------------------------+
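Question 1: Does this output mean that the first of these two queries does not use the date index at all? I have read that MySQL will not use more than one index per table, so does my date index make any difference to performance? (In my program, every query that filters by a date range looks very much like that one.) Assuming there are many customers, many shipments, and many queries of this kind, how should I improve performance?
Question 2: Why is the value of "rows" in the output the same for both queries, when the first query filters more than the second? Shouldn't it be different? Clearly I don't understand this well, so could someone explain it to me?
Thanks in advance!
Note: this is MySQL 5.5.56, and the tables are InnoDB.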
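1) Yes. It filters on customers.store_id first, then joins back to the shipments table on customer_id. The date condition is only applied as a filter on the rows fetched through the customer_id index (the "Using where" in the Extra column).
You can probably improve this by replacing the index on shipments (customer_id) with an index on shipments (customer_id, date), unless that index already covers both columns.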
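A minimal sketch of that index change, assuming the existing index is named customer_id (as the key column of the EXPLAIN output suggests). The name customer_id_date is made up for illustration, and the result should be verified on your own data:

```sql
-- Assumption: the existing single-column index is named `customer_id`.
-- `customer_id_date` is a hypothetical name for the new composite index.
ALTER TABLE shipments
    DROP INDEX customer_id,
    ADD INDEX customer_id_date (customer_id, `date`);

-- Re-run the EXPLAIN to check whether the optimizer picks the new index.
EXPLAIN EXTENDED SELECT shipments.* FROM shipments
LEFT JOIN customers ON shipments.customer_id = customers.id
WHERE customers.store_id = 100 AND
shipments.date >= '2020-04-01 00:00:00' AND shipments.date <= '2020-05-01 00:00:00';
```

Because customer_id stays the leftmost column, the composite index can still serve the join lookup, and the date range can then narrow the rows read for each customer.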
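2) Because it is an estimate based on index statistics, mainly the cardinality of each index. For a ref lookup on customer_id, the estimate is roughly the average number of shipments per customer_id value, so the extra date condition does not change it.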
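If you want to see where that estimate comes from, something like the following shows the statistics the optimizer works with. This is only a sketch; the numbers are sampled estimates and will vary between runs:

```sql
-- Refresh the index statistics (InnoDB estimates them by sampling).
ANALYZE TABLE shipments;

-- The Cardinality column is the estimated number of distinct values per index.
SHOW INDEX FROM shipments;

-- Total row count, for comparison with the cardinality of customer_id.
SELECT COUNT(*) FROM shipments;
```

Dividing the row count by the cardinality of the customer_id index should land roughly near the 663 shown in the EXPLAIN output.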