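I am looking for effective strategies to optimize a MySQL query that produces a Cartesian product over large datasets. The specific use case pairs cast members with crew members for every movie in the database; because the result contains one row per (cast member, crew member) pair of each movie, the number of combinations grows very quickly.
The current query is as follows: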
SELECT
m.title,
pc.person_name AS cast_member,
pr.person_name AS crew_member
FROM
movie m
JOIN
movie_cast mc ON m.movie_id = mc.movie_id
JOIN
person pc ON mc.person_id = pc.person_id
JOIN
movie_crew mcc ON m.movie_id = mcc.movie_id
JOIN
person pr ON mcc.person_id = pr.person_id;
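I considered using a derived table with the LATERAL clause to improve efficiency, but that produced inaccurate results. Are there other optimization techniques or approaches in MySQL that would improve this query's performance?
Please note that this query is purely for research purposes.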
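To put a number on the size of the result before running the full query, the pair count can be estimated from the two link tables alone. This is only a sketch: it assumes every `movie_id` and `person_id` in `movie_cast` and `movie_crew` has a matching row in `movie` and `person`, so that the inner joins in the query above drop nothing else. The column aliases are made up for illustration.

```sql
-- Rows produced per movie by the query above = cast rows * crew rows.
-- Movies that have no cast or no crew contribute nothing, as with the inner joins.
SELECT
    COUNT(*)                 AS movies_with_both,        -- hypothetical alias
    SUM(c.n_cast * w.n_crew) AS total_pair_rows,         -- hypothetical alias
    MAX(c.n_cast * w.n_crew) AS largest_movie_pair_rows  -- hypothetical alias
FROM (
    SELECT movie_id, COUNT(*) AS n_cast
    FROM movie_cast
    GROUP BY movie_id
) AS c
JOIN (
    SELECT movie_id, COUNT(*) AS n_crew
    FROM movie_crew
    GROUP BY movie_id
) AS w ON w.movie_id = c.movie_id;
```

If `total_pair_rows` is far larger than the combined row count of the two link tables, most of the cost is the size of the output itself rather than the join strategy.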
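The main idea is to return the cast and crew of the same movie in one result set, one person per row, instead of pairing every cast member with every crew member. Each movie then contributes at most (cast + crew) rows rather than (cast × crew) rows.
See the example: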
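select m.movie_id, m.title
      ,m1.person_name as cast_member
      ,m2.person_name as crew_member
from movie m
left join
(
  -- one row per (movie, person): cast_id is set when the person is in the cast,
  -- crew_id when the person is in the crew (both when the person is in both)
  select movie_id
        ,min(case when pt='cast' then person_id end) as cast_id
        ,min(case when pt='crew' then person_id end) as crew_id
  from (
    -- stack both link tables and tag each row with its origin
    select movie_id, person_id, 'cast' as pt from movie_cast
    union all
    select movie_id, person_id, 'crew' as pt from movie_crew
  ) cast_and_crew
  group by movie_id, person_id
) pcc on pcc.movie_id = m.movie_id
-- look up the names; whichever side is not set stays NULL
left join person m1 on m1.person_id = pcc.cast_id
left join person m2 on m2.person_id = pcc.crew_id
order by m.movie_id, coalesce(m1.person_id, m2.person_id)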
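Whichever form of the query is used, the joins on `movie_id` and `person_id` benefit from indexes on the link tables. The following is a minimal sketch, assuming `movie_cast` and `movie_crew` do not already have a key that starts with `(movie_id, person_id)`; the index names are hypothetical, and `EXPLAIN` is used only to check whether the optimizer actually picks them up.

```sql
-- Hypothetical index names; skip either statement if an equivalent key already exists.
CREATE INDEX idx_movie_cast_movie_person ON movie_cast (movie_id, person_id);
CREATE INDEX idx_movie_crew_movie_person ON movie_crew (movie_id, person_id);

-- Inspect the plan of the original pairing query after adding the indexes.
EXPLAIN
SELECT
    m.title,
    pc.person_name AS cast_member,
    pr.person_name AS crew_member
FROM movie m
JOIN movie_cast mc  ON m.movie_id = mc.movie_id
JOIN person pc      ON mc.person_id = pc.person_id
JOIN movie_crew mcc ON m.movie_id = mcc.movie_id
JOIN person pr      ON mcc.person_id = pr.person_id;
```

Indexes only reduce the cost of finding matching rows. They do not reduce the number of rows the pairing query has to return, which is why changing the shape of the result, as in the query above, tends to matter more on large data.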