I am trying to subtract one query result from another, based on two columns.
The table:
 id | word1    | lang1 | word2    | lang2
----+----------+-------+----------+-------
  1 | car      |     1 | car      |    15
  2 | table    |     1 | table    |    15
  3 | Chair    |     1 | cahair   |    13
  4 | CDplayer |    15 | CDplayer |     1
  5 | car      |     1 | car      |    13
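I want to get all the words in word1 in language 1 that have not yet been translated into language 15. In this case that would be Chair.
The table has 3 million rows, and the following query takes 1 minute to run: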
SELECT DISTINCT word1
FROM `translations`
WHERE lang1 = 1
AND lang2 != 15
AND word1 NOT IN (SELECT word1 FROM `translations` WHERE lang2 = 15)
LIMIT 10
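Running the two selects separately is very fast (0.006 seconds), and I could then subtract one result from the other in PHP with array_diff(), but there may be a simpler way to do this directly in MySQL.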
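For reference, the two-select workaround can be sketched roughly as follows. This is only an illustration under assumptions: Python's sqlite3 module stands in for PHP and MySQL, a set difference plays the role of array_diff(), and the helper name untranslated_words is made up for this sketch.

```python
import sqlite3

# Sample rows copied from the table in the question.
ROWS = [
    (1, "car", 1, "car", 15),
    (2, "table", 1, "table", 15),
    (3, "Chair", 1, "cahair", 13),
    (4, "CDplayer", 15, "CDplayer", 1),
    (5, "car", 1, "car", 13),
]


def untranslated_words(conn, source_lang, target_lang):
    """Hypothetical helper: run two cheap selects, then subtract the results."""
    all_words = {
        row[0]
        for row in conn.execute(
            "SELECT DISTINCT word1 FROM translations WHERE lang1 = ?",
            (source_lang,),
        )
    }
    translated = {
        row[0]
        for row in conn.execute(
            "SELECT DISTINCT word1 FROM translations"
            " WHERE lang1 = ? AND lang2 = ?",
            (source_lang, target_lang),
        )
    }
    # Set difference: the analogue of array_diff() on the two result lists.
    return sorted(all_words - translated)


# In-memory SQLite database used as a stand-in for the MySQL table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE translations (id INTEGER PRIMARY KEY, word1 TEXT,"
    " lang1 INTEGER, word2 TEXT, lang2 INTEGER)"
)
conn.executemany("INSERT INTO translations VALUES (?, ?, ?, ?, ?)", ROWS)

print(untranslated_words(conn, 1, 15))
```

On the five sample rows this leaves only Chair, which is the result a single SQL statement should reproduce.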
SELECT
origin.word1
FROM
( SELECT DISTINCT word1
FROM tableX
WHERE lang1 = 1
) AS origin
WHERE
NOT EXISTS
( SELECT *
FROM tableX AS trans
WHERE trans.lang1 = 1
AND trans.lang2 = 15
AND trans.word1 = origin.word1
) ;
Before running these queries, I would add an index on (lang1, word1) and another on (lang1, lang2, word1).
You can also try this variation (and check the EXPLAIN plans of both):
SELECT DISTINCT
word1
FROM
tableX AS origin
WHERE
lang1 = 1
AND
NOT EXISTS
( SELECT *
FROM tableX AS trans
WHERE trans.lang1 = 1
AND trans.lang2 = 15
AND trans.word1 = origin.word1
) ;
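A quick way to sanity-check the two NOT EXISTS variants together with the suggested indexes is to run them against the sample rows from the question. The sketch below is an assumption-laden stand-in: it uses Python's sqlite3 module rather than MySQL, the index names are invented, and EXPLAIN QUERY PLAN is SQLite's counterpart to MySQL's EXPLAIN, so the plan output will not look like what MySQL prints.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tableX (id INTEGER PRIMARY KEY, word1 TEXT,"
    " lang1 INTEGER, word2 TEXT, lang2 INTEGER)"
)
# Sample rows copied from the table in the question.
conn.executemany(
    "INSERT INTO tableX VALUES (?, ?, ?, ?, ?)",
    [
        (1, "car", 1, "car", 15),
        (2, "table", 1, "table", 15),
        (3, "Chair", 1, "cahair", 13),
        (4, "CDplayer", 15, "CDplayer", 1),
        (5, "car", 1, "car", 13),
    ],
)

# The two suggested indexes; the index names are made up for this sketch.
conn.execute("CREATE INDEX idx_lang1_word1 ON tableX (lang1, word1)")
conn.execute(
    "CREATE INDEX idx_lang1_lang2_word1 ON tableX (lang1, lang2, word1)"
)

DERIVED_TABLE_VARIANT = """
SELECT origin.word1
FROM (SELECT DISTINCT word1 FROM tableX WHERE lang1 = 1) AS origin
WHERE NOT EXISTS (
    SELECT * FROM tableX AS trans
    WHERE trans.lang1 = 1 AND trans.lang2 = 15
      AND trans.word1 = origin.word1
)
"""

DISTINCT_VARIANT = """
SELECT DISTINCT word1
FROM tableX AS origin
WHERE lang1 = 1
  AND NOT EXISTS (
    SELECT * FROM tableX AS trans
    WHERE trans.lang1 = 1 AND trans.lang2 = 15
      AND trans.word1 = origin.word1
)
"""

result_derived = [row[0] for row in conn.execute(DERIVED_TABLE_VARIANT)]
result_distinct = [row[0] for row in conn.execute(DISTINCT_VARIANT)]
print(result_derived, result_distinct)

# Print SQLite's plan for each variant (MySQL would use plain EXPLAIN).
for query in (DERIVED_TABLE_VARIANT, DISTINCT_VARIANT):
    for plan_row in conn.execute("EXPLAIN QUERY PLAN " + query):
        print(plan_row)
```

Both variants should agree on the result. Which one is faster on 3 million rows is something only the real EXPLAIN output on the real table can tell.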
SELECT DISTINCT NonTranslated.word1
FROM
(SELECT DISTINCT word1 FROM `translations` WHERE lang1 = 1 AND lang2 != 15) AS NonTranslated
LEFT JOIN
(SELECT DISTINCT word1 FROM `translations` WHERE lang1 = 1 AND lang2 = 15) AS Translated
ON NonTranslated.word1 = Translated.word1
WHERE Translated.word1 IS NULL;
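Let me know what EXPLAIN shows for this. I think it may be faster than the subquery version.
PS: This assumes that a word which has been translated even once is left out of the list.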
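To see how the LEFT JOIN ... IS NULL pattern behaves on the sample rows, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, so it says nothing about speed on the real table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE translations (id INTEGER PRIMARY KEY, word1 TEXT,"
    " lang1 INTEGER, word2 TEXT, lang2 INTEGER)"
)
# Sample rows copied from the table in the question.
conn.executemany(
    "INSERT INTO translations VALUES (?, ?, ?, ?, ?)",
    [
        (1, "car", 1, "car", 15),
        (2, "table", 1, "table", 15),
        (3, "Chair", 1, "cahair", 13),
        (4, "CDplayer", 15, "CDplayer", 1),
        (5, "car", 1, "car", 13),
    ],
)

# Anti-join: keep rows on the left side that find no partner on the right.
ANTI_JOIN = """
SELECT DISTINCT NonTranslated.word1
FROM (SELECT DISTINCT word1 FROM translations
      WHERE lang1 = 1 AND lang2 != 15) AS NonTranslated
LEFT JOIN (SELECT DISTINCT word1 FROM translations
           WHERE lang1 = 1 AND lang2 = 15) AS Translated
  ON NonTranslated.word1 = Translated.word1
WHERE Translated.word1 IS NULL
"""

anti_join_result = [row[0] for row in conn.execute(ANTI_JOIN)]
print(anti_join_result)
```

Note that car appears on the left side (row 5 goes to language 13) but is dropped because row 1 already translates it into language 15, which matches the assumption in the PS.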
SELECT
m.id ,
l.*
FROM mytable m
INNER JOIN (
SELECT
*
FROM mytable
WHERE lang2 != 15
GROUP BY id
) as l ON l.id = m.id
WHERE l.lang1 = 1
GROUP BY m.id
select word1
from translations t
group by word1
having max(lang1 = 1) = 1 and
max(lang2 = 15) = 0
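The GROUP BY / HAVING version can be tried the same way. The sketch below assumes SQLite via Python's sqlite3 module as a stand-in for MySQL (both evaluate a comparison such as lang1 = 1 to 0 or 1), and it adds two hypothetical "lamp" rows that are not in the question to show where this query differs from a NOT EXISTS query restricted to lang1 = 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE translations (id INTEGER PRIMARY KEY, word1 TEXT,"
    " lang1 INTEGER, word2 TEXT, lang2 INTEGER)"
)
conn.executemany(
    "INSERT INTO translations VALUES (?, ?, ?, ?, ?)",
    [
        # Sample rows copied from the table in the question.
        (1, "car", 1, "car", 15),
        (2, "table", 1, "table", 15),
        (3, "Chair", 1, "cahair", 13),
        (4, "CDplayer", 15, "CDplayer", 1),
        (5, "car", 1, "car", 13),
        # Hypothetical rows, not in the question: "lamp" reaches
        # language 15 only from a row whose lang1 is 2.
        (6, "lamp", 1, "lamp", 13),
        (7, "lamp", 2, "lampe", 15),
    ],
)

HAVING_QUERY = """
SELECT word1
FROM translations
GROUP BY word1
HAVING MAX(lang1 = 1) = 1 AND MAX(lang2 = 15) = 0
"""

NOT_EXISTS_QUERY = """
SELECT DISTINCT word1
FROM translations AS origin
WHERE lang1 = 1
  AND NOT EXISTS (
    SELECT * FROM translations AS trans
    WHERE trans.lang1 = 1 AND trans.lang2 = 15
      AND trans.word1 = origin.word1
)
"""

having_result = sorted(row[0] for row in conn.execute(HAVING_QUERY))
not_exists_result = sorted(row[0] for row in conn.execute(NOT_EXISTS_QUERY))
print(having_result, not_exists_result)
```

The HAVING condition looks at every row for a word, whatever its lang1, so the hypothetical lamp is excluded here while a NOT EXISTS query limited to lang1 = 1 keeps it. On the original five rows the two approaches agree.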