Question
The following two SQL statements are functionally identical:
SELECT DISTINCT a,b,c FROM table1
UNION DISTINCT
SELECT DISTINCT a,b,c FROM table2
and
SELECT a,b,c FROM table1
UNION DISTINCT
SELECT a,b,c FROM table2
...because DISTINCT is applied to the union as a whole, and so is redundant within the individual SELECTs.
(NOTE: UNION DISTINCT is identical to just UNION by itself, but I included the DISTINCT keyword for clarity.)
My question is: is there a performance difference or an execution-plan difference between the two in MySQL? Or does the optimizer turn the SELECT DISTINCTs into regular SELECTs?
Answer 1:
You need to check the execution plans. However, I would expect that the execution plans are different -- or at least they should be in some circumstances.
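One way to check this is to prefix each statement with EXPLAIN and compare the output on your own server. This is a minimal sketch using the table and column names from the question; the plans you get depend on your MySQL version, indexes, and data, so no particular output is assumed here.

```sql
-- Plan for the version with DISTINCT inside each SELECT
EXPLAIN
SELECT DISTINCT a, b, c FROM table1
UNION DISTINCT
SELECT DISTINCT a, b, c FROM table2;

-- Plan for the version that relies on UNION DISTINCT alone
EXPLAIN
SELECT a, b, c FROM table1
UNION DISTINCT
SELECT a, b, c FROM table2;
```

Comparing the key and Extra columns of the two results should show whether the optimizer treats the inner DISTINCT differently.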
The first query:
SELECT DISTINCT a, b, c FROM table1
UNION DISTINCT
SELECT DISTINCT a, b, c FROM table2
can readily take advantage of indexes on table1(a, b, c) and table2(a, b, c) before doing the final UNION. This should speed up the final union by reducing the size of the data. The second query doesn't have this advantage.
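For reference, the two composite indexes mentioned here could be created roughly as follows. The index names are hypothetical, and whether the optimizer actually uses them for the DISTINCT step is something to confirm with EXPLAIN.

```sql
-- Hypothetical index names; the column order matches the SELECT list
CREATE INDEX idx_table1_a_b_c ON table1 (a, b, c);
CREATE INDEX idx_table2_a_b_c ON table2 (a, b, c);
```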
In fact, the most efficient way to write this query would probably be to have the two indexes and use:
SELECT DISTINCT a, b, c FROM table1 t1
UNION ALL
SELECT DISTINCT a, b, c
FROM table2 t2
WHERE NOT EXISTS (SELECT 1 FROM table1 t1 WHERE t2.a = t1.a AND t2.b = t1.b AND t2.c = t1.c)
This is almost identical, although it might handle NULL values in the second table a bit differently.
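The NULL difference comes from the = comparisons: when a column in a table2 row is NULL, the comparison evaluates to NULL rather than true, so NOT EXISTS keeps that row even if an identical row exists in table1, whereas UNION treats NULLs as equal when removing duplicates. If matching UNION's behavior matters, one possible variant (a sketch, not part of the original answer) uses MySQL's NULL-safe equality operator <=>:

```sql
SELECT DISTINCT a, b, c FROM table1 t1
UNION ALL
SELECT DISTINCT a, b, c
FROM table2 t2
WHERE NOT EXISTS (
    SELECT 1
    FROM table1 t1
    -- <=> is true when both sides are NULL, unlike =
    WHERE t2.a <=> t1.a AND t2.b <=> t1.b AND t2.c <=> t1.c
);
```

Whether <=> can use the indexes as effectively as = is worth verifying with EXPLAIN before relying on this form.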
Source: https://stackoverflow.com/questions/30142553/combining-select-distinct-with-union-distinct-in-mysql-any-effect