Preserving the order of records from subquery while using “Union distinct” construct

Submitted by 狂风中的少年 on 2019-12-11 13:39:01

Question


I want to make sure that the order of the results from a subquery is preserved when using UNION DISTINCT. Please note that UNION DISTINCT is required to filter out duplicates while doing the union.

For example:

select columnA1, columnA2 from tableA order by [columnA3] asc
union distinct
select columnB1, columnB2 from tableB

When I run this, I expect the records from the first subquery (select columnA1, columnA2 from tableA order by [columnA3] asc) to come first, in the order returned by order by columnA3 asc, followed by those from tableB.

I am assuming that I cannot add a dummy column, because rows that differ only in that column would no longer count as duplicates and union distinct would keep both. So this won't work:

select column1, column2 from 
 ( select column1, column2, 1 as ORD from tableA order by [columnA3] asc
 union distinct
 select column1, column2, 2 as ORD from tableB 
 ) order by ORD

Answer 1:


Essentially, MySQL does not preserve the order of records from a subquery when the “Union distinct” construct is used. After a bit of research, I found that it works if we add a LIMIT clause or use nested queries. The two approaches are below:

Approach-1: Use Limit clause

         -- parentheses make ORDER BY and LIMIT apply to the first select only
         ( select columnA1, columnA2 from tableA order by [columnA3] asc Limit 100000000 )
         union distinct
         ( select columnB1, columnB2 from tableB )

I have tested this behavior using a few datasets and it seems to work consistently. MySQL's documentation ( http://dev.mysql.com/doc/refman/5.1/en/union.html ) also touches on this behavior: “Use of ORDER BY for individual SELECT statements implies nothing about the order in which the rows appear in the final result because UNION by default produces an unordered set of rows. Therefore, the use of ORDER BY in this context is typically in conjunction with LIMIT, so that it is used to determine the subset of the selected rows to retrieve for the SELECT, even though it does not necessarily affect the order of those rows in the final UNION result. If ORDER BY appears without LIMIT in a SELECT, it is optimized away because it will have no effect anyway.”

Please note that there is no particular reason for the specific LIMIT value other than it being high enough to cover every row.

Approach-2: A nested query like the one below also works.

        select column1, column2 from 
        ( select column1, column2 from tableA order by [columnA3] asc ) alias1
        union distinct
        ( select column1, column2 from tableB )

I couldn’t find a reason why the nested query works. There have been some references online (like the one from Phil McCarley at http://dev.mysql.com/doc/refman/5.0/en/union.html ), but no official documentation from MySQL.
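
Since the documentation quoted above gives no ordering guarantee for ORDER BY inside an individual SELECT, a more dependable route may be to sort the outer query explicitly. The following is only a minimal sketch, not something tested against the asker's data: it assumes the column names from the question, and the aliases column1, column2, ord, sort_key and u are hypothetical names introduced for illustration. It swaps UNION DISTINCT for UNION ALL plus GROUP BY, so that duplicates collapse while the sort keys stay available to the outer ORDER BY.

```sql
SELECT column1, column2
FROM (
    -- rows from tableA carry ord = 1 and their own sort key
    SELECT columnA1 AS column1, columnA2 AS column2,
           1 AS ord, columnA3 AS sort_key
    FROM tableA
    UNION ALL
    -- rows from tableB carry ord = 2 and no sort key
    SELECT columnB1, columnB2, 2, NULL
    FROM tableB
) AS u
-- GROUP BY removes duplicates, both across and within the two sets
GROUP BY column1, column2
-- a row found in both tables keeps ord = 1, so it sorts with tableA
ORDER BY MIN(ord), MIN(sort_key);
```

Rows that exist only in tableB come last, in no particular order among themselves, because nothing in the question specifies one.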




Answer 2:


select column1, column2 from 
 ( select column1, column2, 1 as ORD from tableA
 union distinct
 select tableB.column1, tableB.column2, 2 as ORD from tableB 
  LEFT JOIN tableA
      ON tableA.column1 = tableB.column1 AND tableA.column2 = tableB.column2
  WHERE tableA.column1 IS NULL
 ) t order by ORD

Note that UNION not only removes duplicates across the separate sets, but also within each set.

Alternatively:

select column1, column2 from 
 ( select column1, column2, 1 as ORD from tableA
 union distinct
 select column1, column2, 2 as ORD from tableB 
 WHERE (column1, column2) NOT IN (SELECT column1, column2 from tableA)
 ) t order by ORD
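
One caveat with the NOT IN form: if column1 or column2 can be NULL in tableA, a comparison can evaluate to NULL instead of false, and tableB rows may be dropped unexpectedly. A NOT EXISTS variant avoids that and behaves like the LEFT JOIN version. This is a sketch under the same assumptions as the queries above; the aliases a, b and t are hypothetical.

```sql
SELECT column1, column2
FROM (
    SELECT column1, column2, 1 AS ORD FROM tableA
    UNION DISTINCT
    -- keep only the tableB rows that have no exact match in tableA
    SELECT b.column1, b.column2, 2 AS ORD
    FROM tableB AS b
    WHERE NOT EXISTS (
        SELECT 1
        FROM tableA AS a
        WHERE a.column1 = b.column1
          AND a.column2 = b.column2
    )
) AS t
ORDER BY ORD;
```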


Source: https://stackoverflow.com/questions/7560091/preserving-the-order-of-records-from-subquery-while-using-union-distinct-const
