Performance of SQL “EXISTS” usage variants

-上瘾入骨i 2020-12-05 04:39

Is there any difference in the performance of the following three SQL statements?

SELECT * FROM tableA WHERE EXISTS (SELECT * FROM tableB WHERE tableA.x = ta         


        
9 Answers
  • 2020-12-05 05:21

    EXISTS returns a boolean, not actual data. That said, best practice is to use #3.

  • 2020-12-05 05:23

    In SQL Server at least,

    The smallest amount of data that can be read from disk is a single "page" of disk space. As soon as the processor reads one record that satisfies the subquery predicates, it can stop. The subquery is not executed as though it were standing on its own and then included in the outer query; it is executed as part of the complete query plan for the whole statement. So when it is used as a subquery, it really doesn't matter what is in the SELECT clause: nothing is "returned" to the outer query anyway, except a boolean indicating whether a single record was found.

    All three use the exact same execution plan.

    I always use [Select * From ... ] as I think it reads better, by not implying that I want something in particular returned from the subquery.

    EDIT: From Dave Costa's comment: Oracle also uses the same execution plan for all three options.

  • 2020-12-05 05:27

    This is one of those questions that verges on initiating some kind of holy war.

    There's a fairly good discussion about it here.

    I think the answer is probably to use the third option, but the speed increase is so infinitesimal it's really not worth worrying about. It's easily the kind of query that SQL Server can optimise internally anyway, so you may find that all options are equivalent.

  • 2020-12-05 05:29

    The truth about EXISTS is that the SELECT list of its subquery is not evaluated. You could try:

    SELECT * 
      FROM tableA 
     WHERE EXISTS (SELECT 1/0 
                     FROM tableB 
                    WHERE tableA.x = tableB.y)
    

    ...and you would expect a divide-by-zero error, but you won't get one because the expression is never evaluated. This is why my habit is to specify NULL in an EXISTS, to demonstrate that the SELECT list can be ignored:

    SELECT * 
      FROM tableA 
     WHERE EXISTS (SELECT NULL
                     FROM tableB 
                    WHERE tableA.x = tableB.y)
    

    All that matters in an EXISTS subquery is the FROM clause and everything after it: WHERE, GROUP BY, HAVING, etc.

    This question wasn't marked with a database in mind, and it should be because vendors handle things differently -- so test, and check the explain/execution plans to confirm. It is possible that behavior changes between versions...
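As a rough illustration of "test, and check the plans", here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up, and because the question's three statements are truncated in this copy, the select lists compared here (`*`, `NULL`, `1`) are an assumption about the variants rather than a quote. It only shows behavior on SQLite; other vendors need their own EXPLAIN tooling, and SQLite evaluates `1/0` to NULL instead of raising an error, so the divide-by-zero trick above cannot be reproduced with it.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the thread.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (x INTEGER);
    CREATE TABLE tableB (y INTEGER);
    INSERT INTO tableA (x) VALUES (1), (2), (3);
    INSERT INTO tableB (y) VALUES (2), (3), (3), (4);
""")

# Same outer query each time; only the select list inside EXISTS changes.
TEMPLATE = (
    "SELECT x FROM tableA "
    "WHERE EXISTS (SELECT {select_list} FROM tableB WHERE tableA.x = tableB.y) "
    "ORDER BY x"
)
SELECT_LISTS = ["*", "NULL", "1"]  # assumed variants, see the note above

results = {}
plans = {}
for select_list in SELECT_LISTS:
    sql = TEMPLATE.format(select_list=select_list)
    results[select_list] = conn.execute(sql).fetchall()
    # The last column of each EXPLAIN QUERY PLAN row is the readable detail.
    plans[select_list] = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

for select_list in SELECT_LISTS:
    print(select_list, results[select_list], plans[select_list])
```

Every variant returns the same rows here. Whether the printed plans match is worth reading off for your own engine and version instead of assuming.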

  • 2020-12-05 05:35

    In addition to what others have said, the practice of using SELECT 1 originated on old versions of Microsoft SQL Server (prior to 2005), whose query optimizer wasn't clever enough to avoid physically fetching fields from the table for SELECT *. No other DBMS, to my knowledge, has this deficiency.

    EXISTS tests for the existence of rows, not what's in them, so apart from an optimizer quirk like the one above, it doesn't really matter what's in the SELECT list.

    SELECT * seems to be the most common choice, but the others are acceptable as well.

  • 2020-12-05 05:39

    #3 should be the best one, as you don't need the returned data anyway. Fetching the fields would only add overhead.
