Is there any general rule on SQL query complexity vs. performance?

独厮守ぢ 2020-12-07 18:10

1) Are SQL query execution times O(n) in the number of joins if indexes are not used? If not, what kind of relationship are we likely to see? And can indexing im

4 Answers
  •  离开以前
    2020-12-07 18:13

    This depends on the query plan used.

    Even without indexes, modern servers can use a HASH JOIN or a MERGE JOIN, both of which are faster than the O(N * M) of a naive nested loop.

    More specifically, the complexity of a HASH JOIN is O(N + M), where N is the number of rows in the hashed table and M is the number of rows in the lookup table. Building the hash table and probing it take constant time per row on average.

    Complexity of a MERGE JOIN is O(N*Log(N) + M*Log(M)): it's the sum of times to sort both tables plus time to scan them.
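
    To make the difference concrete, here is a back-of-the-envelope sketch in Python. It only plugs row counts into the formulas above and ignores constant factors, so the numbers are rough orders of magnitude, not timings. The function name `estimated_steps` and the table sizes are made up for illustration.

```python
import math

def estimated_steps(n: int, m: int) -> dict:
    """Rough step counts (constants ignored) for joining tables of n and m rows.

    Assumes n >= 1 and m >= 1, since log2(0) is undefined.
    """
    return {
        "nested_loop": n * m,                                       # O(N * M)
        "hash_join": n + m,                                         # O(N + M)
        "merge_join": round(n * math.log2(n) + m * math.log2(m)),   # O(N log N + M log M)
    }

# Two hypothetical tables of one million rows each.
costs = estimated_steps(1_000_000, 1_000_000)
for name, steps in costs.items():
    print(f"{name:12s} ~{steps:,} steps")
```

    For these sizes the nested loop needs on the order of 10^12 steps, while the hash join and the merge join stay in the millions.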

    SELECT  T1.name, T2.date
    FROM    T1, T2
    WHERE   T1.id=T2.id
            AND T1.color='red'
            AND T2.type='CAR'
    

    If there are no indexes defined, the engine will select either a HASH JOIN or a MERGE JOIN.

    The HASH JOIN works as follows:

    1. The hashed table is chosen (usually the table with fewer records). Say it's T1.

    2. All records from T1 are scanned. If a record has color='red', it goes into the hash table with id as the key and name as the value.

    3. All records from T2 are scanned. If a record has type='CAR', its id is looked up in the hash table, and the values of name from all hash hits are returned along with the current value of date.
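
    The three steps above can be sketched in Python roughly as follows. This is an illustration of the idea, not how any particular engine implements it; the dictionary-based rows and the sample data are assumptions made up for the example.

```python
from collections import defaultdict

def hash_join(t1, t2):
    """Join t1 and t2 on id, keeping red rows of t1 and CAR rows of t2."""
    # Build phase: scan t1 once and hash the rows that pass the filter by id.
    names_by_id = defaultdict(list)
    for row in t1:
        if row["color"] == "red":
            names_by_id[row["id"]].append(row["name"])

    # Probe phase: scan t2 once and look each qualifying id up in the hash table.
    result = []
    for row in t2:
        if row["type"] == "CAR":
            for name in names_by_id.get(row["id"], []):
                result.append((name, row["date"]))
    return result

# Hypothetical sample rows.
t1 = [
    {"id": 1, "name": "a", "color": "red"},
    {"id": 2, "name": "b", "color": "blue"},
    {"id": 3, "name": "c", "color": "red"},
]
t2 = [
    {"id": 1, "type": "CAR", "date": "2020-01-01"},
    {"id": 3, "type": "BIKE", "date": "2020-02-01"},
    {"id": 3, "type": "CAR", "date": "2020-03-01"},
]

rows = hash_join(t1, t2)
print(rows)
```

    Each table is read exactly once, which is where the O(N + M) estimate comes from.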

    The MERGE JOIN works as follows:

    1. A copy of T1 (id, name) is created, sorted on id.

    2. A copy of T2 (id, date) is created, sorted on id.

    3. The pointers are set to the minimal values in both tables:

      >1  2<
       2  3
       2  4
       3  5
      
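    4. The values under the pointers are compared in a loop. If they don't match, the pointer at the smaller value is advanced. If they match, the records are returned and the left pointer is advanced, so that each duplicate id on the left still meets its partner. (This trace assumes the ids on the right are unique; a general implementation must also handle duplicates on the right.)

      >1  2<  - no match, left pointer is less. Advance left pointer
       2  3
       2  4
       3  5
      
       1  2<  - match, return records and advance left pointer
      >2  3
       2  4
       3  5
      
       1  2<  - match, return records and advance left pointer
       2  3
      >2  4
       3  5
      
       1  2<  - no match, right pointer is less. Advance right pointer
       2  3
       2  4
      >3  5
      
       1  2   - match, return records and advance left pointer
       2  3<
       2  4
      >3  5
      
       1  2   - the left pointer is out of range, the query is over
       2  3<
       2  4
       3  5
      >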
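
    A minimal Python sketch of the merge loop, run on the ids from the trace. The second element of each pair (l1, r1, and so on) is a placeholder payload invented for the example. The sketch also pairs up groups of equal keys on both sides, which a general implementation has to handle; it is an illustration, not any engine's actual code.

```python
def merge_join(left, right):
    """Merge join of two lists of (key, value) pairs.

    Returns a list of (key, left_value, right_value) for every matching pair.
    """
    # Sort phase: O(N log N + M log M).
    left = sorted(left, key=lambda r: r[0])
    right = sorted(right, key=lambda r: r[0])

    i = j = 0
    out = []
    # Merge phase: each pointer only moves forward.
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1          # left key is smaller: advance left pointer
        elif lk > rk:
            j += 1          # right key is smaller: advance right pointer
        else:
            # Find the end of the group of equal keys on the right.
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Pair every left row with this key with every row of that group.
            while i < len(left) and left[i][0] == lk:
                for k in range(j, j_end):
                    out.append((lk, left[i][1], right[k][1]))
                i += 1
            j = j_end
    return out

left = [(1, "l1"), (2, "l2"), (2, "l3"), (3, "l4")]
right = [(2, "r1"), (3, "r2"), (4, "r3"), (5, "r4")]
matches = merge_join(left, right)
print(matches)
```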
      

    "In such a case, could making the query more complex make it faster, because fewer rows are subjected to the join-level tests?"

    Sure.

    Your query without the WHERE clause:

    SELECT  T1.name, T2.date
    FROM    T1, T2
    

    is simpler, but it returns more rows (every combination of a T1 row with a T2 row, since the join condition is gone as well) and runs longer.
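
    This is easy to check with a small experiment. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the column types and the three sample rows per table are assumptions made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (id INTEGER, name TEXT, color TEXT);
    CREATE TABLE T2 (id INTEGER, type TEXT, date TEXT);
    INSERT INTO T1 VALUES (1, 'a', 'red'), (2, 'b', 'blue'), (3, 'c', 'red');
    INSERT INTO T2 VALUES (1, 'CAR', '2020-01-01'),
                          (3, 'BIKE', '2020-02-01'),
                          (3, 'CAR', '2020-03-01');
""")

# The query with its join condition and filters.
filtered = conn.execute("""
    SELECT T1.name, T2.date
    FROM   T1, T2
    WHERE  T1.id = T2.id
           AND T1.color = 'red'
           AND T2.type = 'CAR'
""").fetchall()

# The "simpler" query: no WHERE clause, so every T1 row pairs with every T2 row.
unfiltered = conn.execute("SELECT T1.name, T2.date FROM T1, T2").fetchall()

print(len(filtered), len(unfiltered))
conn.close()
```

    With 3 rows in each table, the unfiltered query already produces 3 * 3 rows, and the gap grows with the product of the table sizes.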
