Oracle <> , != , ^= operators

孤街浪徒 2020-12-03 08:24

I want to know the difference between these operators, mainly any difference in performance.

I have had a look at Difference between <> and != in SQL, it has no perf

4 answers
  •  天涯浪人
    2020-12-03 09:09

    I have tested the performance of the different syntaxes for the not-equal operator in Oracle. I have tried to eliminate all outside influences on the test.

    I am using an 11.2.0.3 database. No other sessions are connected and the database was restarted before commencing the tests.

    A schema was created with a single table and a sequence for the primary key:

    CREATE TABLE loadtest.load_test (
      id NUMBER NOT NULL,
      a VARCHAR2(1) NOT NULL,
      n NUMBER(2) NOT NULL,
      t TIMESTAMP NOT NULL
    );
    
    CREATE SEQUENCE loadtest.load_test_seq
    START WITH 0
    MINVALUE 0;
    

    The table was indexed to improve the performance of the query.

    ALTER TABLE loadtest.load_test
    ADD CONSTRAINT pk_load_test
    PRIMARY KEY (id)
    USING INDEX;
    
    CREATE INDEX loadtest.load_test_i1
    ON loadtest.load_test (a, n);
    

    Ten million rows were added to the table using the sequence, SYSDATE for the timestamp and random data via DBMS_RANDOM (A-Z) and (0-99) for the other two fields.
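
    The load statement itself is not shown in the post. A minimal sketch of one way to do it, assuming the table and sequence defined above; the row generator (two CONNECT BY subqueries cross-joined) and the aliases g1, g2 and lvl are illustrative choices, not the author's:

    ```sql
    -- Sketch only: the original load statement is not given.
    -- 10,000 x 1,000 generated rows = 10,000,000 inserts.
    INSERT INTO loadtest.load_test (id, a, n, t)
    SELECT loadtest.load_test_seq.NEXTVAL,
           DBMS_RANDOM.STRING('U', 1),          -- one random uppercase letter, A-Z
           TRUNC(DBMS_RANDOM.VALUE(0, 100)),    -- random integer, 0-99
           SYSDATE
    FROM   (SELECT LEVEL AS lvl FROM dual CONNECT BY LEVEL <= 10000) g1
    CROSS JOIN
           (SELECT LEVEL AS lvl FROM dual CONNECT BY LEVEL <= 1000) g2;

    COMMIT;
    ```

    Splitting the generator in two keeps each CONNECT BY small; a single CONNECT BY LEVEL <= 10000000 may need considerably more memory.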

    SELECT COUNT(*) FROM load_test;
    
    COUNT(*)
    ----------
    10000000
    
    1 row selected.
    

    The schema was analysed to provide good statistics.

    EXEC DBMS_STATS.GATHER_SCHEMA_STATS(ownname => 'LOADTEST', estimate_percent => NULL, cascade => TRUE);
    

    The three simple queries are:-

    SELECT a, COUNT(*) FROM load_test WHERE n <> 5 GROUP BY a ORDER BY a;
    
    SELECT a, COUNT(*) FROM load_test WHERE n != 5 GROUP BY a ORDER BY a;
    
    SELECT a, COUNT(*) FROM load_test WHERE n ^= 5 GROUP BY a ORDER BY a;
    

    These are exactly the same except for the syntax of the not-equals operator (not just <> and != but also ^=).

    First each query is run without collecting the result in order to eliminate the effect of caching.

    Next timing and autotrace were switched on to gather both the actual run time of the query and the execution plan.

    SET TIMING ON
    
    SET AUTOTRACE TRACE
    

    Now the queries are run in turn. First up is <>

    > SELECT a, COUNT(*) FROM load_test WHERE n <> 5 GROUP BY a ORDER BY a;
    
    26 rows selected.
    
    Elapsed: 00:00:02.12
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 2978325580
    
    --------------------------------------------------------------------------------------
    | Id  | Operation             | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
    --------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT      |              |    26 |   130 |  6626   (9)| 00:01:20 |
    |   1 |  SORT GROUP BY        |              |    26 |   130 |  6626   (9)| 00:01:20 |
    |*  2 |   INDEX FAST FULL SCAN| LOAD_TEST_I1 |  9898K|    47M|  6132   (2)| 00:01:14 |
    --------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - filter("N"<>5)
    
    
    Statistics
    ----------------------------------------------------------
              0  recursive calls
              0  db block gets
          22376  consistent gets
          22353  physical reads
              0  redo size
            751  bytes sent via SQL*Net to client
            459  bytes received via SQL*Net from client
              3  SQL*Net roundtrips to/from client
              1  sorts (memory)
              0  sorts (disk)
             26  rows processed
    

    Next !=

    > SELECT a, COUNT(*) FROM load_test WHERE n != 5 GROUP BY a ORDER BY a;
    
    26 rows selected.
    
    Elapsed: 00:00:02.13
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 2978325580
    
    --------------------------------------------------------------------------------------
    | Id  | Operation             | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
    --------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT      |              |    26 |   130 |  6626   (9)| 00:01:20 |
    |   1 |  SORT GROUP BY        |              |    26 |   130 |  6626   (9)| 00:01:20 |
    |*  2 |   INDEX FAST FULL SCAN| LOAD_TEST_I1 |  9898K|    47M|  6132   (2)| 00:01:14 |
    --------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - filter("N"<>5)
    
    
    Statistics
    ----------------------------------------------------------
              0  recursive calls
              0  db block gets
          22376  consistent gets
          22353  physical reads
              0  redo size
            751  bytes sent via SQL*Net to client
            459  bytes received via SQL*Net from client
              3  SQL*Net roundtrips to/from client
              1  sorts (memory)
              0  sorts (disk)
             26  rows processed
    

    Lastly ^=

    > SELECT a, COUNT(*) FROM load_test WHERE n ^= 5 GROUP BY a ORDER BY a;
    
    26 rows selected.
    
    Elapsed: 00:00:02.10
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 2978325580
    
    --------------------------------------------------------------------------------------
    | Id  | Operation             | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
    --------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT      |              |    26 |   130 |  6626   (9)| 00:01:20 |
    |   1 |  SORT GROUP BY        |              |    26 |   130 |  6626   (9)| 00:01:20 |
    |*  2 |   INDEX FAST FULL SCAN| LOAD_TEST_I1 |  9898K|    47M|  6132   (2)| 00:01:14 |
    --------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - filter("N"<>5)
    
    
    Statistics
    ----------------------------------------------------------
              0  recursive calls
              0  db block gets
          22376  consistent gets
          22353  physical reads
              0  redo size
            751  bytes sent via SQL*Net to client
            459  bytes received via SQL*Net from client
              3  SQL*Net roundtrips to/from client
              1  sorts (memory)
              0  sorts (disk)
             26  rows processed
    

    The execution plan for the three queries is identical and the timings are 2.12, 2.13 and 2.10 seconds.

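    It should be noted that whichever syntax is used in the query, the execution plan always displays <> in its predicate section, filter("N"<>5), and the plan hash value (2978325580) is the same in all three cases.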
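
    The predicate shown in the plan can be checked without running the query at all. A minimal sketch, assuming the same load_test table and using EXPLAIN PLAN with DBMS_XPLAN.DISPLAY; substitute <> or != for ^= and compare the Predicate Information section:

    ```sql
    -- Generate the plan only; the statement itself is not executed.
    EXPLAIN PLAN FOR
    SELECT a, COUNT(*) FROM load_test WHERE n ^= 5 GROUP BY a ORDER BY a;

    -- Display the most recently explained plan, including its predicates.
    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
    ```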

    The tests were repeated ten times for each operator syntax. These are the timings:-

    <>
    
    2.09
    2.13
    2.12
    2.10
    2.07
    2.09
    2.10
    2.13
    2.13
    2.10
    
    !=
    
    2.09
    2.10
    2.12
    2.10
    2.15
    2.10
    2.12
    2.10
    2.10
    2.12
    
    ^=
    
    2.09
    2.16
    2.10
    2.09
    2.07
    2.16
    2.12
    2.12
    2.09
    2.07
    

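    Whilst there is some variance of a few hundredths of a second, it is not significant. The mean of the ten runs is 2.106 seconds for <>, 2.110 for != and 2.107 for ^=, a spread far smaller than the run-to-run variation within any one operator. The results for each of the three syntax choices are effectively the same.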
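
    The summary figures can be recomputed from the lists above in the database itself. A sketch, using the built-in collection type sys.odcinumberlist purely as a convenient inline list of numbers; the column aliases are illustrative:

    ```sql
    -- Timings copied from the three lists above, one SELECT per operator.
    SELECT '<>' AS op,
           ROUND(AVG(column_value), 3) AS avg_secs,
           MIN(column_value) AS min_secs,
           MAX(column_value) AS max_secs
    FROM   TABLE(sys.odcinumberlist(2.09, 2.13, 2.12, 2.10, 2.07, 2.09, 2.10, 2.13, 2.13, 2.10))
    UNION ALL
    SELECT '!=',
           ROUND(AVG(column_value), 3), MIN(column_value), MAX(column_value)
    FROM   TABLE(sys.odcinumberlist(2.09, 2.10, 2.12, 2.10, 2.15, 2.10, 2.12, 2.10, 2.10, 2.12))
    UNION ALL
    SELECT '^=',
           ROUND(AVG(column_value), 3), MIN(column_value), MAX(column_value)
    FROM   TABLE(sys.odcinumberlist(2.09, 2.16, 2.10, 2.09, 2.07, 2.16, 2.12, 2.12, 2.09, 2.07));
    ```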

    The syntax choices are parsed, optimised and returned with the same effort in the same time. There is therefore no perceivable benefit from using one over another in this test.

    "Ah BC", you say, "in my tests I believe there is a real difference and you can not prove it otherwise".

    Yes, I say, that is perfectly true. You have not shown your tests, query, data or results. So I have nothing to say about your results. I have shown that, with all other things being equal, it doesn't matter which syntax you use.

    "So why do I see that one is better in my tests?"

    Good question. There are several possibilities:-

    1. Your testing is flawed (you did not eliminate outside factors such as other workload, caching etc. You have given no information from which we can make an informed decision).
    2. Your query is a special case (show me the query and we can discuss it).
    3. Your data is a special case (perhaps, but how? We don't see that either).
    4. There is some other outside influence.
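
    On the caching point in item 1, one way to put every timed run on the same footing is to clear the caches before each execution. A sketch, assuming a dedicated test instance and a user with the ALTER SYSTEM privilege; the test above describes a database restart and a warm-up run rather than this:

    ```sql
    -- Test systems only: both statements affect every session on the instance.
    ALTER SYSTEM FLUSH BUFFER_CACHE;  -- discard cached data blocks
    ALTER SYSTEM FLUSH SHARED_POOL;   -- discard cached cursors, forcing a hard parse

    SET TIMING ON
    SELECT a, COUNT(*) FROM load_test WHERE n <> 5 GROUP BY a ORDER BY a;
    ```

    Repeating the same two flushes before the != and ^= variants means each query starts cold, so any difference in elapsed time cannot be blamed on what an earlier run left in memory.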

    I have shown via a documented and repeatable process that there is no benefit to using one syntax over another. I believe that <>, != and ^= are synonymous.

    If you believe otherwise, fine, so:

    a) show a documented example that I can try myself

    and

    b) use the syntax which you think is best. If I am correct and there is no difference it won't matter. If you are correct then cool, you have an improvement for very little work.

    "But Burleson said it was better and I trust him more than you, Faroult, Lewis, Kyte and all those other bums."

    Did he say it was better? I don't think so. He didn't provide any definitive example, test or result but only linked to someone saying that != was better and then quoted some of their post.

    Show, don't tell.
