Does SQLite optimize a query with multiple AND conditions in the WHERE clause?

礼貌的吻别 2021-01-14 05:41

In SQL databases (I use Python + SQLite), how can we make sure that, if we have 1 million rows, the query

SELECT * FROM mytable WHERE myfunction(description) <          


        
4 answers
  •  轮回少年
    2021-01-14 06:29

    Inspired by @GordThompson's answer, here is a benchmark between:

    (1)  SELECT * FROM mytable WHERE col2 < 1000 AND myfunction(col1) < 500
    

    vs.

    (2)  SELECT * FROM mytable WHERE myfunction(col1) < 500 AND col2 < 1000
    

    Test (1) (easy-to-test condition first): 1.02 seconds

    import sqlite3, time, random
    
    def myfunc(x):
        time.sleep(0.001)  # wait 1 millisecond on each call of this function
        return x
    
    # Create an in-memory database and register myfunc as the SQL function "myfunction"
    db = sqlite3.connect(':memory:')
    db.create_function("myfunction", 1, myfunc)
    c = db.cursor()
    c.execute('CREATE TABLE mytable (col1 INTEGER, col2 INTEGER)')
    for i in range(10*1000):
        a = random.randint(0, 1000)
        c.execute('INSERT INTO mytable VALUES (?, ?)', (a, i))
    
    # Do the evil query and time it
    t0 = time.time()
    c.execute('SELECT * FROM mytable WHERE col2 < 1000 AND myfunction(col1) < 500')
    for e in c.fetchall():
        print(e)
    print("Elapsed time: %.2f" % (time.time() - t0))
    

    Result: 1.02 seconds. At 1 millisecond per call, this means that myfunc was called at most about 1000 times, i.e. only for the rows where col2 < 1000 and not for all 10k rows.


    Test (2) (Slow-to-compute condition first): 10.05 seconds

    Same code, but with:

    c.execute('SELECT * FROM mytable WHERE myfunction(col1) < 500 AND col2 < 1000')
    

    instead.

    Result: 10.05 seconds. At 1 millisecond per call, this means that myfunc was called about 10k times, i.e. for all 10k rows, even those for which the condition col2 < 1000 is not true.


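    Overall conclusion: in this benchmark, SQLite evaluates the AND conditions lazily (short-circuit) and in the order they are written, so the cheap condition should be written first, like this:

    ... WHERE <easy_condition> AND <expensive_condition>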
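
The timings above only imply how often the function ran. A more direct check is to count the calls instead of sleeping. The following is a minimal sketch, not part of the original benchmark: the helper name count_calls and the fixed random seed are assumptions made for illustration, and the exact counts may depend on the SQLite version bundled with your Python.

```python
import random
import sqlite3

def count_calls(where_clause, rows):
    """Run the SELECT with the given WHERE clause.

    Returns (result rows, number of times myfunction was called).
    count_calls is a hypothetical helper, not from the original answer.
    """
    calls = 0

    def myfunc(x):
        nonlocal calls
        calls += 1  # count the call instead of sleeping
        return x

    db = sqlite3.connect(":memory:")
    db.create_function("myfunction", 1, myfunc)
    db.execute("CREATE TABLE mytable (col1 INTEGER, col2 INTEGER)")
    db.executemany("INSERT INTO mytable VALUES (?, ?)", rows)
    result = db.execute("SELECT * FROM mytable WHERE " + where_clause).fetchall()
    db.close()
    return result, calls

random.seed(0)  # fixed seed (an assumption) so both runs see the same data
rows = [(random.randint(0, 1000), i) for i in range(10 * 1000)]

cheap_first, cheap_first_calls = count_calls(
    "col2 < 1000 AND myfunction(col1) < 500", rows)
slow_first, slow_first_calls = count_calls(
    "myfunction(col1) < 500 AND col2 < 1000", rows)

print("cheap condition first:", cheap_first_calls, "calls")
print("slow condition first: ", slow_first_calls, "calls")
```

If the benchmark's conclusion holds on your installation, the first count should be close to 1000 and the second close to 10000, while both queries return the same rows.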
    
