Question
sqlite> .schema movie
CREATE TABLE movie (
id INTEGER PRIMARY KEY, title TEXT, year INTEGER, nth TEXT, for_video BOOLEAN
);
sqlite> select count(*) from movie;
count(*)
----------
530256
sqlite>
The query should answer: which years in human history have seen at least one movie released?
$ sqlite movie.db
sqlite> select DISTINCT year from movie order by year asc;
sqlite> select * from sqlite_master where type='index';
index|index1|movie|194061|CREATE INDEX index1 on movie (year)
sqlite> .quit
$
$ time printf "select DISTINCT year from movie order by year;" | sqlite3 movie.db > /dev/null
real 0m0.086s
user 0m0.064s
sys 0m0.020s
$ time printf "select DISTINCT year from movie indexed by index1 order by year;" | sqlite3 movie.db > /dev/null
real 0m0.092s
user 0m0.088s
sys 0m0.000s
My understanding is that, without an index, running this SELECT on movie requires scanning all 530256 rows, because the table has 530256 records. To reduce these scans, index1 was created on table movie using the non-key field year.
Forcing the index with INDEXED BY, however, makes things slightly worse (0.092s versus 0.086s real time).
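One way to check what SQLite actually does for the two timed statements is EXPLAIN QUERY PLAN. The following is only a minimal sketch, assuming Python's built-in sqlite3 module and a small in-memory stand-in for movie.db; the helper query_plan and the generated rows are hypothetical and not part of the real database. If the planner already picks index1 for the plain statement, the INDEXED BY variant would not change the plan, which would be consistent with the near-identical timings.

```python
import sqlite3

def query_plan(conn, sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    # The detail text is the last column in both old and new SQLite versions.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# In-memory stand-in for movie.db (assumption: same schema, made-up rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT, year INTEGER, "
    "nth TEXT, for_video BOOLEAN)"
)
conn.executemany(
    "INSERT INTO movie (title, year) VALUES (?, ?)",
    [("movie %d" % i, 1900 + i % 100) for i in range(10000)],
)

sql_plain = "SELECT DISTINCT year FROM movie ORDER BY year"
sql_forced = "SELECT DISTINCT year FROM movie INDEXED BY index1 ORDER BY year"

# Plan before the index exists: expect a full table scan.
plan_no_index = query_plan(conn, sql_plain)

conn.execute("CREATE INDEX index1 ON movie (year)")

# Plans once index1 exists, with and without the INDEXED BY hint.
plan_auto = query_plan(conn, sql_plain)
plan_forced = query_plan(conn, sql_forced)

years = [row[0] for row in conn.execute(sql_plain)]

for label, plan in [
    ("no index", plan_no_index),
    ("index1, planner's choice", plan_auto),
    ("index1, INDEXED BY", plan_forced),
]:
    print(label, "->", plan)
```

Against the real database, the same statements can be prefixed with EXPLAIN QUERY PLAN directly at the sqlite> prompt.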
Can a SELECT query with DISTINCT be optimised using an index?
Does indexing improve performance only for SQL queries that have a WHERE clause and no GROUP BY, i.e. queries that return one specific (single) group of results?
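To make the second question concrete, the plans for a WHERE query and a GROUP BY query can be compared before and after creating index1. This is a sketch under assumptions: it uses Python's built-in sqlite3 module and an empty in-memory table with the same schema, and both example queries and the query_plan helper are hypothetical illustrations rather than queries from the real workload.

```python
import sqlite3

def query_plan(conn, sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Empty in-memory table with the same schema as movie (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT, year INTEGER, "
    "nth TEXT, for_video BOOLEAN)"
)

# Hypothetical example queries: one filtered by year, one grouped by year.
sql_where = "SELECT title FROM movie WHERE year = 1999"
sql_group = "SELECT year, count(*) FROM movie GROUP BY year"

# Plans without any index on year.
where_before = query_plan(conn, sql_where)
group_before = query_plan(conn, sql_group)

conn.execute("CREATE INDEX index1 ON movie (year)")

# Plans once index1 exists.
where_after = query_plan(conn, sql_where)
group_after = query_plan(conn, sql_group)

print("WHERE, no index:   ", where_before)
print("WHERE, index1:     ", where_after)
print("GROUP BY, no index:", group_before)
print("GROUP BY, index1:  ", group_after)
```

A plan line starting with SEARCH indicates an index lookup for matching rows, while SCAN indicates that every row of the table or index is visited.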
Source: https://stackoverflow.com/questions/47928267/running-distinct-statement-on-index-sqlite