Postgres LIKE '…%' doesn't use index

久未见 submitted on 2020-07-10 03:16:07

Question


I have a table in which I want to search by a prefix of the primary key. The primary key has values like 03.000221.1, 03.000221.2, 03.000221.3, etc. and I want to retrieve all that begin with 03.000221..

My first thought was to filter with LIKE '03.000221.%', thinking Postgres would be smart enough to look up 03.000221. in the index and perform a range scan from that point. But no, this performs a sequential scan.

                                                   QUERY PLAN                                                    
-----------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..253626.34 rows=78 width=669)
   Workers Planned: 2
   ->  Parallel Seq Scan on ...  (cost=0.00..252618.54 rows=32 width=669)
         Filter: ((id ~~ '03.000221.%'::text)
 JIT:
   Functions: 2
   Options: Inlining false, Optimization false, Expressions true, Deforming true

If I do an equivalent operation using a plain >= and < range, e.g. id >= '03.000221.' and id < '03.000221.Z', it does use the index:

                                                                 QUERY PLAN                                                                  
---------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using ... on ...  (cost=0.56..8.58 rows=1 width=669)
   Index Cond: ((id >= '03.000221.'::text) AND (id < '03.000221.Z'::text))

But this is dirtier and it seems to me that Postgres should be able to deduce it can do an equivalent index range lookup with LIKE. Why doesn't it?


Answer 1:


PostgreSQL will do this if you build the index with the text_pattern_ops operator class, or if you are using the C collation.
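As a minimal sketch of the first option: the table name items and the index name items_id_pattern_idx are hypothetical, since the real names are elided in the plans above. Only the id column comes from the question.

```sql
-- Hypothetical names: "items" and "items_id_pattern_idx" are placeholders.
-- text_pattern_ops makes the btree compare strings character by character,
-- ignoring the database collation, so a prefix maps to one contiguous range.
CREATE INDEX items_id_pattern_idx ON items (id text_pattern_ops);

-- Re-check the plan; with the new index available the planner can consider
-- an index scan for the prefix match instead of a sequential scan.
EXPLAIN SELECT * FROM items WHERE id LIKE '03.000221.%';
```

The existing primary key index should stay in place: an index built with text_pattern_ops does not serve ordinary <, <=, > and >= comparisons under a non-C collation.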

If you are using some other collation, PostgreSQL cannot assume that strings sharing a prefix sort next to each other, so it can't deduce much of anything about it. Observe this in the very common "en_US.utf8" collation:

select * from (values ('03.000221.1'), ('03.0002212'), ('03.000221.3')) f(x) order by x;
      x      
-------------
 03.000221.1
 03.0002212
 03.000221.3

Which then naturally leads to this wrong answer with your query:

select * from (values ('03.000221.1'), ('03.0002212'), ('03.000221.3')) f(id)
    where ((id >= '03.000221.'::text) AND (id < '03.000221.Z'::text));
     id      
-------------
 03.000221.1
 03.0002212
 03.000221.3
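For contrast, here is a sketch of the same two statements with the C collation requested explicitly through a COLLATE clause. Under C, strings are compared byte by byte, and '.' (0x2E) sorts before '2' (0x32), so the values that share the prefix stay together and the range behaves the way the question expected.

```sql
-- Byte-wise ordering: both '03.000221.' values sort before '03.0002212'.
select * from (values ('03.000221.1'), ('03.0002212'), ('03.000221.3')) f(x)
    order by x collate "C";
-- rows, in order: 03.000221.1, 03.000221.3, 03.0002212

-- The same range predicate under C no longer matches '03.0002212',
-- because '2' compares greater than '.' at the tenth character.
select * from (values ('03.000221.1'), ('03.0002212'), ('03.000221.3')) f(id)
    where id >= '03.000221.' collate "C" and id < '03.000221.Z' collate "C";
-- rows: 03.000221.1, 03.000221.3
```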


Source: https://stackoverflow.com/questions/61422684/postgres-like-doesnt-use-index
