Postgres uses wrong index in query plan

Posted by 痴心易碎 on 2019-12-24 00:59:02

Question


Below are two nearly identical queries; only the LIMIT differs. Nevertheless, the query plans and execution times are completely different: the first query is more than 300 times slower than the second.

The problem only occurs for a small number of owner_ids: owners with many routes (more than 1,000), none of which has been edited recently. The table route contains 2,806,976 rows. The owner in the example has 4,510 routes.

The database is hosted on Amazon RDS on a server with 34.2 GiB of memory, 4 vCPUs, and provisioned IOPS (instance type db.m2.2xlarge).

EXPLAIN ANALYZE SELECT
    id
FROM
    route
WHERE
    owner_id = 39127
ORDER BY
    edited_date DESC
LIMIT
    5;

Query plan:
"Limit  (cost=0.43..5648.85 rows=5 width=12) (actual time=1.046..12949.436 rows=5 loops=1)"
"  ->  Index Scan Backward using route_i_edited_date on route  (cost=0.43..5368257.28 rows=4752 width=12) (actual time=1.042..12949.418 rows=5 loops=1)"
"        Filter: (owner_id = 39127)"
"        Rows Removed by Filter: 2351712"
"Total runtime: 12949.483 ms"

EXPLAIN ANALYZE SELECT
    id
FROM
    route
WHERE
    owner_id = 39127
ORDER BY
    edited_date DESC
LIMIT
    15;

Query plan:
"Limit  (cost=13198.79..13198.83 rows=15 width=12) (actual time=37.781..37.821 rows=15 loops=1)"
"  ->  Sort  (cost=13198.79..13210.67 rows=4752 width=12) (actual time=37.778..37.790 rows=15 loops=1)"
"        Sort Key: edited_date"
"        Sort Method: top-N heapsort  Memory: 25kB"
"        ->  Index Scan using route_i_owner_id on route  (cost=0.43..13082.20 rows=4752 width=12) (actual time=0.039..32.425 rows=4510 loops=1)"
"              Index Cond: (owner_id = 39127)"
"Total runtime: 37.870 ms"

How can I ensure that Postgres uses the index route_i_owner_id?

I already tried the following things:

  • increasing statistics for edited_date and owner_id

    ALTER TABLE route ALTER COLUMN owner_id SET STATISTICS 1000;
    ALTER TABLE route ALTER COLUMN edited_date SET STATISTICS 1000;
    
  • running VACUUM ANALYZE on the whole database
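
One way to confirm that the new statistics were actually collected is to look at the pg_stats view after the ANALYZE run. The query below is only a sketch: it assumes the table lives in the public schema (as the CREATE INDEX statement further down suggests). Note that the row estimate in the plans above (4752) is already close to the actual count (4510), so the per-column statistics do not appear to be the problem.

```sql
-- Inspect the planner statistics gathered for the two columns.
-- Assumption: the table is public.route.
SELECT
    attname,
    n_distinct,   -- estimated number of distinct values (negative = fraction of rows)
    correlation   -- how closely the physical row order follows the column order
FROM
    pg_stats
WHERE
    schemaname = 'public'
    AND tablename = 'route'
    AND attname IN ('owner_id', 'edited_date');
```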

Solved with the following composite index:

CREATE INDEX route_i_owner_id_edited_date
  ON public.route
  USING btree
  (owner_id, edited_date DESC);

EXPLAIN ANALYZE SELECT
    id
FROM
    route
WHERE
    owner_id = 39127
ORDER BY
    edited_date DESC
LIMIT
    5;

"Limit  (cost=0.43..16.99 rows=5 width=12) (actual time=0.028..0.050 rows=5 loops=1)"
"  ->  Index Scan using route_i_owner_id_edited_date on route  (cost=0.43..15746.74 rows=4753 width=12) (actual time=0.025..0.039 rows=5 loops=1)"
"        Index Cond: (owner_id = 39127)"
"Total runtime: 0.086 ms"

Answer 1:


This query is too slow to begin with. It should take less than 1 s.

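Your first example walks the edited_date index backward, so the rows arrive already sorted, and filters each row on owner_id until it has found 5 matches. Because none of this owner's routes were edited recently, it discards 2,351,712 rows before it gets there.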

Your second example uses the owner_id index to fetch all 4,510 of the owner's rows, then sorts them by edited_date without the help of an index (a top-N heapsort). Neither approach is ideal.
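
The reason the plan flips between LIMIT 5 and LIMIT 15 can be reproduced from the numbers in the two plans. As a rough sketch of the planner's reasoning: it assumes the 4752 matching rows are spread evenly through the edited_date index, so it prices a LIMIT over the backward scan as the startup cost plus the fraction limit/4752 of the scan's run cost. The constants below are copied from the first plan; the column aliases are only illustrative.

```sql
-- Estimated cost of "Limit over backward index scan" for both limits.
-- 0.43 = startup cost, 5368257.28 = total cost, 4752 = estimated matching rows.
SELECT
    lim AS limit_rows,
    round(0.43 + (5368257.28 - 0.43) * lim / 4752.0, 2) AS estimated_cost
FROM
    (VALUES (5), (15)) AS t(lim);
-- limit_rows | estimated_cost
--          5 |        5648.85
--         15 |       16945.69
```

For LIMIT 5 the result matches the 5648.85 shown in the first plan and is lower than the 13198.83 of the sort-based plan, so the backward scan wins. For LIMIT 15 the same formula gives about 16945.69, which is higher than 13198.83, so the planner switches to the owner_id index. The even-spread assumption is what fails here: this owner's routes all sit at the old end of the index.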

A composite index on both owner_id and edited_date would probably speed it up, which makes sense if this kind of query is run often. This index could also replace one of the other indexes, and perhaps even both.
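
Before dropping either of the older indexes, it may be worth checking how often each one is still used. The following is a sketch under the assumption that the table is public.route and that the usage counters have been accumulating over a representative period; the DROP statement is left commented out on purpose.

```sql
-- Scan counts and sizes of all indexes on the route table.
-- An index whose idx_scan stays at or near zero is a candidate for removal.
SELECT
    indexrelname,
    idx_scan,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM
    pg_stat_user_indexes
WHERE
    schemaname = 'public'
    AND relname = 'route'
ORDER BY
    idx_scan;

-- Only after confirming that nothing else depends on it:
-- DROP INDEX CONCURRENTLY public.route_i_owner_id;
```

Because owner_id is the leading column of the composite index, lookups on owner_id alone can use it as well, which is why route_i_owner_id is the more likely candidate. Queries that sort or filter on edited_date without an owner_id condition would still need route_i_edited_date.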



Source: https://stackoverflow.com/questions/28740639/postgres-uses-wrong-index-in-query-plan
