Slow PostgreSQL query in production - help me understand this explain analyze output

若如初见. 提交于 2020-12-29 04:30:24

问题


I have a query that is taking 9 minutes to run on PostgreSQL 9.0.0 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit

This query is automatically generated by hibernate for my application. It's trying to find all of the "teacher members" in a school. A membership is a user with a role in a group. There are several types of groups, but here what matters are schools and services. If someone is a teacher member in a service and a member in this school (15499) then they are what we are looking for.

This query used to run fine in production and still runs fine in development, but in production it is now taking several minutes to run. Can you help me understand why?

Here's the query:

select distinct user1_.ID as ID14_, user1_.FIRST_NAME as FIRST2_14_, user1_.LAST_NAME as LAST3_14_, user1_.STREET_1 as STREET4_14_, user1_.STREET_2 as STREET5_14_, user1_.CITY as CITY14_, user1_.us_state_id as us7_14_, user1_.REGION as REGION14_, user1_.country_id as country9_14_, user1_.postal_code as postal10_14_, user1_.USER_NAME as USER11_14_, user1_.PASSWORD as PASSWORD14_, user1_.PROFESSION as PROFESSION14_, user1_.PHONE as PHONE14_, user1_.URL as URL14_, user1_.bio as bio14_, user1_.LAST_LOGIN as LAST17_14_, user1_.STATUS as STATUS14_, user1_.birthdate as birthdate14_, user1_.ageInYears as ageInYears14_, user1_.deleted as deleted14_, user1_.CREATEDATE as CREATEDATE14_, user1_.audit as audit14_, user1_.migrated2008 as migrated24_14_, user1_.creator as creator14_ 
from DIR_MEMBERSHIPS membership0_ 
inner join DIR_USERS user1_ on membership0_.USER_ID=user1_.ID, DIR_ROLES role2_, DIR_GROUPS group4_ 
where membership0_.role=role2_.ID 
and membership0_.GROUP_ID=group4_.id 
and membership0_.GROUP_ID=15499 
and case when membership0_.expires is null 
    then 1 
    else case when (membership0_.expires > CURRENT_TIMESTAMP and (membership0_.startDate is null or membership0_.startDate < CURRENT_TIMESTAMP)) 
        then 1 
        else 0 end 
    end =1 
and membership0_.deleted=false 
and role2_.deleted=false 
and role2_.NAME='ROLE_MEMBER' 
and group4_.deleted=false 
and user1_.STATUS='active' 
and user1_.deleted=false 
and (membership0_.USER_ID in (
    select membership7_.USER_ID 
    from DIR_MEMBERSHIPS membership7_, DIR_USERS user8_, DIR_ROLES role9_ 
    where membership7_.USER_ID=user8_.ID 
    and membership7_.role=role9_.ID 
    and case when membership7_.expires is null 
        then 1 
        else case when (membership7_.expires > CURRENT_TIMESTAMP 
                        and (membership7_.startDate is null or membership7_.startDate < CURRENT_TIMESTAMP)) 
            then 1 
            else 0 end 
        end =1 
    and membership7_.deleted=false 
    and role9_.NAME='ROLE_TEACHER_MEMBER'));

Explain analyze output:

 HashAggregate  (cost=61755.63..61755.64 rows=1 width=3334) (actual time=652504.302..652504.307 rows=4 loops=1)
   ->  Nested Loop  (cost=4355.35..61755.56 rows=1 width=3334) (actual time=304.450..652504.217 rows=6 loops=1)
     ->  Nested Loop  (cost=4355.35..61747.28 rows=1 width=3342) (actual time=304.419..652504.060 rows=6 loops=1)
           ->  Nested Loop Semi Join  (cost=4355.35..61738.97 rows=1 width=32) (actual time=304.385..652503.961 rows=6 loops=1)
                 Join Filter: (user_id = user_id)
                 ->  Nested Loop  (cost=0.00..32.75 rows=1 width=16) (actual time=0.190..26.703 rows=758 loops=1)
                       ->  Seq Scan on dir_roles role2_  (cost=0.00..1.25 rows=1 width=8) (actual time=0.032..0.038 rows=1 loops=1)
                             Filter: ((NOT deleted) AND ((name)::text = 'ROLE_MEMBER'::text))
                       ->  Index Scan using dir_memberships_role_group_id_index on dir_memberships membership0_  (cost=0.00..31.49 rows=1 width=24) (actual time=0.151..25.626 rows=758 loops=1)
                             Index Cond: ((role = role2_.id) AND (group_id = 15499))
                             Filter: ((NOT deleted) AND (CASE WHEN (expires IS NULL) THEN 1 ELSE CASE WHEN ((expires > now()) AND ((startdate IS NULL) OR (startdate < now()))) THEN 1 ELSE 0 END END = 1))
                 ->  Nested Loop  (cost=4355.35..61692.86 rows=1069 width=16) (actual time=91.088..843.967 rows=79986 loops=758)
                       ->  Nested Loop  (cost=4355.35..54185.33 rows=1069 width=8) (actual time=91.065..555.830 rows=79986 loops=758)
                             ->  Seq Scan on dir_roles role9_  (cost=0.00..1.25 rows=1 width=8) (actual time=0.006..0.013 rows=1 loops=758)
                                   Filter: ((name)::text = 'ROLE_TEACHER_MEMBER'::text)
                             ->  Bitmap Heap Scan on dir_memberships membership7_  (cost=4355.35..53983.63 rows=16036 width=16) (actual time=91.047..534.236 rows=79986 loops=758)
                                   Recheck Cond: (role = role9_.id)
                                   Filter: ((NOT deleted) AND (CASE WHEN (expires IS NULL) THEN 1 ELSE CASE WHEN ((expires > now()) AND ((startdate IS NULL) OR (startdate < now()))) THEN 1 ELSE 0 END END = 1))
                                   ->  Bitmap Index Scan on dir_memberships_role_index  (cost=0.00..4355.09 rows=214190 width=0) (actual time=87.050..87.050 rows=375858 loops=758)
                                         Index Cond: (role = role9_.id)
                       ->  Index Scan using dir_users_pkey on dir_users user8_  (cost=0.00..7.01 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=60629638)
                             Index Cond: (id = user_id)
           ->  Index Scan using dir_users_pkey on dir_users user1_  (cost=0.00..8.29 rows=1 width=3334) (actual time=0.011..0.011 rows=1 loops=6)
                 Index Cond: (id = user_id)
                 Filter: ((NOT deleted) AND ((status)::text = 'active'::text))
     ->  Index Scan using dir_groups_pkey on dir_groups group4_  (cost=0.00..8.28 rows=1 width=8) (actual time=0.023..0.023 rows=1 loops=6)
           Index Cond: (group4_.id = 15499)
           Filter: (NOT group4_.deleted)
Total runtime: 652504.827 ms
(29 rows)

I am reading and reading forum posts and the user manual, but I can't figure out what would make this run faster, except maybe if it were possible to make indexes for the select that uses the now() function.


回答1:


I rewrote your query and assume this will be faster:

SELECT u.id AS id14_, u.first_name AS first2_14_, u.last_name AS last3_14_, u.street_1 AS street4_14_, u.street_2 AS street5_14_, u.city AS city14_, u.us_state_id AS us7_14_, u.region AS region14_, u.country_id AS country9_14_, u.postal_code AS postal10_14_, u.user_name AS user11_14_, u.password AS password14_, u.profession AS profession14_, u.phone AS phone14_, u.url AS url14_, u.bio AS bio14_, u.last_login AS last17_14_, u.status AS status14_, u.birthdate AS birthdate14_, u.ageinyears AS ageinyears14_, u.deleted AS deleted14_, u.createdate AS createdate14_, u.audit AS audit14_, u.migrated2008 AS migrated24_14_, u.creator AS creator14_
FROM   dir_users u 
WHERE  u.status = 'active'
AND    u.deleted = FALSE
AND    EXISTS (
   SELECT 1
   FROM   dir_memberships m
   JOIN   dir_roles       r ON r.id = m.role
   JOIN   dir_groups      g ON g.id = m.group_id
   WHERE  m.group_id = 15499
   AND    m.user_id = u.id
   AND   (m.expires IS NULL
       OR m.expires > now() AND (m.startdate IS NULL OR m.startdate < now()))
   AND    m.deleted = FALSE
   AND    r.deleted = FALSE
   AND    r.name = 'ROLE_MEMBER'
   AND    g.deleted = FALSE
   )
AND    EXISTS (
    SELECT 1
    FROM   dir_memberships m
    JOIN   dir_roles       r ON r.id = m.role
    WHERE (m.expires IS NULL
        OR m.expires > now() AND (m.startDate IS NULL OR m.startDate < now()))
    AND    m.deleted = FALSE
    AND    m.user_id = u.id
    AND    r.name = 'ROLE_TEACHER_MEMBER'
    )

Rewrite with EXISTS

  • Replaced the weird case ... end = 1 expressions with simple expressions
  • Rewrote all JOINs with explicit join syntax to make it easier to read.
  • Transformed the big JOIN construct and the IN expression into two EXISTS semi-joins, which voids the necessity for DISTINCT. This should be quite a bit faster.
  • Lots of minor edits to make the query simpler, but they don't change the substance.
    Especially use simper aliases - what you had was noisy and confusing.

Indexes

If this isn't fast enough yet, and your write performance can deal with more indexes, add this partial multi-column index:

CREATE INDEX dir_memberships_g_id_u_id_idx ON dir_memberships (group_id, user_id)
WHERE  deleted = FALSE;

The WHERE conditions have to match your query for the index to be useful!

I assume that you already have primary keys and indexes on relevant foreign keys.

Further:

CREATE INDEX dir_memberships_u_id_role_idx ON dir_memberships (user_id, role)
WHERE  deleted = FALSE;

Why user_id a second time?. See:

  • Working of indexes in PostgreSQL
  • Is a composite index also good for queries on the first field?

Also, since user_id is already used in another index you are not blocking HOT-updates (which can only be used with columns not involved in any indexes.

Why role?
I assume both columns are of type integer (4 bytes). I have seen in your detailed question, that you run a 64 bit OS where MAXALIGN 8 bytes, so another integer will not make the index grow at all. I threw in role which might be useful for the second EXISTS semi-join.

If you have many "dead" users, this might also help:

CREATE INDEX dir_users_id_idx ON dir_users (id)
WHERE status = 'active' AND deleted = FALSE;

As always, check with EXPLAIN to see whether the indexes actually get used. You wouldn't want useless indexes consuming resources.

Are we fast yet?


Of course, all the usual advice for performance optimization applies, too.




回答2:


The query, minus the last 4 conditions, i.e.

and group4_.deleted=false 
and user1_.STATUS='active' 
and user1_.deleted=false 
and (membership0_.USER_ID in (...))

returns 758 rows. Each of these 758 rows will then go through the select membership7_.USER_ID ... subquery, which takes 843.967 miliseconds to run.

843.967 * 758 = 639726.986, there goes the 10 minutes.

As for tuning the query, I don't think you need DIR_USERS user8_ in the subquery. You can start by removing it, and also changing the subquery to use EXISTS instead of IN.

By the way, is the database being vacuumed? Even without any tuning, it doesn't look that complex a query or that much of data to require 10 minutes.



来源:https://stackoverflow.com/questions/11639323/slow-postgresql-query-in-production-help-me-understand-this-explain-analyze-ou

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!