Slow select distinct query on postgres

自闭症患者 2020-12-30 09:55

I'm doing the following two queries quite frequently on a table that essentially gathers up logging information. Both select distinct values from a huge number of rows but

4 answers
  •  北荒
    北荒 (original poster)
    2020-12-30 10:54

    You're selecting distinct values from the whole table, which automatically leads to a seq scan. You have millions of rows, so it will necessarily be slow.

    There's a trick to get the distinct values faster, but it only works when the data has a known (and reasonably small) set of possible values. For instance, I take it that your bundle_id references some kind of bundles table which is much smaller than audit_records. This means you can write:

    select bundles.bundle_id
    from bundles
    where exists (
        select 1
        from audit_records
        where audit_records.bundle_id = bundles.bundle_id
    );
    

    This should lead to a nested loop / seq scan on bundles -> index scan on audit_records using the index on bundle_id.
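
    To check whether the planner actually picks that plan, something like the following sketch may help. It assumes the schema guessed above (an audit_records.bundle_id column referencing bundles.bundle_id); the index name is hypothetical, and the create index step is only needed if no index on bundle_id exists yet.

    ```sql
    -- Assumption: audit_records.bundle_id references bundles.bundle_id.
    -- The index name below is made up for illustration; skip this statement
    -- if audit_records already has an index on bundle_id.
    create index if not exists audit_records_bundle_id_idx
        on audit_records (bundle_id);

    -- Refresh planner statistics so the row estimates are current.
    analyze audit_records;

    -- Show the chosen plan with real timings.
    -- Note: explain analyze actually executes the query.
    explain (analyze, buffers)
    select bundles.bundle_id
    from bundles
    where exists (
        select 1
        from audit_records
        where audit_records.bundle_id = bundles.bundle_id
    );
    ```

    If the output shows a seq scan on audit_records instead of an index scan per bundle, the rewrite is probably not helping in your case. Also keep in mind that this form only returns bundle_id values that are present in bundles, so unlike a plain select distinct it will never return a NULL bundle_id.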
