I'm doing the following two queries quite frequently on a table that essentially gathers up logging information. Both select distinct values from a huge number of rows but
You're selecting distinct values from the whole table, which automatically leads to a seq scan. You have millions of rows, so it will necessarily be slow.
There's a trick to get the distinct values faster, but it only works when the data has a known (and reasonably small) set of possible values. For instance, I take it that your bundle_id references some kind of bundles table, which is much smaller. This means you can write:
select bundles.bundle_id
from bundles
where exists (
    select 1 from audit_records
    where audit_records.bundle_id = bundles.bundle_id
);
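This should lead to a nested loop: a seq scan on bundles, and for each of its rows an index scan on audit_records using the index on bundle_id. Each index probe can stop at the first match, so the cost scales with the number of bundles rather than with the number of audit records.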
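To check whether you actually get that plan, something along these lines may help. It's only a sketch: I'm assuming PostgreSQL 9.5 or later (for `if not exists`), and the index name is made up, so skip the first statement if you already have an index on audit_records.bundle_id. Keep in mind that `explain (analyze, ...)` really runs the query, and that the planner may still pick a different join strategy depending on your statistics.

```sql
-- Assumption: audit_records.bundle_id may not be indexed yet.
-- The index name is hypothetical; use whatever fits your naming scheme.
create index if not exists audit_records_bundle_id_idx
    on audit_records (bundle_id);

-- Refresh planner statistics so the row estimates are current.
analyze audit_records;

-- Execute the query and show the plan that was actually used,
-- with timings and buffer usage.
explain (analyze, buffers)
select bundles.bundle_id
from bundles
where exists (
    select 1 from audit_records
    where audit_records.bundle_id = bundles.bundle_id
);
```

If the output shows a seq scan on audit_records instead of an index scan, the index is probably missing or the statistics are off.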