Postgres - Is this the right way to create a partial index on a boolean column?

Submitted by 自作多情 on 2019-12-03 14:13:21

I've confirmed the index works as expected.

I re-created the random data, only this time set diet_glutenfree to random() > 0.9 so there's only a 10% chance of an on bit.

I then re-created the indexes and tried the query again.
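
Those two steps can be sketched roughly as follows. The index definition is an assumption: only the index name and the indexed column are visible in the plan output, so the WHERE predicate shown here is a guess at what the partial index looks like.

```sql
-- Assumption: diet_glutenfree is a boolean column of RecipeMetadata.
-- random() is evaluated once per row, so roughly 10% of rows become true.
UPDATE RecipeMetadata
SET diet_glutenfree = (random() > 0.9);

-- Assumed definition of the partial index: only rows where the flag
-- is true get an index entry.
DROP INDEX IF EXISTS idx_recipemetadata_glutenfree;
CREATE INDEX idx_recipemetadata_glutenfree
    ON RecipeMetadata (diet_glutenfree)
    WHERE diet_glutenfree;

-- Refresh planner statistics so the row estimates reflect the new data.
ANALYZE RecipeMetadata;
```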

EXPLAIN SELECT RecipeId from RecipeMetadata where diet_glutenfree;

Returns:

'Index Scan using idx_recipemetadata_glutenfree on recipemetadata  (cost=0.00..135.15 rows=1030 width=16)'
'  Index Cond: (diet_glutenfree = true)'

And:

EXPLAIN SELECT RecipeId from RecipeMetadata where NOT diet_glutenfree;

Returns:

'Seq Scan on recipemetadata  (cost=0.00..214.26 rows=8996 width=16)'
'  Filter: (NOT diet_glutenfree)'

It seems my first attempt was polluted since PG estimates it's faster to scan the whole table rather than hit the index if it has to load over half the rows anyway.

However, I think I would get these exact results on a full index of the column. Is there a way to verify the number of rows indexed in a partial index?

UPDATE

The index is around 40k. I created a full index of the same column and it's over 200k, so it looks like it's definitely partial.
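
One way to check this more directly is to ask the system catalogs, sketched here on the assumption that the index is named idx_recipemetadata_glutenfree as in the plan output. Note that pg_class.reltuples is only an estimate, refreshed by VACUUM, ANALYZE and CREATE INDEX, so it may lag behind recent changes.

```sql
-- Estimated number of entries, on-disk size, and full definition
-- (including any WHERE predicate) of the index.
SELECT c.relname,
       c.reltuples::bigint                    AS estimated_entries,
       pg_size_pretty(pg_relation_size(c.oid)) AS index_size,
       pg_get_indexdef(c.oid)                  AS definition
FROM pg_class AS c
WHERE c.relname = 'idx_recipemetadata_glutenfree';

-- For comparison: the exact number of rows a partial index
-- with the predicate "WHERE diet_glutenfree" should cover.
SELECT count(*) AS rows_matching_predicate
FROM RecipeMetadata
WHERE diet_glutenfree;
```

If the index really is partial, estimated_entries should be close to rows_matching_predicate rather than to the total row count, and the definition column should end with the WHERE clause.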

An index on a one-bit field makes no sense. For understanding the decisions made by the planner, you must think in terms of pages, not in terms of rows.

For 8K pages and an (estimated) row size of 80 bytes, there are about 100 rows on every page. Assuming a random distribution with half the rows true, the chance that a page consists only of rows with a true value is negligible: pow(0.5, 100), about 7.9e-31 (and the same for 'false', of course). So practically every page contains at least one matching row, and for a query on diet_glutenfree = true every page has to be fetched anyway and filtered afterwards. Using an index would only cause more pages (those of the index) to be fetched.
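
The arithmetic can be reproduced in Postgres itself. This is only a sketch under the same assumptions as above (about 100 rows per page, values distributed randomly); the 10% column is an extra illustration using the selectivity from the question, not something the answer computed.

```sql
SELECT
    -- Chance that all 100 rows on a page have the same value when
    -- half the rows are true: roughly 7.9e-31.
    power(0.5::float8, 100)     AS p_page_uniform_at_50pct,
    -- Chance that a page holds at least one true row when only 10%
    -- of the rows are true: roughly 0.99997.
    1 - power(0.9::float8, 100) AS p_page_has_match_at_10pct;
```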
