Performance Tuning: Create index for boolean column

Posted by 我的未来我决定 on 2019-11-29 01:44:49

Question


I have written a daemon that fetches records from one database and inserts them into another to keep the two synchronized. It selects records based on a per-record indicator flag, which is a boolean column.

My tables have hundreds of thousands of records. When I select the records where sync_done is false, will that cause any database performance issues? Or should I index the sync_done column (boolean data type) to improve performance, since the select only targets records with a sync_done value of false?

For example, say I have 10000 records. Of those, 9500 have already been synchronized (sync_done is true), so the select only needs the remaining 500 (sync_done is false). The 9500 synchronized records should never be touched by the select.

Please suggest how I might proceed.


Answer 1:


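For a query like this, a partial index would serve you best. It only contains the rows that still have to be synchronized, so it stays small no matter how many rows are already done.

CREATE INDEX ON tbl (id) WHERE sync_done = FALSE;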
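
To make the idea concrete, here is a minimal sketch of how the daemon's queries could line up with such an index. The table layout, the payload column, the index name tbl_unsynced_idx, the batch size and the id 42 are all assumptions for illustration; only tbl, id and sync_done come from the thread.

```sql
-- Hypothetical table layout; only tbl, id and sync_done appear in the thread.
CREATE TABLE tbl (
    id        bigserial PRIMARY KEY,
    payload   text,                            -- assumed data column
    sync_done boolean NOT NULL DEFAULT FALSE
);

-- Partial index: holds entries only for rows that are not yet synchronized.
CREATE INDEX tbl_unsynced_idx ON tbl (id) WHERE sync_done = FALSE;

-- The daemon's fetch. Its WHERE clause repeats the index predicate,
-- which is what allows the planner to consider the partial index.
SELECT id, payload
FROM   tbl
WHERE  sync_done = FALSE
ORDER  BY id
LIMIT  1000;                                   -- assumed batch size

-- After a row has been copied, flip the flag. The row then no longer
-- satisfies the index predicate and drops out of the index.
UPDATE tbl SET sync_done = TRUE WHERE id = 42; -- 42 is a placeholder id
```

Running EXPLAIN on the SELECT is the straightforward way to check whether the planner actually picks the partial index for your data.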

However, for a use case like this, other synchronization methods may be preferable.

  • Have a look at LISTEN / NOTIFY.
  • Or use a trigger in combination with dblink.
  • Or one of the many available replication methods.
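
As a rough sketch of the first two suggestions combined, a trigger could send a notification whenever a row needs syncing, so the daemon can wait instead of polling. The function name notify_unsynced, the trigger name and the channel name tbl_unsynced are made up for this example, and it assumes a table tbl with columns id and sync_done as in the question.

```sql
-- Trigger function: publish the id of the affected row on a channel.
CREATE FUNCTION notify_unsynced() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    PERFORM pg_notify('tbl_unsynced', NEW.id::text);
    RETURN NEW;  -- return value is ignored for AFTER row triggers
END;
$$;

-- Fire only for rows that still need to be synchronized.
CREATE TRIGGER tbl_notify_unsynced
AFTER INSERT OR UPDATE ON tbl
FOR EACH ROW
WHEN (NEW.sync_done = FALSE)
EXECUTE PROCEDURE notify_unsynced();

-- In the daemon's database session:
LISTEN tbl_unsynced;
```

Notifications only reach sessions that are listening at the time, so the sync_done flag would still be needed as the source of truth, for example to catch up after the daemon restarts.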



Answer 2:


I suggest that you do not index the table (the boolean is a low-cardinality field), but partition it on the boolean value instead.

See: http://www.postgresql.org/docs/9.1/static/ddl-partitioning.html
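
The linked 9.1 documentation describes partitioning through table inheritance. As an alternative sketch, and assuming PostgreSQL 11 or later, declarative list partitioning on the flag might look like the following. The table names tbl_part, tbl_part_pending and tbl_part_done and the payload column are hypothetical.

```sql
-- Parent table, partitioned by the boolean flag (assumes PostgreSQL 11+).
CREATE TABLE tbl_part (
    id        bigint  NOT NULL,
    payload   text,                            -- assumed data column
    sync_done boolean NOT NULL DEFAULT FALSE
) PARTITION BY LIST (sync_done);

-- One partition per flag value.
CREATE TABLE tbl_part_pending PARTITION OF tbl_part FOR VALUES IN (FALSE);
CREATE TABLE tbl_part_done    PARTITION OF tbl_part FOR VALUES IN (TRUE);

-- With partition pruning, this query only needs to read tbl_part_pending.
SELECT id, payload FROM tbl_part WHERE sync_done = FALSE;
```

One trade-off to keep in mind: setting sync_done to TRUE changes the partition key, so each such update moves the row from one partition to the other rather than updating it in place.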




Answer 3:


A table with records and a boolean field should be the way to do it.

Here is something which I believe might help you...

Bitmap Index

Alternative of Bitmap Index in PostgreSQL




Answer 4:


An index will certainly help, but polling can impose load and cause concurrency issues if your database is heavily used. It might be worth considering a notification method such as AMQP, or a trigger/database-queue-based approach such as Slony or Skytools Londiste, instead. I have used both Slony and Londiste for trigger-based replication and have found both excellent. My preference is for Londiste, as it is much simpler to set up and manage (and if you have a simple use case, stick to the older 2. branch).
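
To illustrate what a trigger/queue-based approach means in general, here is a hand-rolled sketch. It is not how Slony or Londiste work internally; the names sync_queue, enqueue_for_sync and tbl_enqueue and the batch size are invented for the example. It assumes a table tbl with an id column and PostgreSQL 9.5 or later (for ON CONFLICT and SKIP LOCKED).

```sql
-- Queue of row ids that still have to be copied to the other database.
CREATE TABLE sync_queue (
    tbl_id bigint PRIMARY KEY
);

-- Trigger function: remember every inserted or changed row once.
CREATE FUNCTION enqueue_for_sync() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    INSERT INTO sync_queue (tbl_id) VALUES (NEW.id)
    ON CONFLICT (tbl_id) DO NOTHING;
    RETURN NEW;
END;
$$;

CREATE TRIGGER tbl_enqueue
AFTER INSERT OR UPDATE ON tbl
FOR EACH ROW
EXECUTE PROCEDURE enqueue_for_sync();

-- Worker: claim a batch. SKIP LOCKED lets several workers run
-- concurrently without blocking on each other's rows.
BEGIN;
DELETE FROM sync_queue
WHERE  tbl_id IN (
    SELECT tbl_id
    FROM   sync_queue
    ORDER  BY tbl_id
    LIMIT  100                                 -- assumed batch size
    FOR UPDATE SKIP LOCKED
)
RETURNING tbl_id;
-- ... copy the returned rows to the other database here ...
COMMIT;
```

If the copy step fails, rolling back the transaction puts the claimed ids back in the queue, so nothing is lost between attempts.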



Source: https://stackoverflow.com/questions/12025094/performance-tuning-create-index-for-boolean-column
