What is the fastest way to apply 150M updates to a PostgreSQL table

Anonymous (unverified), submitted 2019-12-03 10:10:24

Question:

We have a file of 150M lines, each of which updates a single table of a PostgreSQL database with commands such as:

UPDATE "events" SET "value_1" = XX, "value_2" = XX, "value_3" = XX, "value_4" = XX WHERE "events"."id" = SOME_ID; 

All ids are unique, so no update can ever apply to several events. Currently this update takes approximately a few days when we run it with \i update.sql in psql.

Is there any faster way to run it?

Answer 1:

  • Simplest: add set synchronous_commit=off before \i update.sql

  • Better:

    • Split the file to parts of like 100000 updates:
      split -l 100000 -a 6 --additional-suffix=.sql update.sql update-part
    • Run these updates in parallel, each file in a single transaction, for example with:
      /bin/ls update-part*.sql | xargs --max-procs=8 --replace psql --single-transaction --file={}
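For the "simplest" option, note that SET and \i must run in the same psql session, since synchronous_commit=off set this way only affects the current session. A minimal sketch of that session:

```sql
-- Commits no longer wait for the WAL flush to disk: much faster,
-- at the cost of a small window of potential loss if the server crashes
-- (the data stays consistent; only the most recent commits could be lost).
SET synchronous_commit = off;
\i update.sql
```

This is safe to combine with the split-and-parallelize approach as well, by prepending the SET to each part file.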
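The split step above can be demonstrated end to end on a toy file; the part files it produces are exactly what the xargs command feeds to psql. The 6-line input and -l 3 chunk size here are illustrative only (a real run would keep -l 100000), and the UPDATE text is a stand-in for the actual file contents:

```shell
# Build a tiny 6-line stand-in for update.sql (printf repeats the
# format once per argument, producing one UPDATE statement per id).
printf 'UPDATE "events" SET "value_1" = 0 WHERE "events"."id" = %d;\n' 1 2 3 4 5 6 > update.sql

# Split into 3-line chunks named update-partaaaaaa.sql, update-partaaaaab.sql, ...
# (-a 6 gives a 6-character suffix, enough for 150M/100k = 1500 parts in the real case).
split -l 3 -a 6 --additional-suffix=.sql update.sql update-part

# These part files are what gets piped to the parallel psql workers, e.g.:
#   /bin/ls update-part*.sql | xargs --max-procs=8 --replace psql --single-transaction --file={}
ls update-part*.sql
```

Running each part with --single-transaction means a failed part rolls back as a unit and can simply be re-run, which is what makes the parallel scheme restartable.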

