Psycopg2, Postgresql, Python: Fastest way to bulk-insert

轻奢々 2020-11-30 22:48

I'm looking for the most efficient way to bulk-insert some millions of tuples into a database. I'm using Python, PostgreSQL and psycopg2.

I have created a long list of tuples that should be inserted into the database.

8 Answers
  •  眼角桃花
    2020-11-30 22:55

    A very related question: Bulk insert with SQLAlchemy ORM


    All roads lead to Rome, but some of them cross mountains and require ferries. If you want to get there quickly, just take the motorway.


    In this case the motorway is to use the execute_batch() feature of psycopg2. The documentation says it best:

    The current implementation of executemany() is (using an extremely charitable understatement) not particularly performing. These functions can be used to speed up the repeated execution of a statement against a set of parameters. By reducing the number of server roundtrips the performance can be orders of magnitude better than using executemany().

    In my own tests, execute_batch() is approximately twice as fast as executemany(), and it gives you the option to configure page_size for further tweaking (if you want to squeeze the last 2-3% of performance out of the driver).
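
    As a rough sketch of the pattern (the table name measurements, its columns, and the connection string are illustrative assumptions, not from the question):

        import psycopg2
        from psycopg2.extras import execute_batch

        # A long list of parameter tuples, standing in for the question's data.
        rows = [(i, f"sensor-{i}", i * 0.5) for i in range(1_000_000)]

        conn = psycopg2.connect("dbname=test user=postgres")
        with conn, conn.cursor() as cur:  # commits on success, rolls back on error
            execute_batch(
                cur,
                "INSERT INTO measurements (id, name, value) VALUES (%s, %s, %s)",
                rows,
                page_size=1000,  # rows grouped per server roundtrip; tune as needed
            )
        conn.close()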

    The same feature can easily be enabled if you are using SQLAlchemy: pass use_batch_mode=True when you instantiate the engine with create_engine().
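
    For example (a sketch against the SQLAlchemy 1.2-era API; the URL is a placeholder, and note that SQLAlchemy 1.3 deprecated this flag in favor of executemany_mode="batch"):

        from sqlalchemy import create_engine

        # use_batch_mode=True makes the psycopg2 dialect route executemany()
        # calls through psycopg2.extras.execute_batch().
        engine = create_engine(
            "postgresql+psycopg2://user:password@localhost/test",
            use_batch_mode=True,
        )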
