python postgres can I fetchall() 1 million rows?

后端未结

关注

 5  946

北荒 2020-12-14 08:42

I am using psycopg2 module in python to read from postgres database, I need to some operation on all rows in a column, that has more than 1 million rows.

5条回答

星月不相逢 (楼主)

2020-12-14 09:08
The solution Burhan pointed out reduces the memory usage for large datasets by only fetching single rows:

row = cursor.fetchone()

However, I noticed a significant slowdown in fetching rows one-by-one. I access an external database over an internet connection, that might be a reason for it.

Having a server side cursor and fetching bunches of rows proved to be the most performant solution. You can change the sql statements (as in alecxe answers) but there is also pure python approach using the feature provided by psycopg2:
```
cursor = conn.cursor('name_of_the_new_server_side_cursor')
cursor.execute(""" SELECT * FROM table LIMIT 1000000 """)

while True:
    rows = cursor.fetchmany(5000)
    if not rows:
        break

    for row in rows:
        # do something with row
        pass
```
you find more about server side cursors in the psycopg2 wiki
0 讨论(0)

查看其它5个回答
发布评论:

提交评论
- 加载中...