How to use server side cursors with psycopg2

梦想的初衷 提交于 2019-12-04 01:17:10

Additionally to cur.fetchmany(n) you can use PostgreSQL cursors:

cur.execute("declare foo cursor for select * from generate_series(1,1000000)")
cur.execute("fetch forward 100 from foo")
rows = cur.fetchall()
# ...
cur.execute("fetch forward 100 from foo")
rows = cur.fetchall()
# and so on

Psycopg2 has a nice interface for working with server side cursors. This is a possible template to use:

with psycopg2.connect(database_connection_string) as conn:
    with conn.cursor(name='name_of_cursor') as cursor:

        cursor.itersize = 20000

        query = "SELECT * FROM ..."
        cursor.execute(query)

        for row in cursor:
            # process row 

The code above creates the connection and automatically places the query result into a server side cursor. The value itersize sets the number of rows that the client will pull down at a time from the server side cursor. The value you use should balance number of network calls versus memory usage on the client. For example, if your result count is three million, an itersize value of 2000 (the default value) will result in 1500 network calls. If the memory consumed by 2000 rows is light, increase that number.

When using for row in cursor you are of course working with one row at a time, but Psycopg2 will prefetch itersize rows at a time for you.

If you want to use fetchmany for some reason, you could do something like this:

while True:
    rows = cursor.fetchmany(100)
    if len(rows) > 0:
        for row in rows:
            # process row
    else:
        break

This usage of fetchmany will not trigger a network call to the server for more rows until the prefetched batch has been exhausted. (This is a convoluted example that provides nothing over the code above, but demonstrates how to use fetchmany should there be a need.)

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!