Fetching huge data from Oracle in Python


Question


I need to fetch a huge amount of data from Oracle (using cx_Oracle) in Python 2.6 and produce a CSV file.

The data size is about 400k rows x 200 columns x 100 characters each.

What is the best way to do that?

Now, using the following code...

ctemp = connection.cursor()
ctemp.arraysize = 256  # rows fetched per round trip; typically set before execute()
ctemp.execute(sql)
for row in ctemp:
    file.write(row[1])
    ...

... the script remains in the loop for hours and nothing is written to the file... (Is there a way to print a message for every record extracted?)

Note: I don't have any issue on the Oracle side, and running the same query in SQL Developer is super fast.

Thank you, gian


Answer 1:


You should use cur.fetchmany() instead. It will fetch a chunk of rows whose size is defined by arraysize (256 here).

Python code:

def chunks(cur):
    """Yield rows in batches of cur.arraysize (256 here)."""
    while True:
        rows = cur.fetchmany()
        if not rows:
            break
        yield rows

Then do your processing in a for loop:

for i, chunk in enumerate(chunks(cur)):
    for row in chunk:
        # process your rows here

That is exactly how I do it in my TableHunter for Oracle.
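
For completeness, here is a minimal end-to-end sketch of this approach feeding a CSV file; the connection string, the output file name, and the sql variable are placeholders, and the csv usage targets Python 2 (binary file mode, print statement):

import csv
import cx_Oracle

conn = cx_Oracle.connect("user/password@host/service")  # placeholder credentials
cur = conn.cursor()
cur.arraysize = 256            # rows fetched per network round trip
cur.execute(sql)               # sql as in the question

with open("output.csv", "wb") as f:   # binary mode for the Python 2 csv module
    writer = csv.writer(f)
    writer.writerow([col[0] for col in cur.description])  # header row
    total = 0
    for chunk in chunks(cur):
        writer.writerows(chunk)
        total += len(chunk)
        print "%d rows written" % total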




Answer 2:


  • Add print statements after each line to narrow down where the time is spent.
  • Add a counter to your loop that reports progress every N rows (see the sketch below).
  • Look into a module like progressbar for displaying a progress indicator.
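
A minimal sketch of the counter idea, reusing the ctemp cursor from the question (the interval of 1000 rows is arbitrary):

count = 0
for row in ctemp:
    file.write(row[1])
    count += 1
    if count % 1000 == 0:
        print "%d rows processed" % count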



Answer 3:


I think your code is asking the database for the data one row at a time, which might explain the slowness.

Try:

ctemp = connection.cursor()
ctemp.execute(sql)
results = ctemp.fetchall()  # loads the entire result set into memory
for row in results:
    file.write(row[1])
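
Note that fetchall() materializes every row in memory at once; at roughly 400k rows x 200 columns x 100 characters, that is on the order of 8 GB of character data, so the chunked fetchmany() approach from Answer 1 may be safer at this scale.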


Source: https://stackoverflow.com/questions/19243571/fetching-huge-data-from-oracle-in-python
