Convert list of pyodbc.rows to pandas Dataframe takes very long time

Posted by 孤者浪人 on 2020-01-14 19:49:49

Question


Is there a faster way to convert a list of pyodbc.Row objects to a pandas DataFrame? It takes about 30-40 minutes to convert a list of 10 million+ pyodbc.Row objects to a pandas DataFrame.

import pyodbc
import pandas

# Placeholders: substitute your own connection details.
server = '<server_ip>'
database = '<db_name>'
username = '<db_user>'
password = '<password>'
port='1443'

conn = pyodbc.connect('DRIVER={SQL Server};SERVER='+server+';PORT='+port+';DATABASE='+database+';UID='+username+';PWD='+ password)
cursor = conn.cursor()  # the query below needs a cursor from the connection

# Takes up to 12 minutes
rows = cursor.execute("select top 10000000 * from [LSLTGT].[MBR_DIM] ").fetchall() 

# Read the cursor data into a pandas DataFrame... takes forever!
df = pandas.DataFrame([tuple(t) for t in rows]) 

Answer 1:


You might get some improvement by using a generator expression rather than a list comprehension:

df = pandas.DataFrame((tuple(t) for t in rows)) 
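
A related variation, offered only as a sketch: instead of calling fetchall() and converting everything at once, the rows can be pulled in batches with the DB-API fetchmany() method and each batch converted separately. This keeps only one batch of row objects alive at a time; whether it is actually faster for a given table is untested here. The helper name cursor_to_dataframe and the batch_size default are hypothetical, and sqlite3 stands in for pyodbc so that the example runs without a SQL Server instance.

```python
import sqlite3  # stand-in for pyodbc so the sketch runs without a server

import pandas as pd


def cursor_to_dataframe(cursor, batch_size=100_000):
    """Build a DataFrame from an already-executed DB-API cursor, batch by batch.

    Hypothetical helper, not part of pyodbc or pandas.
    """
    # Column names come from the standard DB-API cursor.description attribute.
    columns = [col[0] for col in cursor.description]
    frames = []
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        # tuple(row) mirrors the conversion used in the question for pyodbc rows.
        records = [tuple(row) for row in batch]
        frames.append(pd.DataFrame.from_records(records, columns=columns))
    if not frames:
        # No rows returned: keep the column names on an empty frame.
        return pd.DataFrame(columns=columns)
    return pd.concat(frames, ignore_index=True)


# Demo against an in-memory SQLite table with made-up data.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE mbr_dim (id INTEGER, name TEXT)")
demo_conn.executemany(
    "INSERT INTO mbr_dim VALUES (?, ?)",
    [(i, f"member_{i}") for i in range(5)],
)
demo_cursor = demo_conn.execute("SELECT id, name FROM mbr_dim ORDER BY id")
df = cursor_to_dataframe(demo_cursor, batch_size=2)
print(df.shape)
```

With pyodbc, the same helper would presumably be called as cursor_to_dataframe(cursor.execute("select ...")), since pyodbc cursors also expose description and fetchmany().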


Source: https://stackoverflow.com/questions/53486051/convert-list-of-pyodbc-rows-to-pandas-dataframe-takes-very-long-time
