I'm using Pandas' to_sql function to write to MySQL, which is timing out due to large frame size (1M rows, 20 columns).
http://pandas.pydata.org/panda
Update: this functionality has been merged into pandas master and will be released in 0.15 (probably end of September), thanks to @artemyk! See https://github.com/pydata/pandas/pull/8062
So starting from 0.15, you can specify the chunksize argument and, e.g., simply do:
df.to_sql('table', engine, chunksize=20000)
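For completeness, here is a minimal end-to-end sketch; the connection URL, driver, and table name are assumptions, not from the original post:

import numpy as np
import pandas as pd
from sqlalchemy import create_engine

# Hypothetical MySQL URL; substitute your own driver, credentials and database
engine = create_engine('mysql+pymysql://user:password@localhost/mydb')

# Stand-in for the 1M x 20 frame from the question
df = pd.DataFrame(np.random.randn(1000000, 20),
                  columns=['c%d' % i for i in range(20)])

# chunksize splits the write into batches of 20,000 rows,
# so no single INSERT has to carry the whole frame
df.to_sql('table', engine, chunksize=20000, if_exists='replace', index=False)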
There is a beautiful, idiomatic chunks function provided in an answer to this question.
In your case, you can use this function like this:
def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    # On Python 3, use range instead of xrange
    for i in xrange(0, len(l), n):
        yield l.iloc[i:i + n]
def write_to_db(engine, frame, table_name, chunk_size):
    for idx, chunk in enumerate(chunks(frame, chunk_size)):
        # Replace the table on the first chunk, then append the rest
        if idx == 0:
            if_exists_param = 'replace'
        else:
            if_exists_param = 'append'
        chunk.to_sql(con=engine, name=table_name, if_exists=if_exists_param)
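A minimal usage sketch, assuming an existing SQLAlchemy engine and a DataFrame df (the table name here is illustrative):

# Replaces 'my_table' on the first chunk, then appends 20,000 rows at a time
write_to_db(engine, df, 'my_table', chunk_size=20000)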
The only drawback is that it doesn't support slicing the second index in the iloc function.