I have a large dataframe (several million rows).
I want to be able to do a groupby operation on it, but just grouping by arbitrary consecutive (preferably equal-size
A sign of a good environment is many choices, so I'll add this one from Anaconda Blaze (really using Odo):
import blaze as bz
import pandas as pd
df = pd.DataFrame({'col1':[1,2,3,4,5], 'col2':[2,4,6,8,10]})
for chunk in bz.odo(df, target=bz.chunks(pd.DataFrame), chunksize=2):
    # Do stuff with chunked dataframe
    pass
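If you'd rather not pull in Blaze, the same consecutive, roughly equal-size grouping can be sketched with plain pandas and NumPy. This is only a minimal sketch, not the approach from the answer above: the names `chunksize`, `chunk_ids`, `chunk_sums` and `chunks` are made up for illustration, and it assumes the rows should be grouped by position rather than by index value.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': [2, 4, 6, 8, 10]})
chunksize = 2  # hypothetical chunk size, chosen for illustration

# Integer-divide each row's position by chunksize so that consecutive
# rows share a label: 0, 0, 1, 1, 2 for five rows and chunksize=2.
chunk_ids = np.arange(len(df)) // chunksize

# Use the labels as the groupby key to aggregate per chunk.
chunk_sums = df.groupby(chunk_ids).sum()

# Or iterate over the chunks themselves; each one is a DataFrame.
chunks = [chunk for _, chunk in df.groupby(chunk_ids)]
```

The last chunk is simply shorter when the row count isn't a multiple of `chunksize`, so nothing is dropped.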