Pandas iterating over multiple rows at once with overlap


旧时难觅i 2020-12-17 05:48

I have a pandas DataFrame that needs to be fed in chunks of n rows into downstream functions (print in the example). The chunks may have overlapping rows.

2 answers

暖寄归人 (original poster)

2020-12-17 06:14

Use DataFrame.groupby with a helper 1-D array of the same length as df, built by integer-dividing the row positions by the chunk size. The resulting chunks do not overlap:

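import numpy as np
import pandas as pd

d = {'A': list(range(5)), 'B': list(range(5))}
df = pd.DataFrame(d)

# Row positions 0..4 integer-divided by the chunk size (2) give the group labels
print(np.arange(len(df)) // 2)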
[0 0 1 1 2]

for i, g in df.groupby(np.arange(len(df)) // 2):
    print(g)

   A  B
0  0  0
1  1  1
   A  B
2  2  2
3  3  3
   A  B
4  4  4
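
The same grouping key works for any chunk size. A minimal sketch, assuming a chunk size n of 3 (the names n, keys and chunks are illustrative, not part of the original answer):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': list(range(5)), 'B': list(range(5))})
n = 3  # hypothetical chunk size

# One group label per row: positions 0..len(df)-1 integer-divided by n
keys = np.arange(len(df)) // n

# Collect the non-overlapping chunks; the last one may hold fewer than n rows
chunks = [g for _, g in df.groupby(keys)]
for g in chunks:
    print(g)
```

Because the key is built from row positions rather than index values, this should also work when the index is not a default RangeIndex.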

EDIT:

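For overlapping chunks, use a sliding window over the row positions instead:

def chunker1(seq, size):
    # Slide a window of `size` rows forward one row at a time;
    # only full windows are produced
    return (seq.iloc[pos:pos + size] for pos in range(len(seq) - size + 1))

for i in chunker1(df, 2):
    print(i)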

   A  B
0  0  0
1  1  1
   A  B
1  1  1
2  2  2
   A  B
2  2  2
3  3  3
   A  B
3  3  3
4  4  4
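
If the amount of overlap also needs to be configurable, the window can advance by more than one row at a time. A minimal sketch, assuming a hypothetical step parameter that is not in the original answer (the overlap between consecutive chunks is then size - step rows):

```python
import pandas as pd

def chunker(df, size, step=1):
    """Yield windows of `size` rows, starting a new window every `step` rows.

    `step` is a hypothetical parameter added for illustration.
    Trailing rows that do not fill a complete window are dropped.
    """
    if size < 1 or step < 1:
        raise ValueError("size and step must be positive integers")
    for pos in range(0, len(df) - size + 1, step):
        yield df.iloc[pos:pos + size]

df = pd.DataFrame({'A': list(range(5)), 'B': list(range(5))})

# Windows of 3 rows that advance by 2 rows, so neighbours share 1 row
windows = list(chunker(df, size=3, step=2))
for w in windows:
    print(w)
```

With step equal to size this reduces to non-overlapping chunks, and with step=1 it behaves like chunker1 above.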
