Pandas iterating over multiple rows at once with overlap

旧时难觅i 2020-12-17 05:48

I have a pandas DataFrame that needs to be fed in chunks of n rows into downstream functions (print in the example). The chunks may have overlapping rows.

2 Answers
  •  暖寄归人
    2020-12-17 06:14

    Use DataFrame.groupby with a helper 1-D array of the same length as df, built with integer division. With this approach the chunks do not overlap:

    import numpy as np
    import pandas as pd

    d = {'A': list(range(5)), 'B': list(range(5))}
    df = pd.DataFrame(d)
    
    print (np.arange(len(df)) // 2)
    [0 0 1 1 2]
    
    for i, g in df.groupby(np.arange(len(df)) // 2):
        print (g)
    
       A  B
    0  0  0
    1  1  1
       A  B
    2  2  2
    3  3  3
       A  B
    4  4  4
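
    The same idea works for any chunk size. Below is a minimal sketch, assuming a helper named iter_chunks (a hypothetical name, not part of pandas) that wraps the groupby call so that the chunk size n becomes a parameter:

```python
import numpy as np
import pandas as pd


def iter_chunks(frame, n):
    """Yield consecutive, non-overlapping chunks of at most n rows.

    iter_chunks is a hypothetical helper name used for illustration only.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    # Positional labels 0, 0, ..., 1, 1, ... so grouping ignores the index values.
    labels = np.arange(len(frame)) // n
    for _, chunk in frame.groupby(labels):
        yield chunk


df = pd.DataFrame({'A': list(range(5)), 'B': list(range(5))})

# Five rows split into chunks of three leave a shorter final chunk.
sizes = [len(chunk) for chunk in iter_chunks(df, 3)]
print(sizes)
```

    Because the labels come from row positions rather than from the index, this should behave the same on a DataFrame whose index is not 0..n-1.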
    

    EDIT:

    For overlapping chunks, the chunker from this answer can be adapted so that consecutive chunks share a row:

    def chunker1(seq, size):
        # Slide a window of `size` rows one row at a time; stop at the last full window.
        return (seq.iloc[pos:pos + size] for pos in range(len(seq) - size + 1))
    
    for i in chunker1(df,2):
        print (i)
    
       A  B
    0  0  0
    1  1  1
       A  B
    1  1  1
    2  2  2
       A  B
    2  2  2
    3  3  3
       A  B
    3  3  3
    4  4  4
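
    If the amount of overlap should be configurable rather than fixed at size - 1 rows, one possible generalization is sketched below. The function name overlapping_chunks and its overlap parameter are assumptions made for this example and are not part of pandas; the window simply advances by size - overlap rows each time.

```python
import pandas as pd


def overlapping_chunks(frame, size, overlap=0):
    """Yield chunks of up to `size` rows; neighbours share `overlap` rows.

    overlapping_chunks and its parameters are hypothetical names for this sketch.
    """
    if size <= 0:
        raise ValueError("size must be a positive integer")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    # Stop once a window would contain no new rows; a non-empty frame
    # always yields at least one chunk.
    stop = max(len(frame) - overlap, min(len(frame), 1))
    for pos in range(0, stop, step):
        yield frame.iloc[pos:pos + size]


df = pd.DataFrame({'A': list(range(5)), 'B': list(range(5))})

# Chunks of three rows where each chunk repeats the last row of the previous one.
windows = [list(chunk.index) for chunk in overlapping_chunks(df, 3, overlap=1)]
print(windows)
```

    With size=2 and overlap=1 this sketch yields the same four chunks as chunker1 above, and with overlap=0 it yields the same chunks as the groupby approach, including a shorter final chunk.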
    
