Pandas iterate over DataFrame row pairs

执笔经年 2020-12-10 21:35

How can I iterate over pairs of rows of a Pandas DataFrame?

For example:

content = [(1,2,[1,3]),(3,4,[2,4]),(5,6,[6,9]),(7,8,[9,10])]
df = pd.DataFra         


        
4 Answers
  • 2020-12-10 21:48

    To get the output you've shown, use:

    # Loop over positions rather than index labels, since iloc is positional
    for i in range(len(df) - 1):
        print('row 1:')
        print(df.iloc[i].squeeze())
        print('row 2:')
        print(df.iloc[i + 1].squeeze())
        print()
    
  • 2020-12-10 21:58

    Shift the DataFrame and concatenate it back onto the original with axis=1, so that each interval and the next interval end up in the same row:

    df_merged = pd.concat([df, df.shift(-1).add_prefix('next_')], axis=1)
    df_merged
    #Out:
       a  b interval  next_a  next_b next_interval
    0  1  2   [1, 3]     3.0     4.0        [2, 4]
    1  3  4   [2, 4]     5.0     6.0        [6, 9]
    2  5  6   [6, 9]     7.0     8.0       [9, 10]
    3  7  8  [9, 10]     NaN     NaN           NaN
    

    Define an intersects function that works with your list representation, then apply it to the merged DataFrame, skipping the last row, where next_interval is null:

    def intersects(left, right):
        return left[1] > right[0]
    
    df_merged[:-1].apply(lambda x: intersects(x.interval, x.next_interval), axis=1)
    #Out:
    0     True
    1    False
    2    False
    dtype: bool
    
  • 2020-12-10 22:01

    You could try iloc indexing.

    Example:

    for i in range(df.shape[0] - 1):
        idx1, idx2 = i, i + 1
        row1, row2 = df.iloc[idx1], df.iloc[idx2]
        print(row1)
        print(row2)
        print()
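
    If the rows are only read, the same pairing can be done without repeated iloc lookups. The following is a minimal sketch, not part of the original answer: row_pairs is a hypothetical helper name, and the column names a, b and interval are assumed from the output shown elsewhere in this thread, since the question's DataFrame construction is cut off.

```python
from itertools import islice

import pandas as pd

# Sample data from the question; the column names are an assumption.
content = [(1, 2, [1, 3]), (3, 4, [2, 4]), (5, 6, [6, 9]), (7, 8, [9, 10])]
df = pd.DataFrame(content, columns=["a", "b", "interval"])


def row_pairs(frame):
    """Yield (current_row, next_row) namedtuples for consecutive rows."""
    current_rows = frame.itertuples(index=True)
    # Second iterator starts one row later; zip stops when it runs out.
    next_rows = islice(frame.itertuples(index=True), 1, None)
    return zip(current_rows, next_rows)


for row1, row2 in row_pairs(df):
    # Fields are accessed by column name; the index label is in .Index
    print(row1.Index, row1.interval, "->", row2.Index, row2.interval)
```

    itertuples yields lightweight namedtuples instead of building a Series per row, so this tends to be cheaper than iloc or iterrows on larger frames, at the cost of requiring column names that are valid Python identifiers.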
    
  • 2020-12-10 22:06

    If you want to keep the for loop, combining zip with iterrows is one option:

    for (indx1, row1), (indx2, row2) in zip(df[:-1].iterrows(), df[1:].iterrows()):
        print("row1:\n", row1, sep="")
        print("row2:\n", row2, sep="")
        print("\n")
    

    To access the next row at the same time, start the second iterrows one row later with df[1:].iterrows(). This gives the output you want:

    row1:
    a    1
    b    2
    Name: 0, dtype: int64
    row2:
    a    3
    b    4
    Name: 1, dtype: int64
    
    
    row1:
    a    3
    b    4
    Name: 1, dtype: int64
    row2:
    a    5
    b    6
    Name: 2, dtype: int64
    
    
    row1:
    a    5
    b    6
    Name: 2, dtype: int64
    row2:
    a    7
    b    8
    Name: 3, dtype: int64
    

    But as @RafaelC said, a for loop might not be the best approach to your general problem.
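
    For the interval check discussed in this thread, one loop-free possibility is sketched below. It is an illustration under assumptions, not the original poster's method: the column names a, b and interval are taken from the output shown in another answer, the overlap rule (current end greater than next start) mirrors the intersects function given there, and the names bounds and overlaps_next are made up for this example.

```python
import pandas as pd

# Sample data from the question; the column names are an assumption.
content = [(1, 2, [1, 3]), (3, 4, [2, 4]), (5, 6, [6, 9]), (7, 8, [9, 10])]
df = pd.DataFrame(content, columns=["a", "b", "interval"])

# Split each [start, end] list into two numeric columns.
bounds = pd.DataFrame(
    df["interval"].tolist(), columns=["start", "end"], index=df.index
)

# Compare every row's end with the following row's start, by position.
end_of_current = bounds["end"].to_numpy()[:-1]
start_of_next = bounds["start"].to_numpy()[1:]
overlaps_next = pd.Series(end_of_current > start_of_next, index=df.index[:-1])

print(overlaps_next)
```

    Slicing the two arrays against each other avoids the NaN row that shift(-1) introduces, and the comparison runs over whole arrays instead of calling a Python function once per row.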
