Fastest way to sort each row in a pandas dataframe

温柔的废话 2020-12-01 21:06

I need to find the quickest way to sort each row in a dataframe with millions of rows and around a hundred columns.

So something like this:

A   B   C   D
3   4   8   1
9   2   7   2

5 Answers
  •  臣服心动
    2020-12-01 21:41

    I think I would do this in numpy:

    In [11]: a = df.values
    
    In [12]: a.sort(axis=1)  # sorts each row in place; ndarray.sort has no ascending argument
    
    In [13]: a = a[:, ::-1]  # so reverse each row to get descending order
    
    In [14]: a
    Out[14]:
    array([[8, 4, 3, 1],
           [9, 7, 2, 2]])
    
    In [15]: pd.DataFrame(a, df.index, df.columns)
    Out[15]:
       A  B  C  D
    0  8  4  3  1
    1  9  7  2  2
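
The session above can be condensed into a small helper. This is a minimal sketch, not a benchmarked implementation: the frame `df` is an assumption reconstructed from the outputs shown in this answer, and `sort_rows_desc` is a hypothetical name. It uses `np.sort`, which returns a sorted copy, instead of the in-place `a.sort`, because `df.values` may share memory with the frame and an in-place sort could then alter `df` itself.

```python
import numpy as np
import pandas as pd

# Assumed input, reconstructed from the outputs shown in the answer.
df = pd.DataFrame({"A": [3, 9], "B": [4, 2], "C": [8, 7], "D": [1, 2]})


def sort_rows_desc(frame):
    """Return a new frame with the values of each row sorted in descending order."""
    # np.sort returns a sorted copy, so the data behind `frame` is left untouched.
    ascending = np.sort(frame.to_numpy(), axis=1)
    # np.sort has no descending option, so reverse the column order of the result.
    descending = ascending[:, ::-1]
    # Reuse the original index and column labels, as in the session above.
    return pd.DataFrame(descending, index=frame.index, columns=frame.columns)


result = sort_rows_desc(df)
print(result)
```

Note that the column labels no longer describe where a value came from: after the sort, column `A` simply holds the largest value of each row.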
    

    I had thought this might work, but it sorts the column labels rather than the values within each row:

    In [21]: df.sort(axis=1, ascending=False)
    Out[21]:
       D  C  B  A
    0  1  8  4  3
    1  2  7  2  9
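
`DataFrame.sort` has been removed from current pandas releases. Assuming a recent version, `sort_index(axis=1)` appears to be the closest equivalent of the call above, and it shows the same behavior: the column labels are reordered, while the values within each row are not sorted. The frame `df` below is an assumption reconstructed from the outputs in this answer.

```python
import pandas as pd

# Assumed input, reconstructed from the outputs shown in the answer.
df = pd.DataFrame({"A": [3, 9], "B": [4, 2], "C": [8, 7], "D": [1, 2]})

# Sort by column label in descending order: D, C, B, A.
# Each column keeps its own values, so the rows are NOT sorted by value.
by_label = df.sort_index(axis=1, ascending=False)
print(by_label)
```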
    

    And passing the columns as the sort keys makes pandas raise an error:

    In [22]: df.sort(df.columns, axis=1, ascending=False)
    

    ValueError: When sorting by column, axis must be 0 (rows)
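
For comparison, here is a sketch of what sorting along `axis=1` does in recent pandas versions, assuming `sort_values` accepts a row label as `by` when `axis=1`. It reorders whole columns according to the values of one row, so only that row ends up sorted, which is why the NumPy route is used for sorting every row independently. The frame `df` is again an assumption reconstructed from the outputs in this answer.

```python
import pandas as pd

# Assumed input, reconstructed from the outputs shown in the answer.
df = pd.DataFrame({"A": [3, 9], "B": [4, 2], "C": [8, 7], "D": [1, 2]})

# Reorder the columns so that the row labelled 0 is in descending order.
# Every other row follows the same column order and is generally left unsorted.
by_first_row = df.sort_values(by=0, axis=1, ascending=False)
print(by_first_row)
```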
