Show top n rows for every column in Pandas data frame

本秂侑毒 提交于 2021-02-08 09:04:37

问题


I have the following sample CSV.

,cid1,cid2,cid3
rid1,0.1,0.4,0.3
rid2,1.0,0.1,0.5
rid3,0.2,0.5,0.1
rid4,0.3,0.4,0.8
rid5,0.2,0.3,0.7
rid6,0.9,0.2,0.1
rid7,0.4,0.8,0.9
rid8,0.6,0.5,0.7
rid9,0.3,0.9,0.4

I want to show n rows with the highest value for every column in the file. Ideally, I would like to get the following output (for n = 3).

cid1  rid2  1.0
cid1  rid6  0.9
cid1  rid8  0.6
                    # Blank lines separating columns are optional.
cid2  rid9  0.9
cid2  rid7  0.8
cid2  rid8  0.5

cid3  rid7  0.9
cid3  rid4  0.8
cid3  rid8  0.7

This is what I have so far.

import pandas as pd

df = pd.read_csv("input.csv", index_col=0)
n = 3

for col in df:
    print(col)
    print(df.sort_values(col, ascending=False)[:n][col].to_string())

which gives me almost what I want.

cid1
rid2    1.0
rid6    0.9
rid8    0.6
cid2
rid9    0.9
rid7    0.8
rid3    0.5
cid3
rid7    0.9
rid4    0.8
rid5    0.7

Also, there is a small issue with cid2, for which rid8 and rid3 have the same value of 0.5. In that case, it would be perfect to show them both in separate lines, but I'm aware it will probably complicate the code much further so I would be happy with the basic solution.


回答1:


Are you looking for GroupBy.nlargest?

df.unstack().groupby(level=0, group_keys=False).nlargest(3)
 
cid1  rid2    1.0
      rid6    0.9
      rid8    0.6
cid2  rid9    0.9
      rid7    0.8
      rid3    0.5
cid3  rid7    0.9
      rid4    0.8
      rid5    0.7
dtype: float64

Trying to mirror OP's "expected" result:

print(df.unstack()
        .groupby(level=0, group_keys=False)
        .nlargest(3)
        .reset_index()
        .to_string(header=None, index=False))

cid1 rid2 1.0
cid1 rid6 0.9
cid1 rid8 0.6
cid2 rid9 0.9
cid2 rid7 0.8
cid2 rid3 0.5
cid3 rid7 0.9
cid3 rid4 0.8
cid3 rid5 0.7


来源:https://stackoverflow.com/questions/65438075/show-top-n-rows-for-every-column-in-pandas-data-frame

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!