Sort dataframe by length of a string column [duplicate]

Submitted by 一曲冷凌霜 on 2020-01-10 05:26:05

Question


Using Python. I have a dataframe with three columns:

Author | Title | Review

I want to sort by the length of the string in the Review column.

If I do

df.sort_values('Review', ascending=False)

it sorts alphabetically, starting with 'z'.

How do I get it to sort by the length of the string in the Review column?


Answer 1:


I think you need to compute the string lengths with str.len, assign them to the index, sort with sort_index, and finally call reset_index:

import pandas as pd

# Keys are listed in the order shown in the printed output below.
df = pd.DataFrame({'Author':list('abcdef'),
                   'Review':['aa', 'aasdd', 'dwd','dswee dass', 'a', 'sds'],
                   'Title ':list('abcdef')})

print (df)
  Author      Review Title 
0      a          aa      a
1      b       aasdd      b
2      c         dwd      c
3      d  dswee dass      d
4      e           a      e
5      f         sds      f

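# Use each review's length as the index, sort on it in descending order,
# then drop the lengths and restore a default 0..n-1 index.
df.index = df['Review'].str.len()
df = df.sort_index(ascending=False).reset_index(drop=True)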
print (df)
  Author      Review Title 
0      d  dswee dass      d
1      b       aasdd      b
2      c         dwd      c
3      f         sds      f
4      a          aa      a
5      e           a      e
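
On pandas 1.1 or later, the same ordering can probably be obtained without touching the index by passing a key function to sort_values. This is an alternative sketch, not part of the answer above. It assumes the sample frame from above (with the column named Title, without the trailing space), and kind='stable' is an added assumption so that reviews of equal length keep their original relative order.

```python
import pandas as pd

df = pd.DataFrame({'Author': list('abcdef'),
                   'Review': ['aa', 'aasdd', 'dwd', 'dswee dass', 'a', 'sds'],
                   'Title': list('abcdef')})

# key receives the 'Review' column as a Series and must return a Series of
# the same length; rows are ordered by those values instead of the strings.
# Negating the lengths puts the longest review first, and the stable sort
# keeps equal-length reviews in their original order.
result = (df.sort_values('Review', key=lambda s: -s.str.len(), kind='stable')
            .reset_index(drop=True))
print(result)
```

The original df is left unchanged; only result holds the reordered rows.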



Answer 2:


Option 1
Using Series.argsort and DataFrame.reindex (reindex looks rows up by label, so this relies on df having the default 0..n-1 index)

df

   Review
0     abc
1  foo123
2       b

df = df.reindex((-df.Review.str.len()).argsort()).reset_index(drop=True)
df

   Review
0  foo123
1     abc
2       b

Option 2
Similar solution using np.argsort

import numpy as np
df = df.reindex(np.argsort(-df.Review.str.len())).reset_index(drop=True)
df

   Review
0  foo123
1     abc
2       b
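
When several reviews have the same length, the default sort algorithm does not promise to keep them in their original order. A minimal sketch of Option 2 with a stable sort follows; the extra row 'xyz' is a hypothetical addition to the sample data, included only to create a tie.

```python
import numpy as np
import pandas as pd

# 'abc' and 'xyz' have the same length; 'xyz' is made up for this example.
df = pd.DataFrame({'Review': ['abc', 'foo123', 'b', 'xyz']})

# Work on a plain NumPy array of negated lengths so argsort returns positions.
lengths = df['Review'].str.len().to_numpy()
order = np.argsort(-lengths, kind='stable')

# Using positions as labels is valid here because the index is the default 0..n-1.
result = df.reindex(order).reset_index(drop=True)
print(result)
```

With kind='stable', 'abc' stays ahead of 'xyz' because it came first in the input.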

Option 3
Using Series.argsort and DataFrame.iloc

df = df.iloc[(-df.Review.str.len()).argsort()].reset_index(drop=True)
df

   Review
0  foo123
1     abc
2       b
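
The three options differ once the index is no longer 0..n-1: argsort returns positions, iloc selects by position, and reindex selects by label. The sketch below illustrates this with a hypothetical frame whose index labels (10, 20, 30) were chosen only for the demonstration.

```python
import pandas as pd

# Hypothetical frame whose index labels are not 0..n-1.
df = pd.DataFrame({'Review': ['abc', 'foo123', 'b']}, index=[10, 20, 30])

# Convert to a NumPy array so the result is a plain array of positions.
order = (-df['Review'].str.len()).to_numpy().argsort()

# iloc interprets the values as positions, so the rows are reordered.
by_position = df.iloc[order].reset_index(drop=True)

# reindex interprets the values as labels; 1, 0 and 2 are not in the index,
# so every row comes back as NaN.
by_label = df.reindex(order).reset_index(drop=True)

print(by_position)
print(by_label)
```

If the frame may carry a non-default index, the iloc form (Option 3) is the safer choice of the three.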


Source: https://stackoverflow.com/questions/46177362/sort-dataframe-by-length-of-a-string-column
