Sorting a pandas series

Question

I am trying to figure out how to sort the Series generated as a result of a groupby aggregation in a smart way.

I generate an aggregation of my DataFrame like this:

means = df.testColumn.groupby(df.testCategory).mean()

This results in a Series. I now try to sort this by value, but get an error:

means.sort()
...
-> Exception: This Series is a view of some other array, to sort in-place you must create a copy

I then try creating a copy:

meansCopy = Series(means)
meansCopy.sort()
-> Exception: This Series is a view of some other array, to sort in-place you must create a copy

How can I get this sort working?

Answer 1:

Use sort_values(), i.e. means = means.sort_values(). [pandas v0.17+]

(Very old answer, pre-v0.17 / 2015)

pandas used to provide an order() method for this: means = means.order(). It was deprecated in v0.17 in favor of sort_values() and has since been removed.
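As a minimal sketch of this fix using the question's own names (the sample data is made up for illustration), note that sort_values() returns a new sorted Series rather than sorting in place, which is why the result is assigned back:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "testCategory": ["x", "x", "y", "y", "z"],
    "testColumn": [3.0, 5.0, 1.0, 2.0, 9.0],
})

means = df.testColumn.groupby(df.testCategory).mean()

# sort_values() returns a new Series, so reassign to keep the sorted result.
means = means.sort_values()
print(means)
```

The groupby result is indexed by category, so each label stays attached to its mean after sorting.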

Answer 2:

1) Use Series.sort_values()

# Setup.
import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame({'A': list('aaabbbbccddd'), 'B': np.random.choice(5, 12)})
ser = df.groupby('A')['B'].mean()
ser

A
a    2.333333
b    2.500000
c    3.000000
d    1.333333
Name: B, dtype: float64

ser.sort_values()

A
d    1.333333
a    2.333333
b    2.500000
c    3.000000
Name: B, dtype: float64

1b) To sort in descending order: `sort_values(ascending=False)`
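A short sketch of the descending case, assuming a Series whose values are typed in by hand to mirror the group means shown above (so it does not depend on the random setup):

```python
import pandas as pd

# Hand-entered values mirroring the group means from the setup above.
ser = pd.Series([2.333333, 2.5, 3.0, 1.333333], index=list("abcd"), name="B")

# ascending=False puts the largest value first; ser itself is left unchanged.
desc = ser.sort_values(ascending=False)
print(desc)
```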

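2) You can also call Series.argsort() and index the Series with the resulting positions, either with plain square brackets (`__getitem__`) or with Series.iloc. Recent pandas versions deprecate treating integer keys in `ser[...]` as positions when the index is not integer-based, so Series.iloc is the safer choice: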

ser[ser.argsort()]

A
d    1.333333
a    2.333333
b    2.500000
c    3.000000
Name: B, dtype: float64

ser.iloc[ser.argsort()]

A
d    1.333333
a    2.333333
b    2.500000
c    3.000000
Name: B, dtype: float64
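To make the argsort approach concrete, here is a small sketch with hand-entered values that mirror the example output. Converting the positions to a NumPy array with to_numpy() is a choice made here for explicitness, not something the answer requires:

```python
import pandas as pd

# Hand-entered values mirroring the group means from the setup above.
ser = pd.Series([2.333333, 2.5, 3.0, 1.333333], index=list("abcd"), name="B")

# argsort() gives the integer positions that would sort the values.
order = ser.argsort().to_numpy()

# .iloc is explicitly positional, so each label travels with its value.
by_position = ser.iloc[order]
print(order)
print(by_position)
```

Here `order` holds the positions 3, 0, 1, 2: the smallest value sits at position 3 (label d), followed by a, b, and c.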

3) Similarly, you can use numpy.argsort(), which should be marginally faster:

ser[np.argsort(ser)]
# ser[np.argsort(ser.values)]

A
d    1.333333
a    2.333333
b    2.500000
c    3.000000
Name: B, dtype: float64

3b) To sort in descending order, negate the argument:

ser[(-ser).argsort()]

A
c    3.000000
b    2.500000
a    2.333333
d    1.333333
Name: B, dtype: float64

The same negation works for the other argsort-based approaches, i.e. ser.iloc[...] and np.argsort().
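For instance, a sketch of the descending sort with np.argsort on the raw values, again using hand-entered numbers that mirror the example:

```python
import numpy as np
import pandas as pd

# Hand-entered values mirroring the group means from the setup above.
ser = pd.Series([2.333333, 2.5, 3.0, 1.333333], index=list("abcd"), name="B")

# Negating the values makes the largest value sort first.
order_desc = np.argsort(-ser.to_numpy())

# Positional indexing keeps each label paired with its value.
desc = ser.iloc[order_desc]
print(desc)
```

Negation only works for numeric data; for other dtypes, sort_values(ascending=False) is the more general option.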

4) If you only care about the values (and not the index), use np.sort:

np.sort(ser)
# array([1.33333333, 2.33333333, 2.5       , 3.        ])
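A brief sketch showing that the result is a plain NumPy array with no index, and one way to get descending order from it (reversing with `[::-1]` is an addition here, not part of the original answer):

```python
import numpy as np
import pandas as pd

# Hand-entered values mirroring the group means from the setup above.
ser = pd.Series([2.333333, 2.5, 3.0, 1.333333], index=list("abcd"), name="B")

sorted_vals = np.sort(ser.to_numpy())  # plain ndarray in ascending order
sorted_desc = sorted_vals[::-1]        # reversed view for descending order

print(type(sorted_vals).__name__)
print(sorted_vals)
print(sorted_desc)
```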

5) As a side note, in-place sorting (calling `.sort()` on `ser.values`) is possible but not recommended:

ser.values.sort() sorts the Series' values in place but leaves the index untouched, so the values end up paired with the wrong labels and the result is incorrect.
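The mismatch can be illustrated without mutating anything. The sketch below simulates the outcome by pairing sorted values with the original, unsorted index; this is a stand-in for the in-place call, chosen so the example does not depend on whether ser.values is writable in a given pandas version:

```python
import numpy as np
import pandas as pd

# Hand-entered values mirroring the group means from the setup above.
ser = pd.Series([2.333333, 2.5, 3.0, 1.333333], index=list("abcd"), name="B")

# Simulated result of sorting only the values: the values are reordered
# while the index stays in its original order.
mismatched = pd.Series(np.sort(ser.to_numpy()), index=ser.index, name=ser.name)

# sort_values() reorders values and labels together.
correct = ser.sort_values()

print(mismatched)  # label "a" now carries the value that belonged to "d"
print(correct)
```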

[Old pre-v0.17 / 2015 methods: order, sort, sortUp, and sortDown are deprecated.]

Source: https://stackoverflow.com/questions/12133075/sorting-a-pandas-series

Tags: python, pandas, sorting, series