Question
I am trying to figure out how to sort the Series generated as a result of a groupby aggregation in a smart way.
I generate an aggregation of my DataFrame like this:
means = df.testColumn.groupby(df.testCategory).mean()
This results in a Series. I now try to sort this by value, but get an error:
means.sort()
...
-> Exception: This Series is a view of some other array, to sort in-place you must create a copy
I then try creating a copy:
meansCopy = Series(means)
meansCopy.sort()
-> Exception: This Series is a view of some other array, to sort in-place you must create a copy
How can I get this sort working?
Answer 1:
Use sort_values, i.e. means = means.sort_values(). [Pandas v0.17+]
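A minimal sketch of that call applied to the question's groupby. The column names come from the question, but the data values are made up for illustration:

```python
import pandas as pd

# Hypothetical data standing in for the question's df.
df = pd.DataFrame({
    "testCategory": ["x", "x", "y", "y", "z"],
    "testColumn": [4.0, 6.0, 1.0, 3.0, 9.0],
})

means = df.testColumn.groupby(df.testCategory).mean()

# sort_values() returns a new, sorted Series rather than sorting in place,
# so the result has to be assigned back to the name.
means = means.sort_values()
print(means)
```

The index labels travel with their values, so the category names stay attached to the right means after sorting.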
(Very old answer, pre-v0.17 / 2015)
pandas used to provide an order() method: means = means.order().
Answer 2:
1) Use Series.sort_values()
# Setup.
np.random.seed(0)
df = pd.DataFrame({'A': list('aaabbbbccddd'), 'B': np.random.choice(5, 12)})
ser = df.groupby('A')['B'].mean()
ser
A
a 2.333333
b 2.500000
c 3.000000
d 1.333333
Name: B, dtype: float64
ser.sort_values()
A
d 1.333333
a 2.333333
b 2.500000
c 3.000000
Name: B, dtype: float64
1b) To sort in descending order: sort_values(ascending=False)
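As a short sketch of the descending case, the series below is rebuilt by hand with the same labels and means as the setup above, so it does not depend on the random seed:

```python
import pandas as pd

# Same labels and values as `ser` in the setup, written out explicitly.
ser = pd.Series([7 / 3, 2.5, 3.0, 4 / 3], index=list("abcd"), name="B")

# ascending=False puts the largest value first; `ser` itself is not modified.
desc = ser.sort_values(ascending=False)
print(desc)
```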
2) You can also call Series.argsort() and index the series with the resulting positions, using __getitem__ or Series.iloc:
ser[ser.argsort()]
A
d 1.333333
a 2.333333
b 2.500000
c 3.000000
Name: B, dtype: float64
ser.iloc[ser.argsort()]
A
d 1.333333
a 2.333333
b 2.500000
c 3.000000
Name: B, dtype: float64
3) Similarly, numpy.argsort() (should be marginally faster):
ser[np.argsort(ser)]
# ser[np.argsort(ser.values)]
A
d 1.333333
a 2.333333
b 2.500000
c 3.000000
Name: B, dtype: float64
3b) To sort in descending order, negate the argument:
ser[(-ser).argsort()]
A
c 3.000000
b 2.500000
a 2.333333
d 1.333333
Name: B, dtype: float64
The same negation trick works with the other argsort-based variants above (Series.iloc and np.argsort).
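For instance, a descending sort with np.argsort and Series.iloc might look like the sketch below. The series is rebuilt by hand, and the names order, desc, and desc_reversed are only illustrative:

```python
import numpy as np
import pandas as pd

ser = pd.Series([7 / 3, 2.5, 3.0, 4 / 3], index=list("abcd"), name="B")

# Positions that sort the negated values ascending,
# which is the same as sorting the original values descending.
order = np.argsort(-ser.to_numpy())
desc = ser.iloc[order]

# Alternative: take the ascending order and reverse it.
desc_reversed = ser.iloc[np.argsort(ser.to_numpy())[::-1]]

print(desc)
```

Both forms agree here because the values are distinct. With ties, the two approaches can order the tied rows differently.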
4) If you only care about the values (and not the index), use np.sort:
np.sort(ser)
# array([1.33333333, 2.33333333, 2.5 , 3. ])
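A small sketch of the values-only route, again with the series written out by hand. np.sort returns a plain NumPy array, so the index labels are dropped:

```python
import numpy as np
import pandas as pd

ser = pd.Series([7 / 3, 2.5, 3.0, 4 / 3], index=list("abcd"), name="B")

vals = np.sort(ser.to_numpy())   # ascending ndarray, labels are gone
vals_desc = vals[::-1]           # reversed view gives descending order

print(type(vals).__name__)
print(vals_desc)
```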
5) As a side note, in-place sorting (calling .sort() on ser.values) is possible but not recommended:
ser.values.sort()
This sorts the series' values in place but does not modify the index, so each label can end up attached to the wrong value and the result is incorrect.
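To see why the labels matter, the sketch below reproduces the effect on a copy instead of mutating the series' underlying array. It pairs the sorted values with the original index and compares that against sort_values(). The names mismatched and correct are illustrative:

```python
import numpy as np
import pandas as pd

ser = pd.Series([7 / 3, 2.5, 3.0, 4 / 3], index=list("abcd"), name="B")

# Sorted values glued onto the ORIGINAL index: this is what sorting only
# the values amounts to. np.sort works on a copy, so `ser` is untouched.
mismatched = pd.Series(np.sort(ser.to_numpy()), index=ser.index, name=ser.name)

# sort_values() moves each label together with its value.
correct = ser.sort_values()

print(mismatched["a"], ser["d"])   # label 'a' now carries the value of 'd'
print(correct["a"], ser["a"])      # label 'a' still carries its own value
```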
[Old pre-v0.17 / 2015 methods: order, sort, sortUp, sortDown are deprecated]
Source: https://stackoverflow.com/questions/12133075/sorting-a-pandas-series