可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
I have a dataframe
with over 200 columns (don't ask why). The issue is as they were generated the order is
['Q1.3','Q6.1','Q1.2','Q1.1',......]
I need to re-order the columns as follows:
['Q1.1','Q1.2','Q1.3',.....'Q6.1',......]
Is there some way for me to do this within python?
回答1:
df.reindex_axis(sorted(df.columns), axis=1)
This assumes that sorting the column names will give the order you want. If your column names won't sort lexicographically (e.g., if you want column Q10.3 to appear after Q9.1), you'll need to sort differently, but that has nothing to do with pandas.
回答2:
You can also do more succinctly:
df.sort_index(axis=1)
Edit:
Make sure you hold the value
df = df.sort_index(axis=1)
Or do it in place
df.sort_index(axis=1, inplace=True)
回答3:
You can just do:
df[sorted(df.columns)]
回答4:
Tweet's answer can be passed to BrenBarn's answer above with
data.reindex_axis(sorted(data.columns, key=lambda x: float(x[1:])), axis=1)
So for your example, say:
vals = randint(low=16, high=80, size=25).reshape(5,5) cols = ['Q1.3', 'Q6.1', 'Q1.2', 'Q9.1', 'Q10.2'] data = DataFrame(vals, columns = cols)
You get:
data Q1.3 Q6.1 Q1.2 Q9.1 Q10.2 0 73 29 63 51 72 1 61 29 32 68 57 2 36 49 76 18 37 3 63 61 51 30 31 4 36 66 71 24 77
Then do:
data.reindex_axis(sorted(data.columns, key=lambda x: float(x[1:])), axis=1)
resulting in:
data Q1.2 Q1.3 Q6.1 Q9.1 Q10.2 0 2 0 1 3 4 1 7 5 6 8 9 2 2 0 1 3 4 3 2 0 1 3 4 4 2 0 1 3 4
回答5:
Don't forget to add "inplace=True" to Wes' answer or set the result to a new DataFrame.
df.sort_index(axis=1, inplace=True)
回答6:
If you need an arbitrary sequence instead of sorted sequence, you could do:
sequence = ['Q1.1','Q1.2','Q1.3',.....'Q6.1',......] your_dataframe = your_dataframe.reindex(columns=sequence)
I tested this in 2.7.10 and it worked for me.
回答7:
For several columns, You can put columns order what you want:
#['A', 'B', 'C']
This example shows sorting and slicing columns:
d = {'col1':[1, 2, 3], 'col2':[4, 5, 6], 'col3':[7, 8, 9], 'col4':[17, 18, 19]} df = pandas.DataFrame(d)
You get:
col1 col2 col3 col4 1 4 7 17 2 5 8 18 3 6 9 19
Then do:
df = df[['col3', 'col2', 'col1']]
Resulting in:
col3 col2 col1 7 4 1 8 5 2 9 6 3
回答8:
The quickest method is:
df.sort_index(axis=1)
Be aware that this creates a new instance. Therefore you need to store the result in a new variable:
sortedDf=df.sort_index(axis=1)
回答9:
The sort
method and sorted
function allow you to provide a custom function to extract the key used for comparison:
>>> ls = ['Q1.3', 'Q6.1', 'Q1.2'] >>> sorted(ls, key=lambda x: float(x[1:])) ['Q1.2', 'Q1.3', 'Q6.1']
回答10:
One use-case is that you have named (some of) your columns with some prefix, and you want the columns sorted with those prefixes all together and in some particular order (not alphabetical).
For example, you might start all of your features with Ft_
, labels with Lbl_
, etc, and you want all unprefixed columns first, then all features, then the label. You can do this with the following function (I will note a possible efficiency problem using sum
to reduce lists, but this isn't an issue unless you have a LOT of columns, which I do not):
def sortedcols(df, groups = ['Ft_', 'Lbl_'] ): return df[ sum([list(filter(re.compile(r).search, list(df.columns).copy())) for r in (lambda l: ['^(?!(%s))' % '|'.join(l)] + ['^%s' % i for i in l ] )(groups) ], []) ]
回答11:
print df.sort_index(by='Frequency',ascending=False)
where by is the name of the column,if you want to sort the dataset based on column