Can you use loc to select a range of columns plus a column outside of the range?

跟風遠走 submitted on 2021-02-05 04:49:59

Question


Suppose I want to select a range of columns from a dataframe: call them 'column_1' through 'column_60'. I know I could use loc like this: df.loc[:, 'column_1':'column_60']. That will give me all rows in columns 1-60.

But what if I wanted that range of columns plus 'column_81'? This doesn't work: df.loc[:, 'column_1':'column_60', 'column_81']

It throws a "Too many indexers" error. Is there another way to state this using loc? Or is loc even the best function to use in this case?

Many thanks.


Answer 1:


How about

df.loc[:, [f'column_{i}' for i in range(1, 61)] + ['column_81']]

or

df.reindex([f'column_{i}' for i in range(1, 61)] + ['column_81'], axis=1)

if you want any requested columns that are missing from the dataframe to be created and filled with the default NaN values.
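
To illustrate the difference between the two, here is a minimal sketch. It assumes a made-up one-row frame that has 'column_1' through 'column_60' but no 'column_81', and a recent pandas version (1.0 or later), where loc raises a KeyError for a missing label in a list:

```python
import pandas as pd

# Hypothetical toy frame: column_1 .. column_60 exist, column_81 does not.
df = pd.DataFrame([[0] * 60], columns=[f'column_{i}' for i in range(1, 61)])
wanted = [f'column_{i}' for i in range(1, 61)] + ['column_81']

# reindex keeps every requested label and fills the missing column with NaN.
filled = df.reindex(wanted, axis=1)

# loc with a list of labels raises KeyError if any label is missing.
try:
    df.loc[:, wanted]
    loc_raised = False
except KeyError:
    loc_raised = True
```

So loc is the stricter choice when a missing column should be treated as an error, while reindex is the lenient one.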




Answer 2:


You can use pandas.concat():

pd.concat([df.loc[:, 'column_1':'column_60'], df.loc[:, 'column_81']], axis=1)
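
A small sketch of this approach on a made-up one-row frame with 81 columns (the data is only illustrative). Selecting the extra column with a list, ['column_81'], is an optional variation that keeps it as a DataFrame rather than a Series; concat accepts either:

```python
import pandas as pd

# Hypothetical toy frame with columns column_1 .. column_81.
df = pd.DataFrame([list(range(81))],
                  columns=[f'column_{i}' for i in range(1, 82)])

# Label slices in loc include both endpoints, so column_60 is selected.
res = pd.concat(
    [df.loc[:, 'column_1':'column_60'], df.loc[:, ['column_81']]],
    axis=1,
)
```

Note that the label slice follows the order of the columns in the frame, not the numbers in their names.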




Answer 3:


You can use numpy.r_ to combine ranges with scalars. The only complication is you need to use pd.DataFrame.iloc instead, but this can be facilitated via df.columns.get_loc.

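Here's a demo. Note that the slice inside np.r_ is positional, so its end is exclusive and 'column5' is not selected: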

import pandas as pd
import numpy as np

df = pd.DataFrame(columns=['column'+str(i) for i in range(1, 82)])

colidx = df.columns.get_loc

res = df.iloc[:, np.r_[colidx('column1'):colidx('column5'), colidx('column80')]]

print(res.columns)

Index(['column1', 'column2', 'column3', 'column4', 'column80'], dtype='object')
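
Applied to the column names from the question, a sketch might look like the following. It assumes an empty frame with columns 'column_1' through 'column_81', and adds 1 to the end position so that 'column_60' itself is included:

```python
import numpy as np
import pandas as pd

# Hypothetical empty frame with columns column_1 .. column_81.
df = pd.DataFrame(columns=[f'column_{i}' for i in range(1, 82)])
colidx = df.columns.get_loc

# The +1 makes the end-exclusive positional slice reach column_60.
pos = np.r_[colidx('column_1'):colidx('column_60') + 1, colidx('column_81')]
res = df.iloc[:, pos]
```

This relies on the column labels being unique, so that get_loc returns a single integer position.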



Answer 4:


You can use NumPy's concatenate function. Assuming the columns are in order, with 'column_1' at position 0, you can use:

df.loc[:, df.columns[np.concatenate([np.arange(0, 60), np.array([80])])]]

Positions are zero-based, so this gives you column_1 through column_60 plus column_81 from your data frame.



Source: https://stackoverflow.com/questions/50647832/can-you-use-loc-to-select-a-range-of-columns-plus-a-column-outside-of-the-range
