Can you use loc to select a range of columns plus a column outside of the range?

Question

Suppose I want to select a range of columns from a dataframe. Call them 'column_1' through 'column_60'. I know I could use loc like this: df.loc[:, 'column_1':'column_60']. That will give me all rows in columns 1-60.

But what if I wanted that range of columns plus 'column_81'? This doesn't work: df.loc[:, 'column_1':'column_60', 'column_81']

It throws a "Too many indexers" error. Is there another way to state this using loc? Or is loc even the best function to use in this case?

Many thanks.

Answer 1:

How about

df.loc[:, [f'column_{i}' for i in range(1, 61)] + ['column_81']]

or

df.reindex([f'column_{i}' for i in range(1, 61)] + ['column_81'], axis=1)

if you want any missing columns to be created and filled with the default NaN values.
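To make the difference between the two calls concrete, here is a minimal sketch. The frame is hypothetical: it only has 'column_1' through 'column_70', so 'column_81' is deliberately missing. It assumes a recent pandas (1.0 or later), where loc raises a KeyError for a missing label in a list.

```python
import pandas as pd

# Hypothetical one-row frame with column_1 .. column_70 (no column_81).
df = pd.DataFrame([list(range(70))],
                  columns=[f'column_{i}' for i in range(1, 71)])

wanted = [f'column_{i}' for i in range(1, 61)] + ['column_81']

# reindex returns every requested label and fills the missing one with NaN.
res = df.reindex(wanted, axis=1)
print(res.shape)                       # (1, 61)
print(res['column_81'].isna().all())   # True

# loc with a list of labels fails if any label is absent (recent pandas).
try:
    df.loc[:, wanted]
    loc_failed = False
except KeyError:
    loc_failed = True
print(loc_failed)                      # True
```

So loc is the stricter choice when every column is expected to exist, and reindex is the forgiving one.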

Answer 2:

You can use pandas.concat():

pd.concat([df.loc[:, 'column_1':'column_60'], df.loc[:, 'column_81']], axis=1)
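A small runnable sketch of the concat approach, using a made-up one-row frame with 'column_1' through 'column_90'. The extra column is selected with a one-element list here, which is my own variation, so that both pieces are DataFrames; passing the Series from df.loc[:, 'column_81'] also works.

```python
import pandas as pd

# Hypothetical one-row frame with column_1 .. column_90.
df = pd.DataFrame([list(range(1, 91))],
                  columns=[f'column_{i}' for i in range(1, 91)])

# Label slices in loc include both endpoints, so this is 60 columns.
block = df.loc[:, 'column_1':'column_60']
# A one-element list keeps the result a DataFrame rather than a Series.
extra = df.loc[:, ['column_81']]

res = pd.concat([block, extra], axis=1)
print(res.shape)        # (1, 61)
print(res.columns[-1])  # column_81
```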

Answer 3:

You can use numpy.r_ to combine ranges with scalars. The only complication is you need to use pd.DataFrame.iloc instead, but this can be facilitated via df.columns.get_loc.

Here's a demo:

import pandas as pd
import numpy as np

df = pd.DataFrame(columns=['column'+str(i) for i in range(1, 82)])

colidx = df.columns.get_loc

res = df.iloc[:, np.r_[colidx('column1'):colidx('column5'), colidx('column80')]]

print(res.columns)

Index(['column1', 'column2', 'column3', 'column4', 'column80'], dtype='object')
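Note that the slice inside np.r_ is positional, so its end is exclusive, which is why 'column5' is absent from the output above. If the last column of the range should be included, as the label slice in the question would do, one option is to add 1 to the end position. A sketch using the same demo frame:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(columns=['column' + str(i) for i in range(1, 82)])

colidx = df.columns.get_loc

# Add 1 to the stop position so that 'column5' itself is selected.
res = df.iloc[:, np.r_[colidx('column1'):colidx('column5') + 1,
                       colidx('column80')]]

print(list(res.columns))
# ['column1', 'column2', 'column3', 'column4', 'column5', 'column80']
```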

Answer 4:

You can use the numpy concatenate function. Assuming the columns are stored in order, so that 'column_1' sits at position 0, you can use:

df.loc[:, df.columns[np.concatenate([np.arange(0, 60), [80]])]]

This gives you columns 1 through 60 plus column 81 of your dataframe. The positions are zero-based, so column n is at position n - 1.
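Since this approach relies on positions rather than labels, it is worth checking which columns actually come back. A minimal check, assuming a hypothetical empty frame whose columns are 'column_1' through 'column_90' in order, so that 'column_1' is at zero-based position 0 and 'column_81' at position 80:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: column_1 .. column_90, stored in that order.
df = pd.DataFrame(columns=[f'column_{i}' for i in range(1, 91)])

# Zero-based positions: 0..59 cover column_1..column_60, 80 is column_81.
positions = np.concatenate([np.arange(0, 60), [80]])
res = df.loc[:, df.columns[positions]]

print(res.columns[0], res.columns[59], res.columns[60])
# column_1 column_60 column_81
```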

Source: https://stackoverflow.com/questions/50647832/can-you-use-loc-to-select-a-range-of-columns-plus-a-column-outside-of-the-range

Tags

python

python-3.x

pandas

dataframe