Slicing multiple ranges of columns in Pandas, by list of names

悲哀的现实 2020-12-03 18:29

I am trying to select multiple columns in a Pandas dataframe in two different approaches:

1) via the column number, for example, columns 1-3 and columns 6 onwards.

2 Answers
  • 2020-12-03 19:06

    I’m not sure exactly what you are asking, but in general DataFrame.loc selects by label and DataFrame.iloc selects by integer position.

    For example, selecting the columns at positions 0, 1 and 4:

    dataframe.iloc[:, [0, 1, 4]]
    

    and selecting columns labelled 'A', 'B' and 'C':

    dataframe.loc[:, ['A', 'B', 'C']]
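
    As a small self-contained sketch of the difference, the frame below uses made-up column names 'A' to 'E' (they are not from the question). It also shows that both accessors accept slices, with one difference worth knowing: iloc excludes the end position, while loc includes the end label.

```python
import pandas as pd

# Illustrative frame; the column names are assumptions for this example only.
df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6], "D": [7, 8], "E": [9, 10]})

by_position = df.iloc[:, [0, 1, 4]]     # columns at positions 0, 1 and 4
by_label = df.loc[:, ["A", "B", "C"]]   # columns named A, B and C

# Slices behave differently at the end point:
pos_slice = df.iloc[:, 1:3]             # positions 1 and 2 (end position excluded)
label_slice = df.loc[:, "B":"D"]        # B, C and D (end label included)

print(list(by_position.columns))   # ['A', 'B', 'E']
print(list(pos_slice.columns))     # ['B', 'C']
print(list(label_slice.columns))   # ['B', 'C', 'D']
```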
    
  • 2020-12-03 19:12

    I think you need numpy.r_ to concatenate the column positions, then iloc to select them:

    print (df.iloc[:, np.r_[1:3, 6:len(df.columns)]])
    

    For the second approach, subset with the list of column names:

    print (df[years_month])
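
    Here years_month is presumably the asker's own list of column names; its contents are not shown in this thread, so the list below is a hypothetical stand-in. The sketch also shows one possible way to build such a list from several label ranges, by taking the column names of loc slices and concatenating them.

```python
import pandas as pd

# Cut-down version of the sample frame, for illustration.
df = pd.DataFrame({
    "2000-1": [1, 3, 5], "2000-2": [5, 3, 6], "2000-3": [7, 8, 9],
    "2000-4": [1, 3, 5], "A": [1, 2, 3], "B": [4, 5, 6],
})

# Hypothetical list of names; the real years_month comes from the question.
years_month = ["2000-1", "2000-2", "2000-3"]
subset = df[years_month]

# Several ranges by name: label slices include both end points,
# and an open-ended slice runs through the last column.
cols = list(df.loc[:, "2000-2":"2000-3"].columns) + list(df.loc[:, "A":].columns)
ranges_by_name = df[cols]

print(cols)   # ['2000-2', '2000-3', 'A', 'B']
```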
    

    Sample:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({'2000-1':[1,3,5],
                       '2000-2':[5,3,6],
                       '2000-3':[7,8,9],
                       '2000-4':[1,3,5],
                       '2000-5':[5,3,6],
                       '2000-6':[7,8,9],
                       '2000-7':[1,3,5],
                       '2000-8':[5,3,6],
                       '2000-9':[7,4,3],
                       'A':[1,2,3],
                       'B':[4,5,6],
                       'C':[7,8,9]})
    
    print (df)
       2000-1  2000-2  2000-3  2000-4  2000-5  2000-6  2000-7  2000-8  2000-9  A  \
    0       1       5       7       1       5       7       1       5       7  1   
    1       3       3       8       3       3       8       3       3       4  2   
    2       5       6       9       5       6       9       5       6       3  3   
    
       B  C  
    0  4  7  
    1  5  8  
    2  6  9  
    
    print (df.iloc[:, np.r_[1:3, 6:len(df.columns)]])
       2000-2  2000-3  2000-7  2000-8  2000-9  A  B  C
    0       5       7       1       5       7  1  4  7
    1       3       8       3       3       4  2  5  8
    2       6       9       5       6       3  3  6  9
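
    To see what numpy.r_ is doing here, it can help to print the positions on their own. This is only a sketch; n_cols stands in for len(df.columns) and is set to 12 to match the sample frame. Note that the slices are half-open, so 1:3 yields positions 1 and 2; if "columns 1-3" is meant to include position 3, the slice would need to be 1:4.

```python
import numpy as np

n_cols = 12  # stand-in for len(df.columns) in the sample above

# Each slice is expanded and the pieces are concatenated into one integer array.
positions = np.r_[1:3, 6:n_cols]
print(positions.tolist())   # [1, 2, 6, 7, 8, 9, 10, 11]

# Same idea, but including position 3 as well.
inclusive = np.r_[1:4, 6:n_cols]
print(inclusive.tolist())   # [1, 2, 3, 6, 7, 8, 9, 10, 11]
```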
    

    You can also concatenate ranges (in Python 3 each range must be cast to a list first):

    rng = list(range(1,3)) + list(range(6, len(df.columns)))
    print (rng)
    [1, 2, 6, 7, 8, 9, 10, 11]
    
    print (df.iloc[:, rng])
       2000-2  2000-3  2000-7  2000-8  2000-9  A  B  C
    0       5       7       1       5       7  1  4  7
    1       3       8       3       3       4  2  5  8
    2       6       9       5       6       3  3  6  9
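
    If this pattern comes up often, the range concatenation can be wrapped in a small function. The helper below, select_ranges, is hypothetical (it is not part of pandas) and is only a sketch: it takes (start, stop) position pairs with an exclusive stop, and treats a stop of None as "through the last column".

```python
from itertools import chain

import pandas as pd


def select_ranges(frame, *ranges):
    """Hypothetical helper: select columns from several (start, stop) position ranges.

    stop is exclusive; None means "up to and including the last column".
    """
    n = len(frame.columns)
    positions = list(chain.from_iterable(
        range(start, n if stop is None else stop) for start, stop in ranges
    ))
    return frame.iloc[:, positions]


# Illustrative 12-column frame with made-up names c0 ... c11.
df = pd.DataFrame([[0] * 12], columns=[f"c{i}" for i in range(12)])
result = select_ranges(df, (1, 3), (6, None))
print(list(result.columns))   # ['c1', 'c2', 'c6', 'c7', 'c8', 'c9', 'c10', 'c11']
```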
    