pandas: best way to select all columns whose names start with X

别那么骄傲 2020-11-27 10:03

I have a DataFrame:

import pandas as pd
import numpy as np

df = pd.DataFrame({'foo.aa': [1, 2.1, np.nan, 4.7, 5.6, 6.8],
                   'foo.fighters


        
8 answers
  •  小蘑菇 (original poster)
    2020-11-27 10:31

    Now that pandas' indexes support string operations, arguably the simplest and best way to select columns beginning with 'foo' is just:

    df.loc[:, df.columns.str.startswith('foo')]
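
As a minimal sketch of the boolean-mask approach above: the frame below is hypothetical (its column names and values are made up for illustration and are not the asker's data). `df.columns.str.startswith` returns one boolean per column, and `.loc` keeps the columns where the mask is True.

```python
import numpy as np
import pandas as pd

# Hypothetical example frame; not the DataFrame from the question.
df = pd.DataFrame({
    "foo.aa": [1.0, 2.1, np.nan],
    "foo.bars": [0, 0, 1],
    "bar.baz": [5, 5, 6],
    "nas.foo": ["NA", 0, 1],
})

# One boolean per column label: True where the label starts with "foo".
mask = df.columns.str.startswith("foo")

# Keep every row, and only the columns selected by the mask.
selected = df.loc[:, mask]
print(list(selected.columns))
```

Note that `nas.foo` is excluded: `startswith` only tests the beginning of each label.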
    

    Alternatively, you can filter column (or row) labels with df.filter(). For example, to pass a regular expression that matches names beginning with foo. (the dot is escaped so that it is matched literally):

    >>> df.filter(regex=r'^foo\.', axis=1)
       foo.aa  foo.bars  foo.fighters  foo.fox foo.manchu
    0     1.0         0             0        2         NA
    1     2.1         0             1        4          0
    2     NaN         0           NaN        1          0
    3     4.7         0             0        0          0
    4     5.6         0             0        0          0
    5     6.8         1             0        5          0
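
The three ways of matching labels do not always select the same columns. The sketch below compares them on a hypothetical frame (the column `foobar` is invented here to show the difference): the regex `^foo\.` requires the literal dot, `startswith('foo')` does not, and `filter(like=...)` matches the substring anywhere in the label.

```python
import pandas as pd

# Hypothetical frame; "foobar" is added only to show where the methods differ.
df = pd.DataFrame({
    "foo.aa": [1, 2],
    "foobar": [3, 4],
    "nas.foo": [5, 6],
})

# Regex anchored at the start, with an escaped (literal) dot.
by_regex = df.filter(regex=r"^foo\.", axis=1)

# Plain prefix test on the column index.
by_prefix = df.loc[:, df.columns.str.startswith("foo")]

# Substring match anywhere in the label.
by_like = df.filter(like="foo", axis=1)

print(list(by_regex.columns))
print(list(by_prefix.columns))
print(list(by_like.columns))
```

If the prefix itself should include the dot, `startswith('foo.')` gives the same selection as the regex without any escaping.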
    

    To select only the required rows (containing a 1) and the columns, you can use loc, selecting the columns using filter (or any other method) and the rows using any:

    >>> df.loc[(df == 1).any(axis=1), df.filter(regex=r'^foo\.', axis=1).columns]
       foo.aa  foo.bars  foo.fighters  foo.fox foo.manchu
    0     1.0         0             0        2         NA
    1     2.1         0             1        4          0
    2     NaN         0           NaN        1          0
    5     6.8         1             0        5          0
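
One detail worth checking in the row filter: `(df == 1).any(axis=1)` looks for a 1 in every column, not only in the selected ones. The sketch below, on a hypothetical frame with made-up values, shows both variants so the difference is visible; the names `rows_any` and `rows_foo` are illustrative only.

```python
import pandas as pd

# Hypothetical frame; values chosen so the two row filters differ.
df = pd.DataFrame({
    "foo.aa": [1.0, 2.1, 3.3],
    "foo.bars": [0, 0, 0],
    "bar.baz": [5, 1, 6],
})

# Labels of the columns whose names start with "foo.".
foo_cols = df.columns[df.columns.str.startswith("foo.")]

# Rows containing a 1 in ANY column (row 1 qualifies through bar.baz).
rows_any = (df == 1).any(axis=1)

# Rows containing a 1 in one of the foo columns only.
rows_foo = (df[foo_cols] == 1).any(axis=1)

print(df.loc[rows_any, foo_cols])
print(df.loc[rows_foo, foo_cols])
```

Which variant is wanted depends on whether the 1 must appear in the selected columns or anywhere in the row.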
    
