How can I strip the whitespace from Pandas DataFrame headers?

礼貌的吻别 2020-12-02 08:06

I am parsing data from an Excel file that has extra white space in some of the column headings.

When I check the columns of the resulting dataframe, with df.co

3 answers
  •  盖世英雄少女心
    2020-12-02 08:28

    If you're using a recent version of pandas, you can call .str.strip directly on the columns:

    In [5]:
    import pandas as pd

    df = pd.DataFrame(columns=['Year', 'Month ', 'Value'])
    print(df.columns.tolist())           # before: 'Month ' has a trailing space
    df.columns = df.columns.str.strip()  # strip whitespace from every header
    df.columns.tolist()
    
    ['Year', 'Month ', 'Value']
    Out[5]:
    ['Year', 'Month', 'Value']
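
Headers parsed from a spreadsheet are not always strings (a column may be labelled with a number, for instance). As a minimal sketch for that case, the helper below strips only the labels that are strings and leaves the rest untouched. The name `strip_header_whitespace` and the sample labels are hypothetical, chosen for illustration; it uses `DataFrame.rename` rather than `.str.strip`.

```python
import pandas as pd


def strip_header_whitespace(df):
    """Return a new DataFrame whose string column labels are stripped.

    Non-string labels (e.g. integers) are passed through unchanged.
    Hypothetical helper, not part of pandas.
    """
    return df.rename(columns=lambda c: c.strip() if isinstance(c, str) else c)


# Sample frame with padded string headers and one integer header
df = pd.DataFrame(columns=["Year", " Month ", 2020])
clean = strip_header_whitespace(df)
print(clean.columns.tolist())
```

Because `rename` returns a new DataFrame, the original `df` keeps its padded headers unless you assign the result back.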
    

    Timings

    In[26]:
    df = pd.DataFrame(columns=[' year', ' month ', ' day', ' asdas ', ' asdas', 'as ', '  sa', ' asdas '])
    df
    Out[26]: 
    Empty DataFrame
    Columns: [ year,  month ,  day,  asdas ,  asdas, as ,   sa,  asdas ]
    
    
    %timeit df.rename(columns=lambda x: x.strip())
    1000 loops, best of 3: 293 µs per loop

    %timeit df.columns.str.strip()
    10000 loops, best of 3: 143 µs per loop
    

    So str.strip is about 2x faster here, and I expect it to scale better for larger DataFrames.
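
To check the scaling expectation outside IPython, a rough sketch with the standard `timeit` module could look like the following. The column count (5,000) and repetition count are arbitrary assumptions, and the absolute numbers will vary by machine and pandas version, so no expected output is shown.

```python
import timeit

import pandas as pd

# Hypothetical wide frame: 5,000 column names padded with spaces
cols = [f" col{i} " for i in range(5_000)]
df = pd.DataFrame(columns=cols)

# Time each approach over the same number of repetitions
n = 10
t_rename = timeit.timeit(lambda: df.rename(columns=lambda x: x.strip()), number=n)
t_str = timeit.timeit(lambda: df.columns.str.strip(), number=n)

print(f"rename:    {t_rename / n * 1e3:.3f} ms per call")
print(f"str.strip: {t_str / n * 1e3:.3f} ms per call")
```

Note that the two calls are not strictly equivalent: `rename` builds a whole new DataFrame, while `df.columns.str.strip()` only produces a new Index.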
