Remove item in pandas dataframe that starts with a comment char [duplicate]

Posted by 冷暖自知 on 2021-02-10 05:54:59

Question


I would like to remove all rows in a pandas DataFrame that start with a comment character. For example:

>>> COMMENT_CHAR = '#'
>>> df
    first_name    last_name
0   #fill in here fill in here
1   tom           jones

>>> df.remove(df.columns[0], startswith=COMMENT_CHAR) # in pseudocode
>>> df
    first_name    last_name
0   tom           jones

How would this actually be done?


Answer 1:


Setup

>>> import pandas as pd
>>> data = [['#fill in here', 'fill in here'], ['tom', 'jones']]
>>> df = pd.DataFrame(data, columns=['first_name', 'last_name'])
>>> df
      first_name     last_name
0  #fill in here  fill in here
1            tom         jones

Solution assuming only the strings in the first_name column matter:

>>> commented = df['first_name'].str.startswith('#')
>>> df[~commented].reset_index(drop=True)
  first_name last_name
0        tom     jones
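
If the column can contain missing values, `str.startswith` may return a missing value for those rows, and the `~` negation can then fail or give an unusable mask. The following is a minimal sketch, assuming missing names should be kept rather than treated as comments; the extra row with `None` is made up for illustration and is not part of the original data.

```python
import pandas as pd

COMMENT_CHAR = "#"

# Same data as above, plus one hypothetical row with a missing first name.
df = pd.DataFrame(
    [["#fill in here", "fill in here"], ["tom", "jones"], [None, "smith"]],
    columns=["first_name", "last_name"],
)

# na=False makes missing values count as "not commented",
# so the mask is strictly boolean and safe to negate with ~.
commented = df["first_name"].str.startswith(COMMENT_CHAR, na=False)
result = df[~commented].reset_index(drop=True)
print(result)
```

Passing `na=True` instead would drop rows with a missing first name along with the commented ones.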

Solution assuming you want to drop rows where the string in the first_name OR last_name column starts with '#':

>>> commented = df.apply(lambda col: col.str.startswith('#')).any(axis=1)
>>> df[~commented].reset_index(drop=True)
  first_name last_name
0        tom     jones
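
The `.str` accessor only works on string-like columns, so the `apply` version above raises an error as soon as the DataFrame has a numeric column. One possible workaround, sketched here under the assumption that converting every value to text for the check is acceptable, is to cast each column with `astype(str)` first. The `age` column and the third row are invented for illustration.

```python
import pandas as pd

COMMENT_CHAR = "#"

# Hypothetical data with a numeric column mixed in.
df = pd.DataFrame(
    [["#fill in here", "fill in here", 0], ["tom", "jones", 30], ["ann", "#todo", 25]],
    columns=["first_name", "last_name", "age"],
)

# Cast each column to str only for the comparison; the original values are untouched.
# na=False keeps the mask boolean even if a value is missing.
commented = df.apply(
    lambda col: col.astype(str).str.startswith(COMMENT_CHAR, na=False)
).any(axis=1)

result = df[~commented].reset_index(drop=True)
print(result)
```

A row is dropped if any of its cells starts with the comment character, which matches the "first_name OR last_name" behaviour above, extended to all columns.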

reset_index re-labels the rows starting from zero. Passing drop=True discards the old index instead of keeping it as a new column named index:

>>> df[~commented]
  first_name last_name
1        tom     jones
>>> df[~commented].reset_index()
   index first_name last_name
0      1        tom     jones
>>> df[~commented].reset_index(drop=True)
  first_name last_name
0        tom     jones
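
The pseudocode call in the question could be approximated by a small helper that wraps the steps above. This is only a sketch: `drop_commented_rows` and its parameters are hypothetical names, not part of pandas, and it assumes that checking the text form of each value is good enough.

```python
import pandas as pd


def drop_commented_rows(df, columns=None, comment_char="#"):
    """Hypothetical helper: drop rows where any of `columns` starts with `comment_char`.

    If `columns` is None, every column is checked.
    """
    cols = list(df.columns) if columns is None else list(columns)
    mask = df[cols].apply(
        lambda col: col.astype(str).str.startswith(comment_char, na=False)
    ).any(axis=1)
    # Return a new frame with the index re-labelled from zero.
    return df[~mask].reset_index(drop=True)


df = pd.DataFrame(
    [["#fill in here", "fill in here"], ["tom", "jones"]],
    columns=["first_name", "last_name"],
)

cleaned = drop_commented_rows(df, columns=[df.columns[0]])
print(cleaned)
```

The helper returns a new DataFrame and leaves the original untouched, so the result has to be assigned, unlike the in-place call imagined in the question.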


Source: https://stackoverflow.com/questions/53823260/remove-item-in-pandas-dataframe-that-starts-with-a-comment-char
