Pandas: DataFrame filtering using groupby and a function

名媛妹妹 2021-01-06 09:44

Using Python 3.3 and Pandas 0.10

I have a DataFrame that is built from concatenating multiple CSV files. First, I filter out all values in the Name column that conta

2 answers
  •  死守一世寂寞
    2021-01-06 10:02

    You could first drop the duplicates:

    In [11]: df = df.drop_duplicates()
    
    In [12]: df
    Out[12]:
      Name ID
    0    A  1
    1    B  2
    2    C  3
    4    E  4
    5    F  4
    

    Then group by ID and keep only the IDs whose group has exactly one element:

    In [13]: g = df.groupby('ID')
    
    In [14]: size = (g.size() == 1)
    
    In [15]: size
    Out[15]:
    ID
    1      True
    2      True
    3      True
    4     False
    dtype: bool
    
    In [16]: size[size].index
    Out[16]: Int64Index([1, 2, 3], dtype=int64)
    
    In [17]: df['ID'].isin(size[size].index)
    Out[17]:
    0     True
    1     True
    2     True
    4    False
    5    False
    Name: ID, dtype: bool
    

    Finally, use that mask as a boolean index on the DataFrame:

    In [18]: df[df['ID'].isin(size[size].index)]
    Out[18]:
      Name ID
    0    A  1
    1    B  2
    2    C  3
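
The steps above can be collected into one script. This is a minimal sketch, not the answerer's exact code: the input frame is hypothetical (the question's data is not shown, so row 3 is assumed to be an exact duplicate of row 2), and the helper name `keep_unique_ids` is made up for illustration. It was written against a recent pandas rather than 0.10, so printed index types may differ from the session above.

```python
import pandas as pd

# Hypothetical input: the original data is not shown in the question.
# Row 3 is assumed to be an exact duplicate of row 2.
df = pd.DataFrame(
    {"Name": ["A", "B", "C", "C", "E", "F"], "ID": [1, 2, 3, 3, 4, 4]}
)


def keep_unique_ids(frame):
    """Drop exact duplicate rows, then keep rows whose ID occurs only once."""
    # Step 1: remove rows that are identical in every column.
    deduped = frame.drop_duplicates()
    # Step 2: count the rows per ID and keep the IDs with a single row.
    sizes = deduped.groupby("ID").size()
    unique_ids = sizes[sizes == 1].index
    # Step 3: boolean-index the de-duplicated frame with an isin() mask.
    return deduped[deduped["ID"].isin(unique_ids)]


result = keep_unique_ids(df)
print(result)
```

Wrapping the logic in a function keeps the intermediate `sizes` and `unique_ids` objects out of the surrounding namespace, which can help when the same filter is applied to several concatenated CSV files.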
    
