Pandas: DataFrame filtering using groupby and a function

名媛妹妹 2021-01-06 09:44

Using Python 3.3 and Pandas 0.10

I have a DataFrame that is built from concatenating multiple CSV files. First, I filter out all values in the Name column that conta

2 answers

死守一世寂寞 (OP)

2021-01-06 10:02

You could first drop the duplicates:

In [11]: df = df.drop_duplicates()

In [12]: df
Out[12]:
  Name ID
0    A  1
1    B  2
2    C  3
4    E  4
5    F  4

Then group by ID and keep only the groups that contain a single element:

In [13]: g = df.groupby('ID')

In [14]: size = (g.size() == 1)

In [15]: size
Out[15]:
ID
1      True
2      True
3      True
4     False
dtype: bool

In [16]: size[size].index
Out[16]: Int64Index([1, 2, 3], dtype=int64)

In [17]: df['ID'].isin(size[size].index)
Out[17]:
0     True
1     True
2     True
4    False
5    False
Name: ID, dtype: bool

Finally, use this mask as a boolean index:

In [18]: df[df['ID'].isin(size[size].index)]
Out[18]:
  Name ID
0    A  1
1    B  2
2    C  3
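
On more recent pandas versions, the steps above can probably be collapsed into a shorter form. The following is only a sketch: the input frame is made up for illustration (the original data is not shown in full), with row 3 assumed to be an exact duplicate of row 2 and ID 4 shared by two different names. It shows two equivalent ways to keep only the IDs that map to a single row after de-duplication.

```python
import pandas as pd

# Illustrative data (an assumption, not the asker's real CSV contents):
# row 3 duplicates row 2, and ID 4 is shared by two different names.
df = pd.DataFrame({
    "Name": ["A", "B", "C", "C", "E", "F"],
    "ID": [1, 2, 3, 3, 4, 4],
})

# Step 1: drop exact duplicate rows, as in the answer above.
deduped = df.drop_duplicates()

# Step 2a: transform("size") broadcasts each group's row count back onto
# its rows, so the comparison yields a row-aligned boolean mask.
group_sizes = deduped.groupby("ID")["ID"].transform("size")
result = deduped[group_sizes == 1]

# Step 2b: the same selection written with GroupBy.filter, which keeps
# every row of each group for which the function returns True.
result_filter = deduped.groupby("ID").filter(lambda g: len(g) == 1)

print(result)
```

The `transform` version avoids calling a Python function per group, so it may be the faster choice on large frames; `filter` reads closer to the intent of the question.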
