Pandas: Group Data by column A, Filter A by existing values of column B

Posted by 痴心易碎 on 2019-12-24 10:46:32

Question


I'm new to pandas and want to create a new dataset from grouped and filtered data. Right now, my dataset contains two columns: the first holds a key (A, B or C) and the second holds a value:

A 1
A 2
A 3
A 4
B 1
B 2
B 3
C 4

Now I want to group by the keys of the first column (A, B, C) and show only the keys for which the values 1 AND 2 both exist, so that my new dataset looks like this:

A 1
A 2
B 1
B 2

So far I can only print everything; I don't know how to filter:

for name, group in data.groupby('keys'):
   print(name)
   print(group)

I'm thankful for any help!


Answer 1:


First keep only the rows whose value is 1 or 2:

df = df.loc[(df['col2'] == 1) | (df['col2'] == 2)]

Then drop the groups that don't contain both values:

df = df.groupby('col1').filter(lambda x: any(x['col2'] == 2))
df = df.groupby('col1').filter(lambda x: any(x['col2'] == 1))

Example:

  col1  col2
0    A     1
1    A     2
2    A     3
3    A     4
4    B     1
5    B     2
6    B     3
7    C     4
8    C     1

Output:

  col1  col2
0    A     1
1    A     2
4    B     1
5    B     2
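
The two `filter` calls can also be folded into a single pass. The following is only a minimal sketch, assuming the `col1`/`col2` column names from the example; the `required` set and the `issubset` check are illustrative choices, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["A", "A", "A", "A", "B", "B", "B", "C", "C"],
    "col2": [1, 2, 3, 4, 1, 2, 3, 4, 1],
})

# Hypothetical name: the values every kept key must contain.
required = {1, 2}

# Step 1: keep only rows whose value is one of the required values.
subset = df[df["col2"].isin(list(required))]

# Step 2: keep only groups that contain every required value.
result = subset.groupby("col1").filter(
    lambda g: required.issubset(set(g["col2"].tolist()))
)
print(result)
```

Key C is dropped here because it has a 1 but no 2. Changing `required` is enough to check for a different set of values.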



Answer 2:


You don't really need groupby here, as long as every key that has a 1 also has a 2 (which is true for the sample data). Just use:

df = pd.DataFrame({'col_a': ['A','A','A','A', 'B','B','B', 'C'], 'col_b': [1,2,3,4,1,2,3,4]})
df.loc[(df.col_b == 1) | (df.col_b == 2)]
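
The row filter above selects rows by value only; it does not check that a key has both values. A small sketch of the difference, assuming one extra row `("C", 1)` that is not in the question's data:

```python
import pandas as pd

# Sample data plus one extra, assumed row ("C", 1).
df = pd.DataFrame({
    "col_a": ["A", "A", "A", "A", "B", "B", "B", "C", "C"],
    "col_b": [1, 2, 3, 4, 1, 2, 3, 4, 1],
})

rows = df.loc[(df.col_b == 1) | (df.col_b == 2)]

# Key C is still present although it never has the value 2.
print(rows)

# Distinct values left per key: only keys with a count of 2 have both 1 and 2.
distinct_counts = rows.groupby("col_a")["col_b"].nunique()
print(distinct_counts)
```

With data like this, a per-key check is still needed after the row filter.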



Answer 3:


Try this:

l=[1,2]
print(df[df['col2'].isin(l)])

For this problem you really don't need groupby.

If you want, you can also try this:

df.groupby('col1').apply(lambda x:x[x['col2'].isin(l)]).reset_index(drop=True)
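
If the list `l` should act as a requirement ("the key must have every value in `l`") rather than a plain row filter, one possible vectorised variant is sketched below. It assumes the same `col1`/`col2` names; the extra row `("C", 1)` and the helper variable names are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["A", "A", "A", "A", "B", "B", "B", "C", "C"],
    "col2": [1, 2, 3, 4, 1, 2, 3, 4, 1],
})
l = [1, 2]

# Rows whose value is in the list.
candidates = df[df["col2"].isin(l)]

# Number of distinct listed values each row's key has, aligned to the rows.
distinct_per_key = candidates.groupby("col1")["col2"].transform("nunique")

# A key qualifies only if it has every value from the list.
complete = candidates[distinct_per_key == len(set(l))]
print(complete)
```

`transform` returns one value per row rather than one per group, so the comparison can be used directly as a boolean mask.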



Answer 4:


Keep only the rows whose value is 1 or 2:

data = data.loc[ (data['value'] == 1) | (data['value'] == 2) ]

Then keep only the keys you want to see:

data = data.loc[ (data['key'] == 'A') | (data['key'] == 'B') ]
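
Hard-coding 'A' and 'B' works only when the qualifying keys are known in advance. As a rough sketch, the keys could instead be derived from the data; the `key`/`value` column names follow this answer, and the intermediate variable names are hypothetical.

```python
import pandas as pd

data = pd.DataFrame({
    "key": ["A", "A", "A", "A", "B", "B", "B", "C"],
    "value": [1, 2, 3, 4, 1, 2, 3, 4],
})

# Keys that have at least one row with value 1, and likewise for value 2.
keys_with_1 = set(data.loc[data["value"] == 1, "key"])
keys_with_2 = set(data.loc[data["value"] == 2, "key"])

# Keys that have both values.
wanted_keys = sorted(keys_with_1 & keys_with_2)

# Keep rows with value 1 or 2 that belong to one of those keys.
result = data.loc[data["value"].isin([1, 2]) & data["key"].isin(wanted_keys)]
print(result)
```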


Source: https://stackoverflow.com/questions/51377779/pandas-group-data-by-column-a-filter-a-by-existing-values-of-column-b
