Question
I'm new to pandas and want to create a new dataset with grouped and filtered data. Right now, my dataset contains two columns looking like this (first column with A, B or C, second with value):
A 1
A 2
A 3
A 4
B 1
B 2
B 3
C 4
Now I want to group by the keys in the first column (A, B, C) and show only the keys for which the values 1 AND 2 both exist, so that my new dataset looks like:
A 1
A 2
B 1
B 2
So far, I'm only able to print everything, but I don't know how to filter:
for name, group in data.groupby('keys'):
    print(name)
    print(group)
I'm thankful for any help!
Answer 1:
You can use:
df = df.loc[(df['col2'] == 1) | (df['col2'] == 2)]
And then filter out the groups that don't contain both values:
df = df.groupby('col1').filter(lambda x: (x['col2'] == 2).any())
df = df.groupby('col1').filter(lambda x: (x['col2'] == 1).any())
Example:
col1 col2
0 A 1
1 A 2
2 A 3
3 A 4
4 B 1
5 B 2
6 B 3
7 C 4
8 C 1
Output:
col1 col2
0 A 1
1 A 2
4 B 1
5 B 2
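The two steps above can also be folded into a single group-level check. The following is only a sketch of that idea, not the answerer's exact code: the name `required` and the use of `set.issubset` are assumptions introduced for illustration, and the data is the example from this answer (including the extra "C 1" row).

```python
import pandas as pd

# Example data from this answer (note the extra "C 1" row).
df = pd.DataFrame({
    'col1': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C'],
    'col2': [1, 2, 3, 4, 1, 2, 3, 4, 1],
})

# Hypothetical name: the values every kept key must contain.
required = {1, 2}

# Keep only the groups whose col2 values include all required values...
has_all = df.groupby('col1').filter(lambda g: required.issubset(set(g['col2'])))

# ...then keep only the rows that carry one of those values.
result = has_all[has_all['col2'].isin(required)]
print(result)
```

On this data the result should contain the same four rows as the output shown above (index 0, 1, 4, 5). Adding another required value only means changing the `required` set.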
Answer 2:
You don't really need groupby. Just use:
df = pd.DataFrame({'col_a': ['A','A','A','A', 'B','B','B', 'C'], 'col_b': [1,2,3,4,1,2,3,4]})
df.loc[(df.col_b == 1) | (df.col_b == 2)]
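One thing that may be worth checking before relying on a pure row filter: it works on the question's data because C has neither 1 nor 2, but a key that has only one of the two values would still be kept. The sketch below illustrates this with a hypothetical extra row "C 1" that is not part of the question's data.

```python
import pandas as pd

# Same data as above plus one hypothetical extra row, "C 1".
df = pd.DataFrame({
    'col_a': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C'],
    'col_b': [1, 2, 3, 4, 1, 2, 3, 4, 1],
})

# Row-level filter only: keeps every row whose value is 1 or 2.
row_filtered = df.loc[df['col_b'].isin([1, 2])]
print(row_filtered)  # the "C 1" row is still present
```

If the "1 AND 2" condition has to hold per key, a group-level check is needed in addition to the row mask.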
Answer 3:
Try this:
l = [1, 2]
print(df[df['col2'].isin(l)])
You really don't need groupby for this problem.
If you still want to use it, you can also try this:
df.groupby('col1').apply(lambda x:x[x['col2'].isin(l)]).reset_index(drop=True)
Answer 4:
Keep only the rows whose value is 1 or 2:
data = data.loc[ (data['value'] == 1) | (data['value'] == 2) ]
Then keep only the keys you want to see:
data = data.loc[ (data['key'] == 'A') | (data['key'] == 'B') ]
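The second step above hard-codes the keys 'A' and 'B'. As a rough sketch, the qualifying keys could instead be derived from the data by counting distinct wanted values per key. The column names `key` and `value` follow this answer; the names `wanted`, `subset`, `counts` and `keys` are assumptions introduced for illustration.

```python
import pandas as pd

# The question's data, using this answer's column names.
data = pd.DataFrame({
    'key': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'C'],
    'value': [1, 2, 3, 4, 1, 2, 3, 4],
})

wanted = [1, 2]

# Step 1: keep only the rows whose value is one of the wanted values.
subset = data.loc[data['value'].isin(wanted)]

# Step 2: count distinct wanted values per key; a key qualifies
# only if it has all of them.
counts = subset.groupby('key')['value'].nunique()
keys = counts[counts == len(wanted)].index

# Step 3: keep only the rows belonging to qualifying keys.
result = subset.loc[subset['key'].isin(keys)]
print(result)
```

This avoids having to know in advance which keys contain both values.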
Source: https://stackoverflow.com/questions/51377779/pandas-group-data-by-column-a-filter-a-by-existing-values-of-column-b