Pandas - possible to aggregate two columns using two different aggregations?

被撕碎了的回忆

I'm loading a CSV file, which has the following columns: date, textA, textB, numberA, numberB

I want to group by the columns: date, textA and textB - but want to ap

1 Answer

悲&欢浪女

2020-12-15 17:22
The agg method can accept a dict, in which case the keys indicate the column to which the function is applied:
```
grouped.agg({'numberA':'sum', 'numberB':'min'})
```
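As a rough sketch of how this might look with the columns named in the question, assuming a small made-up DataFrame in place of the CSV file (the sample values below are hypothetical, only the column names come from the question):

```python
import pandas as pd

# Hypothetical sample data: column names follow the question, values are made up.
df = pd.DataFrame({
    'date': ['2020-01-01', '2020-01-01', '2020-01-01', '2020-01-02'],
    'textA': ['x', 'x', 'y', 'x'],
    'textB': ['p', 'p', 'q', 'p'],
    'numberA': [1, 2, 3, 4],
    'numberB': [10, 5, 7, 8],
})

# Group by the three key columns, then aggregate each numeric column
# with its own function: sum for numberA, min for numberB.
grouped = df.groupby(['date', 'textA', 'textB'])
result = grouped.agg({'numberA': 'sum', 'numberB': 'min'}).reset_index()
print(result)
```

The `reset_index()` call is optional; it only turns the group keys back into ordinary columns.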
For example,
```
import numpy as np
import pandas as pd
df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
                         'foo', 'bar', 'foo', 'foo'],
                   'B': ['one', 'one', 'two', 'three',
                         'two', 'two', 'one', 'three'],
                   'number A': np.arange(8),
                   'number B': np.arange(8) * 2})
grouped = df.groupby('A')

print(grouped.agg({
    'number A': 'sum',
    'number B': 'min'}))
```
yields the following (the column order may differ depending on the pandas version):
```
     number B  number A
A                      
bar         2         9
foo         0        19
```
This also shows that Pandas can handle spaces in column names. I'm not sure what the origin of the problem was, but literal spaces should not have posed a problem. If you wish to investigate this further,
```
print(df.columns)
```
without reassigning the column names, will show us the repr of the names. Maybe there was a hard-to-see character in the column name that looked like a space (or some other character) but was actually a u'\xa0' (NO-BREAK SPACE), for example.
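One way to check for such a hidden character, sketched under the assumption that the culprit is a NO-BREAK SPACE (the DataFrame below is hypothetical and deliberately built with `\xa0` in one column name):

```python
import pandas as pd

# Hypothetical frame: the first column name contains a NO-BREAK SPACE (\xa0)
# instead of an ordinary space, so it only looks like 'number A'.
df = pd.DataFrame({'number\xa0A': [1, 2], 'number B': [3, 4]})

# repr() shows the escaped form of the hidden character.
print([repr(name) for name in df.columns])

# Collect the column names that contain a no-break space.
suspicious = [name for name in df.columns if '\xa0' in name]
print(suspicious)

# One possible cleanup: replace the no-break space with a plain space.
df.columns = [name.replace('\xa0', ' ') for name in df.columns]
print(df['number A'].sum())
```

After the cleanup, the column can be selected with the name as typed on a normal keyboard.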