Pandas - possible to aggregate two columns using two different aggregations?

后端 未结 1 1510
被撕碎了的回忆
被撕碎了的回忆 2020-12-15 16:40

I\'m loading a csv file, which has the following columns: date, textA, textB, numberA, numberB

I want to group by the columns: date, textA and textB - but want to ap

相关标签:
1条回答
  • 2020-12-15 17:22

    The agg method can accept a dict, in which case the keys indicate the column to which the function is applied:

    grouped.agg({'numberA':'sum', 'numberB':'min'})
    

    For example,

    import numpy as np
    import pandas as pd
    df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
                             'foo', 'bar', 'foo', 'foo'],
                       'B': ['one', 'one', 'two', 'three',
                             'two', 'two', 'one', 'three'],
                       'number A': np.arange(8),
                       'number B': np.arange(8) * 2})
    grouped = df.groupby('A')
    
    print(grouped.agg({
        'number A': 'sum',
        'number B': 'min'}))
    

    yields

         number B  number A
    A                      
    bar         2         9
    foo         0        19
    

    This also shows that Pandas can handle spaces in column names. I'm not sure what the origin of the problem was, but literal spaces should not have posed a problem. If you wish to investigate this further,

    print(df.columns)
    

    without reassigning the column names, will show show us the repr of the names. Maybe there was a hard-to-see character in the column name that looked like a space (or some other character) but was actually a u'\xa0' (NO-BREAK SPACE), for example.

    0 讨论(0)
提交回复
热议问题