Question
I want to calculate the value change by group.
This is the python pandas dataframe df I have:
Group |   Date      | Value
  A     01-02-2016     16 
  A     01-03-2016     15 
  A     01-04-2016     14 
  A     01-05-2016     17 
  A     01-06-2016     19 
  A     01-07-2016     20 
  B     01-02-2016     16 
  B     01-03-2016     13 
  B     01-04-2016     13 
  C     01-02-2016     16 
  C     01-03-2016     16 
I want to calculate that for Group A, the values are going up, for Group B they are going down and for Group C they are not changing.
I am not sure how to approach it, since in Group A the values initially decrease and then increase. So should I look at the average change or most recent change?
Should I use pct_change? http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.pct_change.html I was not sure how to specify the timeframe for that.
df.groupby.pct_change
It would be great if I could visualize it too. Any advice or hint is greatly appreciated! Thank you
Answer 1:
Use pct_change in a groupby:
d1 = df.set_index(['Date', 'Group']).Value
d2 = d1.groupby(level='Group').pct_change()
print(d2)
Date        Group
2016-01-02  A             NaN
2016-01-03  A       -0.062500
2016-01-04  A       -0.066667
2016-01-05  A        0.214286
2016-01-06  A        0.117647
2016-01-07  A        0.052632
2016-01-02  B             NaN
2016-01-03  B       -0.187500
2016-01-04  B        0.000000
2016-01-02  C             NaN
2016-01-03  C        0.000000
Name: Value, dtype: float64
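On the question of how to specify the timeframe: as far as I know, pct_change takes a periods argument that counts rows within each group rather than calendar days, so it is only a time window if every group has one row per day. A minimal sketch, rebuilding the sample data inline (the names one_step and two_step are just illustrative):

```python
import pandas as pd

# Sample data from the question, rebuilt inline so the snippet runs on its own.
df = pd.DataFrame({
    "Group": list("AAAAAABBBCC"),
    "Date": pd.to_datetime(
        ["01-02-2016", "01-03-2016", "01-04-2016", "01-05-2016",
         "01-06-2016", "01-07-2016", "01-02-2016", "01-03-2016",
         "01-04-2016", "01-02-2016", "01-03-2016"],
        format="%m-%d-%Y",
    ),
    "Value": [16, 15, 14, 17, 19, 20, 16, 13, 13, 16, 16],
})

# Change versus the previous row of the same group.
# Second row of A: 15 / 16 - 1 = -0.0625
one_step = df.groupby("Group")["Value"].pct_change()

# Change versus two rows earlier in the same group.
# Third row of A: 14 / 16 - 1 = -0.125
two_step = df.groupby("Group")["Value"].pct_change(periods=2)

print(pd.DataFrame({"one_step": one_step, "two_step": two_step}))
```

With periods=2 the first two rows of every group are NaN, so group C (which has only two rows) has no usable value at all.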
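One of many ways to visualize and compare is to see how they grow. In this case, I'd fillna(0), add(1), and take the cumprod() within each group, so that every group starts at 1 and one group's growth does not carry over into the next:
d2.fillna(0).add(1).groupby(level='Group').cumprod().unstack().plot()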
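To get the up / down / unchanged label per group that the question asks for, one option is to compare each group's last value with its first, and to look at the average step alongside it. This is only a sketch of one possible definition of "trend", not the only reasonable one; label_trend is a hypothetical helper and the data is rebuilt inline:

```python
import pandas as pd

# Sample data from the question, rebuilt inline so the snippet runs on its own.
df = pd.DataFrame({
    "Group": list("AAAAAABBBCC"),
    "Date": pd.to_datetime(
        ["01-02-2016", "01-03-2016", "01-04-2016", "01-05-2016",
         "01-06-2016", "01-07-2016", "01-02-2016", "01-03-2016",
         "01-04-2016", "01-02-2016", "01-03-2016"],
        format="%m-%d-%Y",
    ),
    "Value": [16, 15, 14, 17, 19, 20, 16, 13, 13, 16, 16],
})


def label_trend(values):
    """Hypothetical helper: label a group by its net change, last minus first."""
    net = values.iloc[-1] - values.iloc[0]
    if net > 0:
        return "up"
    if net < 0:
        return "down"
    return "flat"


# Sort by date first so "first" and "last" follow time order within each group.
by_group = df.sort_values("Date").groupby("Group")["Value"]

trend = by_group.agg(label_trend)
mean_change = by_group.agg(lambda s: s.diff().mean())

print(pd.DataFrame({"trend": trend, "mean_change": mean_change}))
```

On this data the net change and the average step agree in sign for every group (A up, B down, C flat). They can disagree with a "most recent change" rule: group B's latest step is 0, so that rule would call it flat.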
Setup:
from io import StringIO
import pandas as pd
txt = """Group   Date       Value
  A     01-02-2016     16 
  A     01-03-2016     15 
  A     01-04-2016     14 
  A     01-05-2016     17 
  A     01-06-2016     19 
  A     01-07-2016     20 
  B     01-02-2016     16 
  B     01-03-2016     13 
  B     01-04-2016     13 
  C     01-02-2016     16 
  C     01-03-2016     16 """
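# Read the whitespace-delimited text defined above; column 1 (Date) is parsed as a date
df = pd.read_csv(StringIO(txt), sep=r'\s+', parse_dates=[1])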
Source: https://stackoverflow.com/questions/41453325/python-pandas-groupby-calculate-change