From a Pandas newbie: I have data that looks essentially like this -
data1=pd.DataFrame({'Dir':['E','E','W','W','E','W','W','E'], 'Bool'
As the other answer points out, you're trying to collapse identical dates into single rows, whereas cumsum returns a series of the same length as the original DataFrame. Put differently, you actually want to group by [Bool, Dir, Date], calculate a sum in each group, and then take a cumsum over the rows grouped by [Bool, Dir]. The other answer is a perfectly valid solution to your specific question; here's a one-liner variation:
data1.groupby(['Bool', 'Dir', 'Date']).sum().groupby(level=[0, 1]).cumsum()
This returns output exactly in the requested format.
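If it helps to see the two steps separately, here is a minimal sketch. The frame `df` and its values are made up for illustration (they are not the data from the question), and I'm assuming Date is an ordinary column rather than the index:

```python
import pandas as pd

# Hypothetical sample data, not the asker's original frame.
df = pd.DataFrame({
    "Bool": ["N", "N", "N", "Y", "Y"],
    "Dir": ["E", "E", "E", "W", "W"],
    "Date": pd.to_datetime(
        ["2000-12-30", "2000-12-30", "2001-01-03", "2000-12-30", "2001-01-02"]
    ),
    "Data": [5, 11, 2, 6, 10],
})

# Step 1: collapse duplicate dates by summing within each (Bool, Dir, Date).
daily = df.groupby(["Bool", "Dir", "Date"])["Data"].sum()

# Step 2: running total within each (Bool, Dir) pair, across the dates.
running = daily.groupby(level=["Bool", "Dir"]).cumsum()

print(running)
```

The two rows for N/E on 2000-12-30 are merged into one row before the running total is taken, so the result has one row per distinct (Bool, Dir, Date) combination.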
For those looking for a simple cumsum on a Pandas group, you can use:
data1.groupby(['Bool', 'Dir']).apply(lambda x: x['Data'].cumsum())
The cumulative sum is calculated within each group. Here's what the output looks like:
Bool  Dir
N     E    2000-12-30     5
           2000-12-30    16
      W    2001-01-02     7
           2001-01-03    16
Y     E    2000-12-30     4
           2001-01-03    12
      W    2000-12-30     6
           2000-12-30    16
Name: Data, dtype: int64
Note the repeated dates: nothing is collapsed here, because this is a strict cumulative sum over the rows of each group identified by the Bool and Dir columns.
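As a variation (not what I used above), you can select the column first and call cumsum directly on the grouped column instead of going through apply. A small sketch with made-up data, assuming the dates are in the index; note that this form keeps the original index and row order rather than prepending the group keys:

```python
import pandas as pd

# Hypothetical sample data with dates in the index, not the asker's frame.
df = pd.DataFrame(
    {
        "Bool": ["N", "Y", "N", "Y"],
        "Dir": ["E", "E", "E", "E"],
        "Data": [5, 4, 11, 8],
    },
    index=pd.to_datetime(["2000-12-30", "2000-12-30", "2000-12-30", "2001-01-03"]),
)

# Running total per (Bool, Dir) group; one output row per input row.
row_cumsum = df.groupby(["Bool", "Dir"])["Data"].cumsum()

print(row_cumsum)
```

Because the result lines up row for row with the input, it can be assigned straight back as a new column, e.g. df["RunningTotal"] = row_cumsum (the column name is just an example).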