问题
I have OHLC data. The candle can be either 'green' (if the close is above open) or 'red' (if the open is above the close). The format is:
open close candletype
0 542 543 GREEN
1 543 544 GREEN
2 544 545 GREEN
3 545 546 GREEN
4 546 547 GREEN
5 547 542 RED
6 542 543 GREEN
What I would like is to count the number of consecutive green or red candles for n-previous rows. Lets say I want to identify rows preceded by 3 green candles.
That the desired output would be:
open close candletype pattern
0 542 543 GREEN Toofewrows
1 543 544 GREEN Toofewrows
2 544 545 GREEN Toofewrows
3 545 546 GREEN 3-GREEN-CANDLES-IN-A-ROW
4 546 547 GREEN 3-GREEN-CANDLES-IN-A-ROW
5 547 542 RED 3-GREEN-CANDLES-IN-A-ROW
6 542 543 GREEN No pattern
I know how to get the solution by extracting the row number, applying a custom function to candletype series with that row number and looking at n previous rows within that custom function, creating a n-item list and checking for isAll('GREEN') but I WAS WONDERING IF THERE IS AN ELEGANT ONE LINER APPLY SOLUTION?
回答1:
You can apply lambda functions to rolling windows. See Applying lambda function to a pandas rolling window series
You can either categorize them or map them on your own to some numbers:
df = pd.read_clipboard()
df['code'] = df.candletype.astype('category').cat.codes
This results in following DataFrame:
open close candletype code
0 542 543 GREEN 0
1 543 544 GREEN 0
2 544 545 GREEN 0
3 545 546 GREEN 0
4 546 547 GREEN 0
5 547 542 RED 1
6 542 543 GREEN 0
Now just apply df['code'].rolling(3).apply(lambda x: all(x==0)).shift()
, resulting in
0
NaN
1 NaN
2 NaN
3 1.0
4 1.0
5 1.0
6 0.0
and fill your nans
and zeros as expected/wanted.
This neither is a oneliner, but maybe more pretty than the string comparison. Hope it helps you!
回答2:
Rolling window works over numbers rather than strings, so factorize and apply and use set to check of equality i.e
df['new'] = pd.Series(df['candletype'].factorize()[0]).rolling(window=4).apply(lambda x : set(x[:-1]) == {0})
df['new'].replace({1:'Consective 3 Green',0:'No Pattern'})
0 NaN
1 NaN
2 NaN
3 Consective 3 Green
4 Consective 3 Green
5 Consective 3 Green
6 No Pattern
Name: new, dtype: object
Along side rolling apply you can play with zip too for this i.e
def get_list(x,m) :
x = zip(*(x[i:] for i in range(m)))
return ['3 Greens' if set(i[:-1]) == {'GREEN'} else 'no pattern' for i in x]
df['new'] = pd.Series(get_list(df['candletype'],4), index=df.index[4 - 1:])
open close candletype new
0 542 543 GREEN NaN
1 543 544 GREEN NaN
2 544 545 GREEN NaN
3 545 546 GREEN 3 Greens
4 546 547 GREEN 3 Greens
5 547 542 RED 3 Greens
6 542 543 GREEN no pattern
回答3:
This one-liner can count the number of consecutive occurences in your serie. However it is kind of tricky and therefore not so easy to read for other users or future you! It is very well explained in this post.
df = pd.read_clipboard()
df['pattern'] = df.groupby((df.candletype != df.candletype.shift()).cumsum()).cumcount()
df
>>> open close candletype pattern
0 542 543 GREEN 0
1 543 544 GREEN 1
2 544 545 GREEN 2
3 545 546 GREEN 3
4 546 547 GREEN 4
5 547 542 RED 0
6 542 543 GREEN 0
This is not exactly the output you provided but here you have the exact count of consecutive values. You can then apply any cosmetic details to this series (replace values below your threshold by Toofewrows
, etc.)
来源:https://stackoverflow.com/questions/48259470/pandas-number-of-consecutive-occurrences-in-previous-rows