Skip to content Skip to sidebar Skip to footer

Pandas Number Of Consecutive Occurrences In Previous Rows

I have OHLC data. The candle can be either 'green' (if the close is above open) or 'red' (if the open is above the close). The format is: open close candletype 0 542 543 GREEN 1

Solution 1:

You can apply lambda functions to rolling windows. See Applying lambda function to a pandas rolling window series

You can either categorize them or map them on your own to some numbers:

df = pd.read_clipboard()
df['code'] = df.candletype.astype('category').cat.codes

This results in following DataFrame:

openclose   candletype  code
0542543 GREEN   01543544 GREEN   02544545 GREEN   03545546 GREEN   04546547 GREEN   05547542 RED 16542543 GREEN   0

Now just apply df['code'].rolling(3).apply(lambda x: all(x==0)).shift(), resulting in 0

NaN1NaN2NaN31.041.051.060.0

and fill your nans and zeros as expected/wanted.

This neither is a oneliner, but maybe more pretty than the string comparison. Hope it helps you!

Solution 2:

Rolling window works over numbers rather than strings, so factorize and apply and use set to check of equality i.e

df['new'] = pd.Series(df['candletype'].factorize()[0]).rolling(window=4).apply(lambda x : set(x[:-1]) == {0})

df['new'].replace({1:'Consective 3 Green',0:'No Pattern'})

0                   NaN
1                   NaN
2                   NaN
3    Consective 3 Green
4    Consective 3 Green
5    Consective 3 Green
6            No Pattern
Name: new, dtype: object

Along side rolling apply you can play with zip too for this i.e

defget_list(x,m) : 
    x = zip(*(x[i:] for i inrange(m)))
    return ['3 Greens'ifset(i[:-1]) == {'GREEN'} else'no pattern'for i in x]

df['new'] = pd.Series(get_list(df['candletype'],4), index=df.index[4 - 1:])

   open  close candletype         new
0542543      GREEN         NaN
1543544      GREEN         NaN
2544545      GREEN         NaN
3545546      GREEN    3 Greens
4546547      GREEN    3 Greens
5547542        RED    3 Greens
6542543      GREEN  no pattern

Solution 3:

This one-liner can count the number of consecutive occurences in your serie. However it is kind of tricky and therefore not so easy to read for other users or future you! It is very well explained in this post.

df = pd.read_clipboard()
df['pattern'] = df.groupby((df.candletype != df.candletype.shift()).cumsum()).cumcount()
df
>>>    open  close candletype  pattern
0   542    543      GREEN        0
1   543    544      GREEN        1
2   544    545      GREEN        2
3   545    546      GREEN        3
4   546    547      GREEN        4
5   547    542        RED        0
6   542    543      GREEN        0

This is not exactly the output you provided but here you have the exact count of consecutive values. You can then apply any cosmetic details to this series (replace values below your threshold by Toofewrows, etc.)

Post a Comment for "Pandas Number Of Consecutive Occurrences In Previous Rows"