Keep the previous maximum value after the streak ends

This is my dataframe:

import pandas as pd 

df = pd.DataFrame(
    {
        'a': [110, 115, 112, 180, 150, 175, 160, 145, 200, 205, 208, 203, 206, 207, 208, 209, 210, 215],
        'b': [1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1],
    }
)

And this is the output that I want. I want to create column c.

      a  b    c
0   110  1  110
1   115  1  115
2   112  0  115
3   180  1  180
4   150  0  180
5   175  1  180
6   160  0  180
7   145  0  180
8   200  1  200
9   205  1  205
10  208  1  208
11  203  0  208
12  206  1  208
13  207  1  208
14  208  1  208
15  209  1  209
16  210  1  210
17  215  1  215

b is 1 when df.a > df.a.shift(1); otherwise it is 0.

Steps needed:

a) Find where the streak of 1 in b ends.

b) Keep the maximum value of the streak.

c) Put that value in c until a greater value is found in a.

For example, when 180 is found in a:

a) Row 3 is a streak of 1s on its own.

b) Maximum value of the streak is 180.

c) df.c = 180 until a greater value is found in a. In this case it is 200 at row 8.

It was not easy to describe the problem, and I may have used the wrong words, so if there are any questions, feel free to ask in the comments.

I would also really appreciate it if you could suggest a built-in or otherwise clean way to create column b. I entered those 1s and 0s manually.

This is what I have tried, but it does not feel like the correct approach.

df['streak'] = df['b'].ne(df['b'].shift()).cumsum()
df['max'] = df.groupby('streak')['a'].max()

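You just want cummax. In your expected output, c only changes when a exceeds every earlier value, which is exactly the running maximum of a: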

df['c'] = df['a'].cummax()

Output:

      a  b    c
0   110  1  110
1   115  1  115
2   112  0  115
3   180  1  180
4   150  0  180
5   175  1  180
6   160  0  180
7   145  0  180
8   200  1  200
9   205  1  205
10  208  1  208
11  203  0  208
12  206  1  208
13  207  1  208
14  208  1  208
15  209  1  209
16  210  1  210
17  215  1  215
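
As for creating b without typing it by hand, here is a minimal sketch. It assumes b is meant to flag rows where a is greater than the previous row, as the question describes. One difference from the hand-typed data: the first row has no previous value, so the comparison against NaN is False and row 0 gets 0 rather than 1.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'a': [110, 115, 112, 180, 150, 175, 160, 145, 200, 205, 208, 203, 206, 207, 208, 209, 210, 215],
    }
)

# 1 where a is greater than the previous row, 0 otherwise.
# Row 0 is compared against NaN, which evaluates to False, so it becomes 0.
df['b'] = df['a'].gt(df['a'].shift()).astype(int)

# Running maximum of a; this does not depend on b at all.
df['c'] = df['a'].cummax()

print(df)
```

If row 0 should be 1 as in the question, setting it explicitly afterwards (for example with df.loc[0, 'b'] = 1, assuming the default integer index) is one option.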
