
Suppose I have a Pandas data frame as follows:

Test  Parameter  Value
X1    0          0.033285423511615113
X1    1          0.78790279861666179
X1    2          0.79136989638378297
X1    3          0.80063190842016707
X1    4          0.7884653622402551
X1    5          0.78561849214309198
...
X1    22         0.82241991278171311
...
X2    ...

I'd like to get the row with Parameter value 3, i.e. the row with the last increasing value before the first drop. Notice that higher values may appear later (e.g. row 22). Essentially, I'm trying to get the last value before the first decrease.

Also note that there are multiple Tests, so I probably need to do something like:

myDF.groupby("Test").Something
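For reference, the truncated frame above can be sketched like this (values shortened; the X2 rows and later data are omitted, so this is only an illustrative reconstruction):

```python
import pandas as pd

# Minimal reconstruction of the frame described above
# (values copied from the listing; full data is truncated).
df = pd.DataFrame({
    "Test": ["X1"] * 7,
    "Parameter": [0, 1, 2, 3, 4, 5, 22],
    "Value": [0.033285, 0.787903, 0.791370, 0.800632,
              0.788465, 0.785618, 0.822420],
})
print(df)
```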
  • 1
    Do you want just the last peak or all such peaks? Commented Oct 22, 2017 at 5:24
  • I misunderstood the question I think Commented Oct 22, 2017 at 5:30
  • Do you want the first local maxima? Commented Oct 22, 2017 at 5:33

5 Answers


Coldspeed nearly has it; to keep only the first run of increases you can use cumprod (or similar), e.g.

In [11]: df[((df.Value.diff().fillna(1) > 0).cumprod()) == 1].tail(1)
Out[11]:
  Test  Parameter     Value
3   X1          3  0.800632

The trick being:

In [12]: (df.Value.diff().fillna(1) > 0)
Out[12]:
0     True
1     True
2     True
3     True
4    False
5    False
6     True
Name: Value, dtype: bool

In [13]: (df.Value.diff().fillna(1) > 0).cumprod()
Out[13]:
0    1
1    1
2    1
3    1
4    0
5    0
6    0
Name: Value, dtype: int64

Note: My df is this:

In [21]: df
Out[21]:
  Test  Parameter     Value
0   X1          0  0.033285
1   X1          1  0.787903
2   X1          2  0.791370
3   X1          3  0.800632
4   X1          4  0.788465
5   X1          5  0.785618
6   X1         22  0.822420
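Since the question mentions multiple Tests, the same cumprod trick can be computed per group. This is a hedged sketch on invented two-Test data (the values here are my own, not from the question):

```python
import pandas as pd

# Hypothetical two-Test frame (values invented for illustration).
df = pd.DataFrame({
    "Test": ["X1"] * 4 + ["X2"] * 4,
    "Parameter": [0, 1, 2, 3, 0, 1, 2, 3],
    "Value": [0.1, 0.5, 0.8, 0.6, 0.2, 0.7, 0.4, 0.9],
})

# diff() restarts with NaN at each group's first row, so fillna(1)
# marks every group's first row as "increasing".
rising = df.groupby("Test").Value.diff().fillna(1).gt(0).astype(int)

# cumprod within each group keeps only the leading run of increases.
keep = rising.groupby(df["Test"]).cumprod().eq(1)

# last row of that leading run, per Test
result = df[keep].groupby("Test").tail(1)
print(result)
```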

4 Comments

There's definitely a nicer way to do this, but I don't recall it.
Cumprod is a very nice trick that has eluded me for ever, gotta remember that. +1'd
@cᴏʟᴅsᴘᴇᴇᴅ you can do a similar trick with cumsum if you can make the changes (rows that "change") True... then you can groupby the result for example, I think that's what I was thinking of/trying to recall.
@AndyHayden: I think this is it -- one question though: if there are multiple tests, how do I groupby Test? Would it be: df[((df.groupby("Test").Value.diff().fillna(1) > 0).cumprod()) == 1].tail(1)

Use np.diff: it naturally shortens the array by one, and np.flatnonzero then identifies the ordinal positions just before each drop.

df.iloc[[np.flatnonzero(np.diff(df.Value) < 0)[0]]]

  Test  Parameter     Value
3   X1          3  0.800632

Note:
We can speed this up by accessing the underlying numpy array

df.iloc[[np.flatnonzero(np.diff(df.Value.values) < 0)[0]]]

Explanation

Get differences

np.diff(df.Value)

array([ 0.754618,  0.003467,  0.009262, -0.012167, -0.002847,  0.036802])

Find where differences are negative

np.flatnonzero(np.diff(df.Value) < 0)

array([3, 4])

I want the first one

np.flatnonzero(np.diff(df.Value) < 0)[0]

3

Use double brackets in an iloc

df.iloc[[3]]

  Test  Parameter     Value
3   X1          3  0.800632

The Group By Looks Like

f = lambda d: d.iloc[[np.flatnonzero(np.diff(d.Value.values) < 0)[0]]]
df.groupby('Test').apply(f)

       Test  Parameter     Value
Test                            
X1   3   X1          3  0.800632
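One caveat (my own note, not from the answer): if a group never decreases, np.flatnonzero returns an empty array and the [0] lookup raises IndexError. A hedged sketch that falls back to the group's last row in that case:

```python
import numpy as np
import pandas as pd

# Hypothetical data: X2 never decreases, so it has no drop at all.
df = pd.DataFrame({
    "Test": ["X1", "X1", "X2", "X2"],
    "Parameter": [0, 1, 0, 1],
    "Value": [0.5, 0.3, 0.2, 0.9],
})

def last_before_drop(d):
    drops = np.flatnonzero(np.diff(d.Value.values) < 0)
    # fall back to the final row when there is no drop
    pos = drops[0] if len(drops) else len(d) - 1
    return d.iloc[[pos]]

out = df.groupby('Test', group_keys=False).apply(last_before_drop)
print(out)
```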

2 Comments

Awesome, super sir. You could also add the groupby method; OP wants that for multiple test cases.
I mean df.groupby('Test', as_index=False).apply(lambda x: x.iloc[np.flatnonzero(np.diff(x.Value) < 0)[0]]). You might have better.

Use diff + tail:

df    
  Test  Parameter     Value
0   X1          0  0.033285
1   X1          1  0.787903
2   X1          2  0.791370
3   X1          3  0.800632
4   X1          4  0.788465
5   X1          5  0.785618

df[df.Value.diff().gt(0)].tail(1)    
  Test  Parameter     Value
3   X1          3  0.800632

This retrieves the last row whose value increased from the previous one (the last local maximum). If you want the first local maximum, refer to Andy Hayden's solution involving cumprod.


If you're doing this in a groupby operation, it'd be something like (borrowing from Andy):

df.groupby('Test', group_keys=False)\
      .apply(lambda x: x[((x.Value.diff().fillna(1) > 0).cumprod()) == 1].tail(1))

5 Comments

No no no not the one. I used the same thing with shift but it has to be every increasing value
@Bharathshetty yeah, I think tail should handle that. Anyway, I'll wait for OP.
There had to be two rows 3 and 22
@Bharathshetty Essentially, I'm trying to get the "last" number before the "first" decrease value. So just one.
@Bharathshetty and cᴏʟᴅsᴘᴇᴇᴅ I think OP wants the first local maxima

We can also use argrelextrema from scipy.signal (from finding local maxima):

import numpy as np
from scipy.signal import argrelextrema

maxInd = argrelextrema(df['Value'].values, np.greater)
df.iloc[maxInd[0][:1]]

  Test  Parameter     Value
3   X1          3  0.800632

A groupby solution, if you have a dataframe like:

  Test  Parameter     Value
0   X1          0  0.033285
1   X1          1  0.787903
2   X1          2  0.791370
3   X1          3  0.800632
4   X1          4  0.788465
5   X2          5  0.785618
6   X2         22  0.822420
7   X2          5  0.785618

def get_maxima(x):
    return x.iloc[argrelextrema(x['Value'].values, np.greater)[0][:1]]

df.groupby('Test').apply(get_maxima)

Output :

    Test  Parameter     Value
0 3   X1          3  0.800632
1 6   X2         22  0.822420

3 Comments

I think this needs to be df.iloc[maxInd[0][:1]], but very neat!
Also added a groupby approach for multiple maxima. Thank you, sir.

I think max can do it ...

df.sort_values('Value', ascending=False).drop_duplicates(['Test'])
Out[226]: 
  Test  Parameter     Value
3   X1          3  0.800632

Or

df[df['Value'] == df.groupby(['Test'])['Value'].transform(max)]
Out[227]: 
  Test  Parameter     Value
3   X1          3  0.800632

Seems this is what you need... anyway, using an ugly way to correct my old post:

import numpy as np

df1 = df.iloc[np.flatnonzero(df.Value.diff().fillna(1) > 0)].reset_index()
df1.groupby(df1['index'].diff().ne(1).cumsum()).last().iloc[0]
Out[289]: 
index               3
Test               X1
Parameter           3
Value        0.800632
Name: 1, dtype: object

For groupby

l = []
for _, dfs in df.groupby('Test'):
    df1 = dfs.iloc[np.flatnonzero(dfs.Value.diff().fillna(1) > 0)].reset_index()
    l.append(df1.groupby(df1['index'].diff().ne(1).cumsum()).last().iloc[0].to_frame().T)


pd.concat(l,axis=0)
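The loop above can also be folded into a single groupby-apply, reusing the cumprod idea from the accepted answer. A sketch on invented data (the helper name first_peak is my own):

```python
import pandas as pd

# Hypothetical data (values invented for illustration).
df = pd.DataFrame({
    "Test": ["X1", "X1", "X1", "X2", "X2", "X2"],
    "Parameter": [0, 1, 2, 0, 1, 2],
    "Value": [0.1, 0.5, 0.3, 0.2, 0.8, 0.9],
})

def first_peak(g):
    # keep the leading run of increases, then take its last row
    rising = g.Value.diff().fillna(1).gt(0).astype(int)
    return g[rising.cumprod().eq(1)].tail(1)

out = df.groupby("Test", group_keys=False).apply(first_peak)
print(out)
```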

6 Comments

Nice one. +1 Max should find the global maxima.
I agree. I meant to say that max is a very sensible solution!
I think 22 has the same parameter test, so don't see how this can work :/ Maybe I misunderstand OPs question !
df.iloc[df.groupby('Test')['Value'].idxmax()]?
@AndyHayden I think you are right , But All I know is about cumprod and scipy(You and Bh already posted), so I made a very ugly way to achieve this ...