
# Python Pandas- Find the First Instance of a Value Exceeding a Threshold

## Python Pandas- Find the first instance of a value exceeding a threshold

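Use `idxmax` on the boolean mask, grouped by `Trace`. `idxmax` returns the index label of the first `True` in each group, and `df.loc` then selects those rows:

```
df.loc[(df.Value>3).groupby(df.Trace).idxmax()]
Out[602]: 
   Date  Trace  Value
2     3      1    3.1
5     2      2    3.6
```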
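The answer does not show the `df` it operates on. The following is a minimal sketch that assumes a hypothetical frame with the same three column names (the values are invented for illustration). It also guards against a pitfall: for a group in which nothing exceeds the threshold, the mask is all `False`, and `idxmax` still returns that group's first row.

```python
import pandas as pd

# Hypothetical data; only the column names come from the answer above.
df = pd.DataFrame({
    "Date":  [1, 2, 3, 1, 2, 1, 2],
    "Trace": [1, 1, 1, 2, 2, 3, 3],
    "Value": [1.0, 2.5, 3.1, 2.9, 3.6, 0.5, 1.2],
})

exceeds = df["Value"] > 3                    # boolean mask, one entry per row
by_trace = exceeds.groupby(df["Trace"])

first_idx = by_trace.idxmax()                # label of first True per Trace
has_any = by_trace.any()                     # False for Trace 3: nothing exceeds 3

# Keep only the groups that really contain an exceedance.
first_rows = df.loc[first_idx[has_any]]
print(first_rows)
```

Without the `has_any` filter, Trace 3 would contribute its first row even though none of its values exceed 3.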

## pandas - find first occurrence

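`idxmax` returns the index label of the maximal value and `argmax` returns its integer position; if the maximal value occurs more than once, both return the first occurrence. Since `True` is the maximum of a boolean mask, applying either one to a mask finds the first `True`.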

Use `idxmax` on `df.A.ne('a')`:

```
df.A.ne('a').idxmax()
3
```

Or the `numpy` equivalent:

```
(df.A.values != 'a').argmax()
3
```

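However, if `A` is already sorted, `searchsorted` finds the same position with a binary search:

```
df.A.searchsorted('a', side='right')
array([3])
```

Newer pandas versions return the plain integer `3` for a scalar argument instead of a one-element array.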

Or the `numpy` equivalent:

```
df.A.values.searchsorted('a', side='right')
3
```
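The snippets above assume a `df` that is not shown. As a rough, self-contained sketch, the frame below is a hypothetical stand-in (a sorted column `A` with three leading `'a'` values) that reproduces the result `3`. It uses `to_numpy()` rather than `.values` to get a plain NumPy array, and adds a second, shifted index to show the difference between a label and a position.

```python
import pandas as pd

# Hypothetical sorted column; the original df is not shown in the answer.
df = pd.DataFrame({"A": ["a", "a", "a", "b", "b"]})

label = df.A.ne("a").idxmax()                 # index label of the first non-'a'
pos = (df.A.to_numpy() != "a").argmax()       # integer position of the first non-'a'
pos_sorted = df.A.to_numpy().searchsorted("a", side="right")  # needs sorted A

# With a non-default index, the label and the position differ.
shifted = df.set_axis([10, 11, 12, 13, 14], axis=0)
label_shifted = shifted.A.ne("a").idxmax()                   # a label
pos_shifted = (shifted.A.to_numpy() != "a").argmax()         # still a position

print(label, pos, pos_sorted, label_shifted, pos_shifted)
```

With the default `RangeIndex` the label and the position coincide, which is why all three approaches print the same number for `df`.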

## Pandas index value relating to a threshold exceedance

It seems like `idxmax` could fit the bill:

```
In [44]: x = pd.Series([1,2,3,4,5], index=pd.date_range('2000-1-1', periods=5, freq='M'))

In [45]: x
Out[45]: 
2000-01-31    1
2000-02-29    2
2000-03-31    3
2000-04-30    4
2000-05-31    5
Freq: M, dtype: int64

In [46]: x >= 3
Out[46]: 
2000-01-31    False
2000-02-29    False
2000-03-31     True
2000-04-30     True
2000-05-31     True
Freq: M, dtype: bool

In [47]: (x >= 3).idxmax()
Out[47]: Timestamp('2000-03-31 00:00:00', tz=None)
```
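One caveat with this approach: if no value reaches the threshold, the mask is all `False` and `idxmax` returns the first index label rather than signalling "not found". A minimal sketch of a guard follows; the helper name `first_exceedance` is hypothetical, and the index is written out explicitly instead of using `date_range` so that the example does not depend on a frequency alias.

```python
import pandas as pd

# Same values and month-end dates as the session above, built explicitly.
x = pd.Series(
    [1, 2, 3, 4, 5],
    index=pd.to_datetime(
        ["2000-01-31", "2000-02-29", "2000-03-31", "2000-04-30", "2000-05-31"]
    ),
)

def first_exceedance(s, threshold):
    """Hypothetical helper: label of the first value >= threshold, else None."""
    mask = s >= threshold
    return mask.idxmax() if mask.any() else None

print(first_exceedance(x, 3))    # label of the first value >= 3
print(first_exceedance(x, 10))   # None: the threshold is never reached
print((x >= 10).idxmax())        # unguarded call silently returns the first label
```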

## find the first occurrence of a specific value in different groups

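Filter to the rows where `acc` equals the target value, then use `drop_duplicates` to keep the first such row per `id`:

```
idx = df[df.acc.eq(0.9)].drop_duplicates('id').index
idx
Out[64]: Int64Index([1, 4], dtype='int64')
```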
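The input frame for this answer is not shown either. As an illustration only, the sketch below assumes a hypothetical `df` with columns `id` and `acc`, chosen so that the first `0.9` per `id` falls on rows 1 and 4.

```python
import pandas as pd

# Hypothetical data; only the column names `id` and `acc` come from the answer.
df = pd.DataFrame({
    "id":  [1, 1, 1, 2, 2, 2],
    "acc": [0.5, 0.9, 0.9, 0.7, 0.9, 0.9],
})

matches = df[df.acc.eq(0.9)]                   # rows 1, 2, 4, 5
first_per_id = matches.drop_duplicates("id")   # default keep='first'
idx = first_per_id.index

print(list(idx))
```

Because the filter runs before `drop_duplicates`, an `id` with no matching row simply does not appear in the result.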

## Find first occurrence of value in dataframe based on another dataframe with a shared column

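```python
import pandas as pd

df1 = pd.DataFrame({
    "Trace": [1,1,1,1,1,2,2,2,2,2],
    "Sample": [1,2,3,4,5,1,2,3,4,5],
    "Signal": [2,3,5,6,1,8,9,5,4,3],
})
df2 = pd.DataFrame({
    "Trace": [1,2],
    "Sample": [4,2],
    "Signal": [2,4]
})

# Attach each trace's reference signal from df2 (columns become Signal_x, Signal_y)
df3 = df1.merge(
    df2[['Trace', 'Signal']],
    on='Trace'
)

# Keep rows where the signal is more than twice the reference
mask = (df3.Signal_x > 2 * df3.Signal_y)
df3 = df3.loc[mask]

# Keep only the first remaining row per trace
mask = ~df3.duplicated('Trace')
df3 = df3.loc[mask]
```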

where the resulting `df3` should look as follows:

```
   Trace  Sample    Signal_x    Signal_y
2      1       3           5           2
6      2       2           9           4
```