Efficiently Return the Index of the First Value Satisfying Condition in Array

Efficiently return the index of the first value satisfying condition in array

numba

With numba it's possible to optimise both scenarios. Syntactically, you need only construct a function with a simple for loop:

from numba import njit

@njit
def get_first_index_nb(A, k):
    for i in range(len(A)):
        if A[i] > k:
            return i
    return -1

idx = get_first_index_nb(A, 0.9)

Numba improves performance by JIT ("Just In Time") compiling code and leveraging CPU-level optimisations. A regular for loop without the @njit decorator would typically be slower than the methods you've already tried for the case where the condition is met late.

For a Pandas numeric series df['data'], you can simply feed the NumPy representation to the JIT-compiled function:

idx = get_first_index_nb(df['data'].values, 0.9)

Generalisation

Since numba permits functions as arguments, and assuming the passed function can also be JIT-compiled, you can arrive at a method to calculate the index of the nth value where a condition is met for an arbitrary func.

@njit
def get_nth_index_count(A, func, count):
    c = 0
    for i in range(len(A)):
        if func(A[i]):
            c += 1
            if c == count:
                return i
    return -1

@njit
def func(val):
    return val > 0.9

# get index of 3rd value where func evaluates to True
idx = get_nth_index_count(arr, func, 3)

For the 3rd last value, you can pass the reversed array, arr[::-1], and subtract the result from len(arr) - 1; the - 1 accounts for 0-indexing. Check for a -1 result (no match) before subtracting.
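The index arithmetic for counting from the end can be checked with a small sketch in plain Python. It deliberately leaves out numba so that it runs anywhere, and the helper names nth_index and nth_last_index are hypothetical, introduced here only for illustration.

```python
def nth_index(values, pred, count):
    """Return the index of the count-th element satisfying pred, or -1."""
    seen = 0
    for i, v in enumerate(values):
        if pred(v):
            seen += 1
            if seen == count:
                return i
    return -1


def nth_last_index(values, pred, count):
    """Same as nth_index, but counting matches from the end."""
    rev = nth_index(values[::-1], pred, count)
    if rev == -1:
        # No match: do not apply the index arithmetic to the sentinel.
        return -1
    return len(values) - 1 - rev


data = [0.95, 0.1, 0.99, 0.2, 0.91, 0.97, 0.3]
above = lambda v: v > 0.9  # matches sit at indices 0, 2, 4 and 5

third = nth_index(data, above, 3)            # 4
third_last = nth_last_index(data, above, 3)  # 2
```

The same subtraction should carry over to the JIT-compiled version, as long as the -1 sentinel is handled before converting the index.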

Performance benchmarking

# Python 3.6.5, NumPy 1.14.3, Numba 0.38.0

import numpy as np
from numba import njit

np.random.seed(0)
arr = np.random.rand(10**7)
m = 0.9
n = 0.999999

@njit
def get_first_index_nb(A, k):
    for i in range(len(A)):
        if A[i] > k:
            return i
    return -1

# Same loop without @njit, run by the Python interpreter
def get_first_index_np(A, k):
    for i in range(len(A)):
        if A[i] > k:
            return i
    return -1

%timeit get_first_index_nb(arr, m) # 375 ns
%timeit get_first_index_np(arr, m) # 2.71 µs
%timeit next(iter(np.where(arr > m)[0]), -1) # 43.5 ms
%timeit next((idx for idx, val in enumerate(arr) if val > m), -1) # 2.5 µs

%timeit get_first_index_nb(arr, n) # 204 µs
%timeit get_first_index_np(arr, n) # 44.8 ms
%timeit next(iter(np.where(arr > n)[0]), -1) # 21.4 ms
%timeit next((idx for idx, val in enumerate(arr) if val > n), -1) # 39.2 ms

Python: return the index of the first element of a list which makes a passed function true

You could do that in a one-liner using generators:

next(i for i,v in enumerate(l) if is_odd(v))

The nice thing about generators is that they only compute up to the requested amount. So requesting the first two indices is (almost) just as easy:

y = (i for i,v in enumerate(l) if is_odd(v))
x1 = next(y)
x2 = next(y)

Expect a StopIteration exception once no further index is found (that is how generators work). In your "take-first" approach this is also a convenient signal that no such value exists; list.index() would raise ValueError in the same situation.
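If an exception is not wanted, next() accepts a default as its second argument, and itertools.islice can take the first few indices without raising. A minimal sketch, assuming a sample list l and a simple is_odd helper defined here for illustration:

```python
from itertools import islice


def is_odd(n):
    return n % 2 == 1


l = [2, 4, 7, 8, 9, 10]

# Default of -1 is returned instead of raising StopIteration.
first = next((i for i, v in enumerate(l) if is_odd(v)), -1)
missing = next((i for i, v in enumerate([2, 4, 6]) if is_odd(v)), -1)

# islice stops after two matches, or earlier if there are fewer.
first_two = list(islice((i for i, v in enumerate(l) if is_odd(v)), 2))
```

Here first is 2, missing is -1, and first_two is [2, 4].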

Get the indexes of javascript array elements that satisfy condition

You can use Array#reduce method.

var data = [{prop1:"abc",prop2:"qwe"},{prop1:"abc",prop2:"yutu"},{prop1:"xyz",prop2:"qwrq"}];
console.log( data.reduce(function(arr, e, i) { if (e.prop1 == 'abc') arr.push(i); return arr; }, []))
// with ES6 arrow syntax
console.log( data.reduce((arr, e, i) => ((e.prop1 == 'abc') && arr.push(i), arr), []))

Is there a NumPy function to return the first index of something in an array?

Yes. Given an array, array, and a value to search for, item, you can use numpy.where:

itemindex = numpy.where(array == item)

The result is a tuple with one index array per dimension: for a two-dimensional array, first all the row indices, then all the column indices.

For example, if the array is two-dimensional and contains your item at two locations, then

array[itemindex[0][0]][itemindex[1][0]]

would be equal to your item and so would be:

array[itemindex[0][1]][itemindex[1][1]]
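As a worked example of reading that tuple, the sketch below uses a small made-up 2-D array (the values are purely illustrative) and pairs the row and column indices into coordinates. An empty result means the item is absent, so it is worth checking before indexing with [0].

```python
import numpy as np

# Illustrative data: the value 7 appears at (0, 1) and (1, 0).
array = np.array([[1, 7, 3],
                  [7, 5, 6]])
item = 7

rows, cols = np.where(array == item)

# Pair up row and column indices; np.where reports matches in row-major order.
matches = list(zip(rows.tolist(), cols.tolist()))

# Guard against "not found" before taking the first match.
first = matches[0] if matches else None
```

With this data, matches is [(0, 1), (1, 0)] and first is (0, 1). Note that np.where scans the whole array, so it does not stop early at the first hit.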

Pandas dataframe: Remove secondary upcoming same value

You can find the index of the first 1 and set others to 0:

mask = df['col2'].eq(1)
df.loc[mask & (df.index != mask.idxmax()), 'col2'] = 0
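A small worked example may make the masking clearer. The frame below is invented for illustration, and the mask.any() guard is an addition that simply makes the no-match case explicit. The comparison against df.index assumes the index labels are unique.

```python
import pandas as pd

# Illustrative data: the first 1 in col2 is at index label 1.
df = pd.DataFrame({"col1": list("abcde"), "col2": [0, 1, 0, 1, 1]})

mask = df["col2"].eq(1)

# idxmax() gives the label of the first True; skip the update if there is none.
if mask.any():
    df.loc[mask & (df.index != mask.idxmax()), "col2"] = 0

result = df["col2"].tolist()
```

After this runs, only the first 1 survives and result is [0, 1, 0, 0, 0].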

For better performance, see Efficiently return the index of the first value satisfying condition in array.


