Efficiently return the index of the first value satisfying condition in array
numba
With numba it's possible to optimise both scenarios. Syntactically, you need only construct a function with a simple for loop:
from numba import njit

@njit
def get_first_index_nb(A, k):
    for i in range(len(A)):
        if A[i] > k:
            return i
    return -1

idx = get_first_index_nb(A, 0.9)
Numba improves performance by JIT ("Just In Time") compiling code and leveraging CPU-level optimisations. A regular for loop without the @njit decorator would typically be slower than the methods you've already tried for the case where the condition is met late.
For a Pandas numeric series df['data'], you can simply feed the NumPy representation to the JIT-compiled function:
idx = get_first_index_nb(df['data'].values, 0.9)
Generalisation
Since numba permits functions as arguments, and assuming the passed function can also be JIT-compiled, you can arrive at a method to calculate the nth index where a condition is met for an arbitrary func.
@njit
def get_nth_index_count(A, func, count):
    c = 0
    for i in range(len(A)):
        if func(A[i]):
            c += 1
            if c == count:
                return i
    return -1

@njit
def func(val):
    return val > 0.9

# get index of 3rd value where func evaluates to True
idx = get_nth_index_count(arr, func, 3)
For the 3rd last value, you can feed the reverse, arr[::-1], and subtract the result from len(arr) - 1, the - 1 being necessary to account for 0-indexing. Check for the -1 "not found" result before subtracting.
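The index arithmetic for the reversed search can be sketched as follows. This is a minimal illustration, not the answer's own code: it uses a pure-Python stand-in for get_nth_index_count (no @njit, so it runs without numba installed), and get_nth_last_index is a hypothetical helper name introduced here.

```python
def get_nth_index_count(A, func, count):
    """Pure-Python stand-in for the JIT-compiled function (no @njit here)."""
    c = 0
    for i in range(len(A)):
        if func(A[i]):
            c += 1
            if c == count:
                return i
    return -1


def get_nth_last_index(A, func, count):
    """Hypothetical helper: index of the count-th match, counting from the end."""
    rev_idx = get_nth_index_count(A[::-1], func, count)
    if rev_idx == -1:
        return -1  # fewer than `count` matches; keep the sentinel as-is
    # Map the position in the reversed sequence back to the original index.
    return len(A) - 1 - rev_idx


values = [0.95, 0.1, 0.99, 0.2, 0.91, 0.97, 0.3]
third_last = get_nth_last_index(values, lambda v: v > 0.9, 3)
```

The sentinel check matters: subtracting a -1 result from len(A) - 1 would otherwise produce len(A), which is not a valid index.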
Performance benchmarking
# Python 3.6.5, NumPy 1.14.3, Numba 0.38.0
import numpy as np
from numba import njit

np.random.seed(0)
arr = np.random.rand(10**7)
m = 0.9
n = 0.999999

@njit
def get_first_index_nb(A, k):
    for i in range(len(A)):
        if A[i] > k:
            return i
    return -1

# same loop without @njit, i.e. plain interpreted Python
def get_first_index_np(A, k):
    for i in range(len(A)):
        if A[i] > k:
            return i
    return -1

%timeit get_first_index_nb(arr, m) # 375 ns
%timeit get_first_index_np(arr, m) # 2.71 µs
%timeit next(iter(np.where(arr > m)[0]), -1) # 43.5 ms
%timeit next((idx for idx, val in enumerate(arr) if val > m), -1) # 2.5 µs
%timeit get_first_index_nb(arr, n) # 204 µs
%timeit get_first_index_np(arr, n) # 44.8 ms
%timeit next(iter(np.where(arr > n)[0]), -1) # 21.4 ms
%timeit next((idx for idx, val in enumerate(arr) if val > n), -1) # 39.2 ms
Python: return the index of the first element of a list which makes a passed function true
You could do that in a one-liner using generators:
next(i for i,v in enumerate(l) if is_odd(v))
The nice thing about generators is that they only compute up to the requested amount. So requesting the first two indices is (almost) just as easy:
y = (i for i,v in enumerate(l) if is_odd(v))
x1 = next(y)
x2 = next(y)
Expect a StopIteration exception once the generator is exhausted, though (that is how generators work). This is also convenient in your "take-first" approach: it tells you that no such value was found, much as list.index() would raise ValueError here.
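If you would rather get a sentinel than an exception, next() accepts a default as its second argument. The sketch below assumes a simple is_odd predicate and wraps the one-liner in a hypothetical first_index helper; neither name comes from the original question.

```python
def is_odd(n):
    """Example predicate; any one-argument function returning a bool works."""
    return n % 2 == 1


def first_index(items, predicate, default=None):
    """Hypothetical helper: first index whose item satisfies predicate, else default."""
    return next((i for i, v in enumerate(items) if predicate(v)), default)


l = [2, 4, 7, 8, 9]
found = first_index(l, is_odd)            # index of the first odd value
missing = first_index([2, 4, 6], is_odd)  # no odd value, so the default is returned

# Without a default, exhaustion surfaces as StopIteration.
try:
    next(i for i, v in enumerate([2, 4, 6]) if is_odd(v))
    raised = False
except StopIteration:
    raised = True
```

Using None as the default rather than -1 avoids accidentally treating the sentinel as a valid negative index.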
Get the indexes of javascript array elements that satisfy condition
You can use the Array#reduce method.
var data = [{prop1:"abc",prop2:"qwe"},{prop1:"abc",prop2:"yutu"},{prop1:"xyz",prop2:"qwrq"}];
console.log( data.reduce(function(arr, e, i) { if (e.prop1 == 'abc') arr.push(i); return arr; }, []))
// with ES6 arrow syntax
console.log( data.reduce((arr, e, i) => ((e.prop1 == 'abc') && arr.push(i), arr), []))
Is there a NumPy function to return the first index of something in an array?
Yes, given an array, array, and a value, item, to search for, you can use np.where as:
itemindex = numpy.where(array == item)
The result is a tuple with first all the row indices, then all the column indices.
For example, if the array is two-dimensional and contains your item at two locations, then
array[itemindex[0][0]][itemindex[1][0]]
would be equal to your item and so would be:
array[itemindex[0][1]][itemindex[1][1]]
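A small worked case may make the tuple layout concrete. This is only a sketch with made-up data; the variable names rows, cols, matches and first_match are illustrative and not part of NumPy.

```python
import numpy as np

array = np.array([[1, 7, 3],
                  [7, 5, 6]])
item = 7

# np.where returns one index array per dimension: rows first, then columns.
rows, cols = np.where(array == item)

# Pair them up to get the (row, column) coordinates of every match.
matches = [(int(r), int(c)) for r, c in zip(rows, cols)]
first_match = matches[0] if matches else None
```

Note that np.where evaluates the comparison over the whole array before returning, so it does not stop early at the first match.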
Pandas dataframe: Remove secondary upcoming same value
You can find the index of the first 1 and set others to 0:
mask = df['col2'].eq(1)
df.loc[mask & (df.index != mask.idxmax()), 'col2'] = 0
For better performance, see Efficiently return the index of the first value satisfying condition in array.
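To see what the two lines do, here is a minimal sketch on a made-up frame; the column contents and the first_label and result names are assumptions for illustration only.

```python
import pandas as pd

# Made-up data: col2 holds 1 at positions 1, 3 and 4.
df = pd.DataFrame({"col1": list("abcde"), "col2": [0, 1, 0, 1, 1]})

mask = df["col2"].eq(1)       # True wherever col2 equals 1
first_label = mask.idxmax()   # index label of the first True in the mask

# Zero out every 1 except the one at the first matching label.
df.loc[mask & (df.index != first_label), "col2"] = 0

result = df["col2"].tolist()
```

If the column contains no 1 at all, mask is all False, so the combined condition selects nothing and the frame is left unchanged.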