Is There a Numpy Function to Return the First Index of Something in an Array

Is there a NumPy function to return the first index of something in an array?

Yes. Given an array, array, and a value to search for, item, you can use np.where:

itemindex = numpy.where(array == item)

The result is a tuple of index arrays, one per dimension. For a 2D array, the first holds the row indices of all matches and the second holds the column indices.

For example, if the array is two-dimensional and contains your item at two locations, then

array[itemindex[0][0]][itemindex[1][0]]

would be equal to your item and so would be:

array[itemindex[0][1]][itemindex[1][1]]
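As a minimal sketch of the above, here is the same idea on a small made-up array. The sample values and the names grid, rows, and cols are assumptions for illustration, not part of the original answer.

```python
import numpy as np

# Hypothetical sample data: the value 7 appears at (0, 2) and (2, 1).
grid = np.array([[1, 2, 7],
                 [4, 5, 6],
                 [3, 7, 9]])
item = 7

rows, cols = np.where(grid == item)  # one index array per dimension

# Matches are returned in row-major (C) order, so position 0 is the first match.
first_index = (int(rows[0]), int(cols[0]))
second_index = (int(rows[1]), int(cols[1]))

print(first_index)   # (0, 2)
print(second_index)  # (2, 1)
```

If the item does not occur at all, rows and cols are empty and rows[0] raises an IndexError, so it is worth checking rows.size before indexing.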

Index of element in NumPy array

Use np.where to get the indices where a given condition is True.

Examples:

For a 2D np.ndarray called a:

i, j = np.where(a == value) # when comparing arrays of integers

i, j = np.where(np.isclose(a, value)) # when comparing floating-point arrays

For a 1D array:

i, = np.where(a == value) # integers

i, = np.where(np.isclose(a, value)) # floating-point

Note that this also works for conditions like >=, <=, != and so forth...
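The difference between the exact and the tolerance-based comparison can be seen in a short sketch. The sample values below are made up for illustration, and np.isclose is used with its default tolerances.

```python
import numpy as np

# Hypothetical sample data: 0.1 + 0.2 is not exactly 0.3 in floating point.
a = np.array([0.1 + 0.2, 0.5, 0.3, 1.5])

exact, = np.where(a == 0.3)            # exact comparison misses index 0
close, = np.where(np.isclose(a, 0.3))  # tolerance-based comparison finds both
large, = np.where(a >= 0.5)            # any boolean condition works

print(exact.tolist())  # [2]
print(close.tolist())  # [0, 2]
print(large.tolist())  # [1, 3]
```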

You can also create a subclass of np.ndarray with an index() method:

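import numpy as np

class myarray(np.ndarray):
    def __new__(cls, *args, **kwargs):
        # Build a regular array, then view it as this subclass.
        return np.array(*args, **kwargs).view(cls)
    def index(self, value):
        # Returns a tuple of index arrays, one per dimension.
        return np.where(self == value)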

Testing:

a = myarray([1,2,3,4,4,4,5,6,4,4,4])
a.index(4)
#(array([ 3, 4, 5, 8, 9, 10]),)

numpy - return first index of element in array

You can use np.argwhere to get the matching indices packed as a 2D array, with each row holding the indices of one match, and then take the first row, like so -

np.argwhere(zArray==match)[0]

Alternatively, a faster approach uses argmax to get the index of the first match in the flattened array, and np.unravel_index to convert it into a tuple of per-dimension indices -

np.unravel_index((zArray==match).argmax(), zArray.shape)

Sample run -

In [100]: zArray
Out[100]:
array([[   0, 1200, 5000],    # different from sample for a generic one
       [1320,   24, 5000],
       [5000,  234, 5230]])

In [101]: match
Out[101]: 5000

In [102]: np.argwhere(zArray==match)[0]
Out[102]: array([0, 2])

In [103]: np.unravel_index((zArray==match).argmax(), zArray.shape)
Out[103]: (0, 2)

Runtime test -

In [104]: a = np.random.randint(0,100,(1000,1000))

In [105]: %timeit np.argwhere(a==50)[0]
100 loops, best of 3: 2.41 ms per loop

In [106]: %timeit np.unravel_index((a==50).argmax(), a.shape)
1000 loops, best of 3: 493 µs per loop
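One caveat with the argmax approach: on an all-False mask argmax returns 0, which is indistinguishable from a match at the first position. A minimal sketch of a guard follows. The helper name first_index is hypothetical, and it assumes a non-empty array.

```python
import numpy as np

def first_index(arr, value):
    """Return the index tuple of the first match in row-major order, or None.

    Hypothetical helper for illustration; assumes arr is a non-empty ndarray.
    """
    mask = (arr == value)
    flat_pos = int(mask.argmax())  # 0 for "match at position 0" AND for "no match"
    if not mask.flat[flat_pos]:    # so confirm the position really is a match
        return None
    return tuple(int(i) for i in np.unravel_index(flat_pos, arr.shape))

zArray = np.array([[0, 1200, 5000],
                   [1320, 24, 5000],
                   [5000, 234, 5230]])

print(first_index(zArray, 5000))  # (0, 2)
print(first_index(zArray, 0))     # (0, 0)
print(first_index(zArray, 42))    # None
```

By contrast, np.argwhere(zArray == 42)[0] raises an IndexError when there is no match, which may be preferable if a missing value should be treated as an error.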

Numpy: find first index of value fast

There is a feature request for this scheduled for Numpy 2.0.0: https://github.com/numpy/numpy/issues/2269

Given the X numpy array, return True if any of its elements is zero

X.any() is incorrect: it fails on X = np.array([0]), for instance, returning False even though the array contains a zero.

A correct answer would be: ~X.all(). By De Morgan's laws, "ANY element is 0" is equivalent to "NOT (ALL elements are nonzero)".

How does it work?

NumPy does an implicit conversion to boolean:

X = np.array([-1, 2, 0, -4, 5, 6, 0, 0, -9, 10])
# array([-1, 2, 0, -4, 5, 6, 0, 0, -9, 10])

# convert to boolean
# 0 is False, all other numbers are True
X.astype(bool)
# array([ True, True, False, True, True, True, False, False, True, True])

# are all values truthy (not 0 in this case)?
X.astype(bool).all()
# False

# get the boolean NOT
~X.astype(bool).all()
# True
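An equivalent and arguably more explicit way to write the check is to compare against zero directly. This is a sketch, not the original answer's method, and the helper name has_zero is hypothetical.

```python
import numpy as np

def has_zero(x):
    """Hypothetical helper: True if any element of x equals zero."""
    return bool((np.asarray(x) == 0).any())

X = np.array([-1, 2, 0, -4, 5, 6, 0, 0, -9, 10])

print(has_zero(X))                 # True
print(has_zero(np.array([0])))     # True
print(has_zero(np.array([1, 2])))  # False

# Agrees with the De Morgan form. `not` is used here because ~ acts as a
# logical NOT on NumPy booleans but as a bitwise NOT on Python's built-in bool.
print(has_zero(X) == (not X.all()))  # True
```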

Numpy first occurrence of value greater than existing value

This is a little faster (and looks nicer)

np.argmax(aa>5)

This works because argmax stops at the first True ("In case of multiple occurrences of the maximum values, the indices corresponding to the first occurrence are returned.") and does not build an intermediate array of indices the way np.where and np.nonzero do.

In [2]: N = 10000

In [3]: aa = np.arange(-N,N)

In [4]: timeit np.argmax(aa>N/2)
100000 loops, best of 3: 52.3 us per loop

In [5]: timeit np.where(aa>N/2)[0][0]
10000 loops, best of 3: 141 us per loop

In [6]: timeit np.nonzero(aa>N/2)[0][0]
10000 loops, best of 3: 142 us per loop
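The array in this benchmark happens to be sorted. For sorted input only, np.searchsorted is a different technique from the argmax approach above that finds the same index by binary search. The sketch below uses a smaller N than the benchmark, an assumption made to keep the output readable, and it also shows how the two behave when no element exceeds the threshold.

```python
import numpy as np

N = 10
aa = np.arange(-N, N)  # sorted ascending: -10, -9, ..., 9
threshold = 5

# side="right" gives the first position whose value is strictly greater.
idx_sorted = int(np.searchsorted(aa, threshold, side="right"))
idx_argmax = int(np.argmax(aa > threshold))

print(idx_sorted, idx_argmax)  # 16 16

# When nothing exceeds the threshold, searchsorted returns len(aa),
# whereas argmax on an all-False mask returns 0.
none_sorted = int(np.searchsorted(aa, 100, side="right"))
none_argmax = int(np.argmax(aa > 100))

print(none_sorted)  # 20
print(none_argmax)  # 0
```

Because argmax returns 0 in the no-match case, its result is only meaningful after checking that the condition holds at the returned index.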

Is there an efficient way to pass all as a numpy index?

Recall that a[:] returns the contents of a (even if a is multidimensional). We cannot store the : in the mask variable, but we can use a slice object equivalently:

import numpy

def func(use_mask):
    a = numpy.arange(10)
    if use_mask:
        mask = a % 2 == 0   # boolean mask selecting the even values
    else:
        mask = slice(None)  # equivalent to a[:]
    return a[mask]

This does not use any memory to create the index array. I'm not sure what the CPU usage of the a[slice(None)] operation is, though.
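One difference between the two branches is that indexing with a slice object is basic indexing and returns a view, while indexing with a boolean mask returns a new array. A minimal sketch using np.shares_memory illustrates this. The variable names everything and evens are made up for the example.

```python
import numpy as np

a = np.arange(10)

everything = a[slice(None)]  # same as a[:], basic indexing
evens = a[a % 2 == 0]        # boolean mask, advanced indexing

print(np.shares_memory(a, everything))  # True: a view, no data copied
print(np.shares_memory(a, evens))       # False: a newly allocated array

# Because the slice result is a view, writing to it changes the original.
everything[0] = 99
print(a[0])  # 99
```

So the slice branch avoids copying the data as well as avoiding the index array, at the cost that the caller can modify the original through the returned array.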


