Quickest Way to Find the Nth Largest Value in a Numpy Matrix

Quickest way to find the nth largest value in a numpy Matrix

You can flatten the matrix and then sort it:

>>> k = np.array([[ 35,  48,  63],
... [ 60, 77, 96],
... [ 91, 112, 135]])
>>> flat=k.flatten()
>>> flat.sort()
>>> flat
array([ 35, 48, 60, 63, 77, 91, 96, 112, 135])
>>> flat[-2]
112
>>> flat[-3]
96
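
If only a single rank is needed, a full sort does more work than necessary. The following is a minimal sketch using np.partition, which places just the requested element in its sorted position; the helper name nth_largest is made up for illustration and is not part of NumPy.

```python
import numpy as np

k = np.array([[35, 48, 63],
              [60, 77, 96],
              [91, 112, 135]])

def nth_largest(arr, n):
    """Return the n-th largest value of arr (n=1 is the maximum)."""
    flat = arr.ravel()
    # After partitioning with kth=-n, the element at index -n is in its
    # final sorted position and everything to its right is at least as large.
    return np.partition(flat, -n)[-n]

second = nth_largest(k, 2)
third = nth_largest(k, 3)
```

np.partition returns a copy, so the original matrix is left untouched.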

A fast way to find the largest N elements in a NumPy array

The bottleneck module has a fast partial sort method that works directly with Numpy arrays: bottleneck.partition().

Note that bottleneck.partition() returns the actual values, partially sorted. If you want the indexes of those values instead (analogous to what numpy.argsort() returns), use bottleneck.argpartition().

I've benchmarked:

  • z = -bottleneck.partition(-a, 10)[:10]
  • z = a.argsort()[-10:]
  • z = heapq.nlargest(10, a)

where a is a random 1,000,000-element array.

The timings were as follows:

  • bottleneck.partition(): 25.6 ms per loop
  • np.argsort(): 198 ms per loop
  • heapq.nlargest(): 358 ms per loop
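
As a rough way to check that the three approaches agree, here is a sketch that substitutes np.partition for bottleneck.partition() (an assumption made so the example runs without bottleneck installed) and uses a smaller array than the one benchmarked above. It compares results only and does not reproduce the timings.

```python
import heapq
import numpy as np

rng = np.random.default_rng(0)
a = rng.random(100_000)  # smaller than the 1,000,000-element array above

# Partial sort: negate so the 10 largest become the 10 smallest.
top_partition = -np.partition(-a, 10)[:10]
# Full sort of the indices, then take the values at the last 10.
top_argsort = a[a.argsort()[-10:]]
# Heap-based selection from the standard library.
top_heapq = np.array(heapq.nlargest(10, a))

# The three results come back in different orders, so sort before comparing.
same = (np.array_equal(np.sort(top_partition), np.sort(top_argsort))
        and np.array_equal(np.sort(top_argsort), np.sort(top_heapq)))
```

To time the variants yourself, each expression can be wrapped in timeit.timeit with a lambda.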

Numpy: Find index of second highest value in each row of an ndarray

The amazing numpy.argsort() function makes this task really simple. Once the sorted indices are found, get the second to last column.

m = np.array([[101,   0,   1,   0,   0,   0,   1,   1,   2,   0],
[ 0, 116, 1, 0, 0, 0, 0, 0, 1, 0],
[ 1, 4, 84, 2, 2, 0, 2, 4, 6, 1],
[ 0, 2, 0, 84, 0, 6, 0, 2, 3, 0],
[ 0, 0, 1, 0, 78, 0, 0, 2, 0, 11],
[ 2, 0, 0, 1, 1, 77, 5, 0, 2, 0],
[ 1, 2, 1, 0, 1, 2, 94, 0, 1, 0],
[ 0, 1, 1, 0, 0, 0, 0, 96, 0, 4],
[ 1, 5, 4, 3, 1, 3, 0, 1, 72, 4],
[ 0, 1, 0, 0, 3, 2, 0, 7, 0, 82]])

# Get index for the second highest value.
m.argsort()[:,-2]

Output:

array([8, 8, 8, 5, 9, 6, 5, 9, 1, 7], dtype=int32)
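
If the values are wanted as well as the indices, one option is np.take_along_axis. This is a small sketch on a made-up 3x4 matrix (not the one above) chosen so that no row has a tie for second place.

```python
import numpy as np

m = np.array([[101, 0, 1, 2],
              [0, 116, 7, 0],
              [5, 4, 84, 2]])

# Column index of the second highest value in each row.
second_idx = m.argsort(axis=1)[:, -2]
# Pick the value at that column in each row.
second_val = np.take_along_axis(m, second_idx[:, None], axis=1).ravel()
```

When a row contains ties for second place, which of the tied indices is returned depends on the sort, so only the value is guaranteed.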

How do I get indices of N maximum values in a NumPy array?

Newer NumPy versions (1.8 and up) have a function called argpartition for this. To get the indices of the four largest elements, do

>>> a = np.array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0])
>>> a
array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0])

>>> ind = np.argpartition(a, -4)[-4:]
>>> ind
array([1, 5, 8, 0])

>>> top4 = a[ind]
>>> top4
array([4, 9, 6, 9])

Unlike argsort, this function runs in linear time in the worst case, but the returned indices are not sorted, as can be seen from the result of evaluating a[ind]. If you need that too, sort them afterwards:

>>> ind[np.argsort(a[ind])]
array([1, 8, 5, 0])

To get the top-k elements in sorted order in this way takes O(n + k log k) time.
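
The two steps can be wrapped in a small helper. This is a sketch only: the name top_k_indices is hypothetical, and it returns the indices largest first rather than in the ascending order shown above.

```python
import numpy as np

def top_k_indices(a, k):
    """Indices of the k largest entries of a 1-D array, largest first."""
    ind = np.argpartition(a, -k)[-k:]   # O(n): unordered top-k
    order = np.argsort(a[ind])[::-1]    # O(k log k): order just those k
    return ind[order]

a = np.array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0])
idx = top_k_indices(a, 4)
values = a[idx]
```

Because the array contains repeated values, which of the tied indices appears in idx is not guaranteed; the values themselves are.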

Numpy get index of row with second-largest value

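You can use np.argsort(np.max(x, axis=0))[-2]. Note that np.max(x, axis=0) takes the maximum of each column, so this ranks columns; if you want to rank rows by their maximum, use axis=1 instead.

This scales to any rank you want by changing the index from -2 to -n.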
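
The choice of axis decides whether rows or columns are ranked. Here is a short sketch on a made-up 3x3 array that shows both variants side by side.

```python
import numpy as np

x = np.array([[1, 9, 3],
              [7, 2, 8],
              [4, 5, 6]])

col_max = np.max(x, axis=0)  # one maximum per column: [7, 9, 8]
row_max = np.max(x, axis=1)  # one maximum per row:    [9, 8, 6]

col_second = np.argsort(col_max)[-2]  # column with the second-largest maximum
row_second = np.argsort(row_max)[-2]  # row with the second-largest maximum
```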

Get the indices of N highest values in an ndarray

You can use numpy.argpartition on a flattened version of the array to get the indices of the top k items, and then convert those 1D indices to match the array's shape using numpy.unravel_index:

>>> arr = np.arange(100*100*100).reshape(100, 100, 100)
>>> np.random.shuffle(arr)
>>> indices = np.argpartition(arr.flatten(), -2)[-2:]
>>> np.vstack(np.unravel_index(indices, arr.shape)).T
array([[97, 99, 98],
       [97, 99, 99]])
>>> arr[97][99][98]
999998
>>> arr[97][99][99]
999999
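
The same idea can be packaged as a function that works for any number of dimensions. This is a sketch under the assumption that the indices should come back largest first; the name top_n_indices is hypothetical.

```python
import numpy as np

def top_n_indices(arr, n):
    """Return an (n, arr.ndim) array of indices of the n largest values, largest first."""
    flat = arr.ravel()
    idx = np.argpartition(flat, -n)[-n:]     # unordered flat indices of the top n
    idx = idx[np.argsort(flat[idx])[::-1]]   # order them by value, descending
    # Convert flat indices to one coordinate column per dimension.
    return np.column_stack(np.unravel_index(idx, arr.shape))

arr = np.arange(2 * 3 * 4).reshape(2, 3, 4)
top2 = top_n_indices(arr, 2)
```

Each row of the result is one coordinate tuple, so arr[tuple(top2[0])] gives the maximum.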

How to get the index of the largest n values in a multi-dimensional numpy array

I don't have access to bottleneck, so in this example I am using argsort, but you should be able to use it in the same way:

#!/usr/bin/env python
import numpy as np
N = 4
a = np.random.random(20).reshape(4, 5)
print(a)

# Convert it into a 1D array
a_1d = a.flatten()

# Find the indices in the 1D array
idx_1d = a_1d.argsort()[-N:]

# convert the idx_1d back into indices arrays for each dimension
x_idx, y_idx = np.unravel_index(idx_1d, a.shape)

# Check that we got the largest values.
for x, y in zip(x_idx, y_idx):
    print(a[x][y])

Numpy array choosing second largest number and assign it to a list in python

You can use argsort:

List1 = np.argsort(a)[:,-2].tolist()

