# Quickest Way to Find the Nth Largest Value in a Numpy Matrix

## Quickest way to find the nth largest value in a numpy Matrix

You can flatten the matrix and then sort it:

```python
>>> import numpy as np
>>> k = np.array([[ 35,  48,  63],
...               [ 60,  77,  96],
...               [ 91, 112, 135]])
>>> flat = k.flatten()
>>> flat.sort()
>>> flat
array([ 35,  48,  60,  63,  77,  91,  96, 112, 135])
>>> flat[-2]
112
>>> flat[-3]
96
```
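If only a single rank is needed, a full sort does more work than necessary. The following is a minimal sketch that swaps in `np.partition` for the sort; the helper name `nth_largest` is made up for illustration and is not part of NumPy.

```python
import numpy as np

k = np.array([[35, 48, 63],
              [60, 77, 96],
              [91, 112, 135]])

def nth_largest(arr, n):
    """Return the n-th largest value of arr (n=1 is the maximum)."""
    flat = arr.ravel()
    # After partitioning at index -n, the element at that position is the
    # one that would sit there in a fully sorted array.
    return np.partition(flat, -n)[-n]

second = nth_largest(k, 2)
third = nth_largest(k, 3)
print(second, third)  # 112 96
```

`np.partition` returns a copy, so the original matrix is left untouched.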

## A fast way to find the largest N elements in a NumPy array

The `bottleneck` module has a fast partial sort method that works directly with NumPy arrays: `bottleneck.partition()`.

Note that `bottleneck.partition()` returns the actual values, partitioned rather than fully sorted. If you want the indices instead (the counterpart of what `numpy.argsort()` returns for a full sort), use `bottleneck.argpartition()`.

I've benchmarked:

• `z = -bottleneck.partition(-a, 10)[:10]`
• `z = a.argsort()[-10:]`
• `z = heapq.nlargest(10, a)`

where `a` is a random 1,000,000-element array.

The timings were as follows:

• `bottleneck.partition()`: 25.6 ms per loop
• `np.argsort()`: 198 ms per loop
• `heapq.nlargest()`: 358 ms per loop
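A comparison along these lines can be reproduced with `timeit`. This is only a rough sketch: it substitutes NumPy's own `np.partition` for `bottleneck.partition()` so that it runs without extra dependencies, the function names are invented for the example, and the timings will differ from the figures above depending on hardware and library versions.

```python
import heapq
import timeit

import numpy as np

rng = np.random.default_rng(0)
a = rng.random(1_000_000)

def top10_partition(a):
    # Negate so that the 10 smallest of -a are the 10 largest of a.
    return -np.partition(-a, 10)[:10]

def top10_argsort(a):
    # Full sort of the indices, then keep the last 10.
    return a[a.argsort()[-10:]]

def top10_heapq(a):
    # Pure-Python heap over the array's elements.
    return np.array(heapq.nlargest(10, a))

for fn in (top10_partition, top10_argsort, top10_heapq):
    seconds = timeit.timeit(lambda: fn(a), number=1)
    print(f"{fn.__name__}: {seconds * 1e3:.1f} ms")
```

All three return the same ten values, though not necessarily in the same order.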

## Numpy: Find index of second highest value in each row of an ndarray

The amazing `numpy.argsort()` function makes this task really simple. Once the sorted indices are found, get the second to last column.

```python
import numpy as np

m = np.array([[101,   0,   1,   0,   0,   0,   1,   1,   2,   0],
              [  0, 116,   1,   0,   0,   0,   0,   0,   1,   0],
              [  1,   4,  84,   2,   2,   0,   2,   4,   6,   1],
              [  0,   2,   0,  84,   0,   6,   0,   2,   3,   0],
              [  0,   0,   1,   0,  78,   0,   0,   2,   0,  11],
              [  2,   0,   0,   1,   1,  77,   5,   0,   2,   0],
              [  1,   2,   1,   0,   1,   2,  94,   0,   1,   0],
              [  0,   1,   1,   0,   0,   0,   0,  96,   0,   4],
              [  1,   5,   4,   3,   1,   3,   0,   1,  72,   4],
              [  0,   1,   0,   0,   3,   2,   0,   7,   0,  82]])

# Get index for the second highest value.
m.argsort()[:,-2]
```

### Output:

```python
array([8, 8, 8, 5, 9, 6, 5, 9, 1, 7], dtype=int32)
```
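To read off the second-highest values as well as their positions, the indices can be fed back into the matrix. A small sketch on a reduced matrix (the data here is made up, chosen so that there are no ties for second place):

```python
import numpy as np

m = np.array([[101, 0, 1, 2],
              [0, 116, 3, 0],
              [1, 4, 84, 2]])

# Column index of the second-highest value in each row.
second_idx = m.argsort(axis=1)[:, -2]

# Pull out the values at those positions, one per row.
second_val = np.take_along_axis(m, second_idx[:, None], axis=1).ravel()

print(second_idx)  # [3 2 1]
print(second_val)  # [2 3 4]
```

When a row contains ties for second place, which of the tied indices is returned depends on the sort, so only the values are guaranteed to agree between runs.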

## How do I get indices of N maximum values in a NumPy array?

Newer NumPy versions (1.8 and up) have a function called `argpartition` for this. To get the indices of the four largest elements, do

```python
>>> import numpy as np
>>> a = np.array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0])
>>> a
array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0])
>>> ind = np.argpartition(a, -4)[-4:]
>>> ind
array([1, 5, 8, 0])
>>> top4 = a[ind]
>>> top4
array([4, 9, 6, 9])
```

Unlike `argsort`, this function runs in linear time in the worst case, but the returned indices are not sorted, as can be seen from the result of evaluating `a[ind]`. If you need them in sorted order as well, sort them afterwards:

```python
>>> ind[np.argsort(a[ind])]
array([1, 8, 5, 0])
```

To get the top-k elements in sorted order in this way takes O(n + k log k) time.
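The two steps can be wrapped into one helper. This is a sketch only; `top_k_indices` is a hypothetical name, and it returns the indices largest-first rather than in the ascending order shown above.

```python
import numpy as np

def top_k_indices(a, k):
    """Indices of the k largest entries of a 1-D array, largest first."""
    ind = np.argpartition(a, -k)[-k:]      # unordered top-k, O(n)
    return ind[np.argsort(a[ind])[::-1]]   # order just those k, O(k log k)

a = np.array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0])
idx = top_k_indices(a, 4)
print(a[idx])  # [9 9 6 4]
```

Because `a` contains ties, the exact indices may differ between NumPy versions, but the selected values do not.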

## Numpy get index of row with second-largest value

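You can use `np.argsort(np.max(x, axis=0))[-2]`. Note that for a 2D array `np.max(x, axis=0)` takes the maximum of each column, so this returns a column index; use `axis=1` if you want the maximum of each row and therefore a row index.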

This scales to any index you want by changing the slicing index from `-2` to `-index`.
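The choice of `axis` decides whether the result refers to a column or a row. A short worked example on made-up data, assuming a 2D array:

```python
import numpy as np

x = np.array([[1, 9, 2],
              [3, 4, 8],
              [7, 0, 5]])

col_max = np.max(x, axis=0)   # max of each column -> [7 9 8]
row_max = np.max(x, axis=1)   # max of each row    -> [9 8 7]

second_col = np.argsort(col_max)[-2]  # column with the 2nd-largest column max
second_row = np.argsort(row_max)[-2]  # row with the 2nd-largest row max
print(second_col, second_row)  # 2 1
```

Here the value 8 is the second-largest maximum in both directions, and it sits in column 2 of row 1.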

## Get the indices of N highest values in an ndarray

You can use `numpy.argpartition` on a flattened version of the array to get the indices of the top `k` items, and then convert those 1D indices to match the array's shape using `numpy.unravel_index`:

```python
>>> arr = np.arange(100*100*100).reshape(100, 100, 100)
>>> np.random.shuffle(arr)  # shuffles along the first axis only, so the first index below varies per run
>>> indices = np.argpartition(arr.flatten(), -2)[-2:]
>>> np.vstack(np.unravel_index(indices, arr.shape)).T
array([[97, 99, 98],
       [97, 99, 99]])
>>> arr[97][99][98]
999998
>>> arr[97][99][99]
999999
```
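The same idea as a reusable function, with a seeded generator so the run is repeatable. This is a sketch under the assumption that a `(k, ndim)` array of positions is the desired output; `top_k_positions` is a name invented for the example.

```python
import numpy as np

def top_k_positions(arr, k):
    """Return a (k, arr.ndim) array with the positions of the k largest values."""
    flat_idx = np.argpartition(arr.ravel(), -k)[-k:]
    return np.column_stack(np.unravel_index(flat_idx, arr.shape))

rng = np.random.default_rng(0)
arr = rng.permutation(4 * 5 * 6).reshape(4, 5, 6)  # every value is unique

pos = top_k_positions(arr, 2)
values = arr[tuple(pos.T)]       # index with one array per dimension
print(sorted(values.tolist()))   # [118, 119]
```

The rows of `pos` come back in no particular order, so sort by value afterwards if the order matters.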

## How to get the indices of the largest n values in a multi-dimensional NumPy array

I don't have access to `bottleneck`, so in this example I am using `argsort`, but you should be able to use the `bottleneck` functions in the same way:

```python
#!/usr/bin/env python
import numpy as np

N = 4
a = np.random.random(20).reshape(4, 5)
print(a)

# Convert it into a 1D array
a_1d = a.flatten()

# Find the indices in the 1D array
idx_1d = a_1d.argsort()[-N:]

# Convert idx_1d back into index arrays for each dimension
x_idx, y_idx = np.unravel_index(idx_1d, a.shape)

# Check that we got the largest values.
for x, y in zip(x_idx, y_idx):
    print(a[x][y])
```

## Numpy array choosing second largest number and assign it to a list in python

You can use `argsort`:

```python
List1 = np.argsort(a)[:,-2].tolist()
```
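Keep in mind that `argsort` yields positions, not the numbers themselves. If the second-largest values are what is wanted, `np.sort` can be used the same way. A small sketch on made-up data, assuming `a` is a 2D array:

```python
import numpy as np

a = np.array([[3, 7, 5],
              [8, 6, 2],
              [4, 9, 1]])

second_idx = np.argsort(a, axis=1)[:, -2].tolist()  # positions in each row
second_val = np.sort(a, axis=1)[:, -2].tolist()     # the values themselves

print(second_idx)  # [2, 1, 0]
print(second_val)  # [5, 6, 4]
```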