Find the Most Frequent Number in a Numpy Array

Find the most frequent number in a NumPy array

If your list contains only non-negative ints, take a look at numpy.bincount:

http://docs.scipy.org/doc/numpy/reference/generated/numpy.bincount.html

and then probably use np.argmax:

import numpy as np

a = np.array([1,2,3,1,2,1,1,1,3,2,2,1])
counts = np.bincount(a)    # counts[i] is the number of times i occurs in a
print(np.argmax(counts))   # prints 1

For a more complicated list (one that contains negative numbers or non-integer values), you can use np.histogram in a similar way. Alternatively, if you just want to work in Python without using NumPy, collections.Counter is a good way of handling this sort of data.

from collections import Counter
a = [1,2,3,1,2,1,1,1,3,2,2,1]
b = Counter(a)
print(b.most_common(1))
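
For the negative or non-integer case mentioned above, another option that stays within NumPy is np.unique with return_counts=True. The following is a minimal sketch with made-up sample data (the array `values` is an assumption, not part of the original question). Values are compared exactly, so floats that differ only by rounding error count as distinct.

```python
import numpy as np

# Hypothetical sample data with negative and non-integer values,
# which np.bincount cannot handle directly.
values = np.array([-1.5, 2.0, -1.5, 3.25, 2.0, -1.5])

# np.unique returns the sorted distinct values and, with
# return_counts=True, how often each one occurs.
uniques, counts = np.unique(values, return_counts=True)

# argmax picks the first maximum, so ties resolve to the smallest value.
most_frequent = uniques[np.argmax(counts)]
print(most_frequent, counts.max())
```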

Most frequent element in NumPy ND array

Given:

>>> import numpy as np
>>> LoL = [[2,4,1,6,3], [2,4,1,8,4], [6,5,4,3,2], [6,5,4,3,4], [1,2,3,4,5]]
>>> matrix=np.array(LoL)
>>> print(matrix)
[[2 4 1 6 3]
 [2 4 1 8 4]
 [6 5 4 3 2]
 [6 5 4 3 4]
 [1 2 3 4 5]]

You can do:

>>> np.argmax(np.bincount(matrix.flat))
4

Or,

u, c = np.unique(matrix, return_counts=True)
u[c.argmax()]
# 4

If you wanted to do this without numpy or any import to count the most frequent entry in a list of lists, you can use a dictionary to count each element from a generator that is flattening your list of lists:

cnts = {}
for e in (x for sl in LoL for x in sl):
    cnts[e] = cnts.get(e, 0) + 1

Then sort by most frequent:

>>> sorted(cnts.items(), key=lambda t: t[1], reverse=True)
[(4, 7), (2, 4), (3, 4), (1, 3), (5, 3), (6, 3), (8, 1)]

Or, just use max if you only want the largest:

>>> max(cnts.items(), key=lambda t: t[1])
(4, 7)

How to find most frequent values in numpy ndarray?

To find the most frequent value of a flat array, use unique, bincount and argmax:

arr = np.array([5, 4, -2, 1, -2, 0, 4, 4, -6, -1])
u, indices = np.unique(arr, return_inverse=True)
u[np.argmax(np.bincount(indices))]

To work with a multidimensional array, map the values to indices with unique as before, then use apply_along_axis to run bincount along the chosen axis:

arr = np.array([[5, 4, -2, 1, -2, 0, 4, 4, -6, -1],
                [0, 1, 2, 2, 3, 4, 5, 6, 7, 8]])
axis = 1
u, indices = np.unique(arr, return_inverse=True)
u[np.argmax(np.apply_along_axis(np.bincount, axis, indices.reshape(arr.shape),
                                None, np.max(indices) + 1), axis=axis)]

With your data:

data = np.array([
[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]],

[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]],

[[40, 40, 42, 43, 44],
[45, 46, 47, 48, 49],
[50, 51, 52, 53, 54],
[55, 56, 57, 58, 59]]])
axis = 0
u, indices = np.unique(data, return_inverse=True)
u[np.argmax(np.apply_along_axis(np.bincount, axis, indices.reshape(data.shape),
                                None, np.max(indices) + 1), axis=axis)]
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]])
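
The same steps can be wrapped in a small function so the axis and input array are not repeated. This is only a sketch: the name `mode_along_axis` is hypothetical, and it passes `minlength` to np.bincount by keyword rather than by position.

```python
import numpy as np

def mode_along_axis(arr, axis):
    """Most frequent value along `axis` (hypothetical helper name)."""
    arr = np.asarray(arr)
    # Map every value to its index in the sorted array of unique values,
    # so negative numbers become valid input for np.bincount.
    u, indices = np.unique(arr, return_inverse=True)
    indices = indices.reshape(arr.shape)
    # Count each index along the axis; minlength keeps all count vectors
    # the same length so they stack into one array.
    counts = np.apply_along_axis(np.bincount, axis, indices, minlength=u.size)
    # argmax over the count axis, then map indices back to values.
    return u[np.argmax(counts, axis=axis)]

arr = np.array([[5, 4, -2, 1, -2, 0, 4, 4, -6, -1],
                [0, 1, 2, 2, 3, 4, 5, 6, 7, 8]])
print(mode_along_axis(arr, axis=1))
```

When several values tie along the axis, argmax returns the first maximum, so the smallest tied value is reported.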

If you are stuck on a very old NumPy (such as 1.2) where np.unique(return_inverse=True) is not available, you can approximate it reasonably efficiently using np.searchsorted. It adds another O(n log n) step, so it shouldn't change the performance significantly:

u = np.unique(arr)
indices = np.searchsorted(u, arr.flat)
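
As a quick sanity check, the sketch below reuses the flat example array from earlier in this answer and confirms that the indices from np.searchsorted reconstruct the original data, which is the property return_inverse provides. It uses `arr.ravel()` instead of `arr.flat`; that substitution is a choice made here, not part of the original answer.

```python
import numpy as np

arr = np.array([5, 4, -2, 1, -2, 0, 4, 4, -6, -1])

# Sorted unique values only; no return_inverse needed.
u = np.unique(arr)
# Every element of arr is present in u, so searchsorted returns
# the exact position of each element within u.
indices = np.searchsorted(u, arr.ravel())

# The indices reconstruct the original data, like return_inverse would.
assert np.array_equal(u[indices], arr.ravel())

most_frequent = u[np.argmax(np.bincount(indices))]
print(most_frequent)
```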

Find most common value in numpy 2d array rows, otherwise return maximum

I believe this will solve the problem. You could probably make it into a one-liner with a fancy list comprehension, but I don't think that would be worthwhile.

most_f = []
for n in Nbank:  # iterate over rows
    counts = np.bincount(n)  # count the number of elements of each value
    most_f.append(np.argwhere(counts == np.max(counts))[-1][0])  # append the last and highest
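
To see why this also covers the "otherwise return maximum" case, here is a runnable sketch with a made-up `Nbank` (the data is an assumption, since the original array is not shown). When no value repeats, every count is 1, so the last index with the maximum count is simply the largest value in the row.

```python
import numpy as np

# Hypothetical stand-in for Nbank: rows of non-negative integers.
Nbank = np.array([[1, 2, 2, 3],    # 2 is the most common value
                  [4, 7, 5, 6],    # no repeats, so expect the maximum, 7
                  [3, 3, 9, 9]])   # tie between 3 and 9, expect 9

most_f = []
for row in Nbank:
    counts = np.bincount(row)
    # argwhere lists every value whose count equals the maximum, in
    # ascending order; the last entry is the largest such value.
    most_f.append(int(np.argwhere(counts == counts.max())[-1][0]))

print(most_f)
```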

Numpy: How to find the most frequent nonzero values in array?

We can use numpy.apply_along_axis and a simple function to solve this. Here, we make use of numpy.bincount to count the occurrences of numeric values and then numpy.argmax to get the highest occurrence. If there are no other values than exclude, we return it.

Code:

def get_freq(array, exclude):
    count = np.bincount(array[array != exclude])
    if count.size == 0:
        return exclude
    else:
        return np.argmax(count)

np.apply_along_axis(lambda x: get_freq(x, 0), axis=2, arr=arr)

Output:

array([[3, 2, 0, 1]])

Please note that it will also return exclude if you pass an empty array.

EDIT:
As Ehsan noted, the above solution will not work for negative values in the given array. For this case, use Counter from collections:

arr = np.array([[[ 0,  -3,  0,  3,  0],
[ 0, 0, 2, 3, 2],
[ 0, 0, 0, 0, 0],
[ 2, -5, 0, -5, 0]]])

from collections import Counter

def get_freq(array, exclude):
    count = Counter(array[array != exclude]).most_common(1)
    if not count:
        return exclude
    else:
        return count[0][0]

Output:

array([[-3,  2,  0, -5]])

most_common(1) returns a one-element list containing a tuple whose first element is the most frequent value and whose second element is its number of occurrences, hence the double indexing. If the list is empty, most_common found no occurrences (the input was either empty or contained only exclude).
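
For completeness, here is a self-contained sketch that combines the Counter version with the same apply_along_axis call used for the bincount version. The `.tolist()` conversion and the conditional expression are small variations introduced here, not part of the original answer.

```python
from collections import Counter

import numpy as np

arr = np.array([[[0, -3, 0,  3, 0],
                 [0,  0, 2,  3, 2],
                 [0,  0, 0,  0, 0],
                 [2, -5, 0, -5, 0]]])

def get_freq(array, exclude):
    # .tolist() converts NumPy scalars to plain Python ints before counting.
    count = Counter(array[array != exclude].tolist()).most_common(1)
    # An empty list means nothing but `exclude` was present.
    return count[0][0] if count else exclude

result = np.apply_along_axis(lambda x: get_freq(x, 0), axis=2, arr=arr)
print(result)
```

In the first row, -3 and 3 each occur once; Counter keeps elements with equal counts in first-encountered order, so -3 is reported.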

NumPy - Find most common value in array, use largest value in case of a tie

You could apply argmax to the reversed output of bincount, and then adjust to take into account the reversal:

In [73]: x
Out[73]: array([3, 0, 2, 3, 1, 0, 1, 3, 2, 1, 1, 2, 1, 3, 3, 4])

In [74]: b = np.bincount(x)

In [75]: b
Out[75]: array([2, 5, 3, 5, 1])

In [76]: most_common = len(b) - 1 - b[::-1].argmax()

In [77]: most_common
Out[77]: 3
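
The same trick as a reusable function, as a sketch only: the name `most_common_largest` is hypothetical, and like np.bincount it assumes non-negative integers.

```python
import numpy as np

def most_common_largest(x):
    """Most frequent value in x; ties go to the largest value.

    Hypothetical helper name; assumes non-negative integers.
    """
    b = np.bincount(x)
    # argmax returns the first maximum, so searching the reversed counts
    # finds the last maximum of the original array.
    return len(b) - 1 - int(b[::-1].argmax())

x = np.array([3, 0, 2, 3, 1, 0, 1, 3, 2, 1, 1, 2, 1, 3, 3, 4])
print(most_common_largest(x))
```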

How to find most frequent string element in numpy ndarray?

If you want a numpy answer you can use np.unique:

>>> unique,pos = np.unique(A,return_inverse=True) #Finds all unique elements and their positions
>>> counts = np.bincount(pos) #Count the number of each unique element
>>> maxpos = counts.argmax() #Finds the positions of the maximum count

>>> (unique[maxpos],counts[maxpos])
('d', 2)

Note that if two elements have equal counts, this simply takes the first of them in the unique array.

With this you can also easily sort by element count like so:

>>> maxsort = counts.argsort()[::-1]
>>> (unique[maxsort],counts[maxsort])
(array(['d', 'e', 'c', 'b', 'a'],
      dtype='|S1'), array([2, 1, 1, 1, 1]))
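
On NumPy versions where np.unique supports return_counts=True, the bincount step can be skipped. A minimal sketch, assuming a made-up array `A` (the original `A` is not shown here):

```python
import numpy as np

# Hypothetical input; the original A is not shown here.
A = np.array(['a', 'b', 'c', 'd', 'd', 'e'])

# Sorted unique strings and the number of occurrences of each.
unique, counts = np.unique(A, return_counts=True)
maxpos = counts.argmax()

# Convert NumPy scalars to plain Python types for a clean tuple.
most_common = (str(unique[maxpos]), int(counts[maxpos]))
print(most_common)
```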

Numpy - find most common item per row

Supposing m is the name of your matrix:

most_f = np.array([np.bincount(row).argmax() for row in m])
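
If the matrix has many rows, the Python-level loop can be avoided by shifting each row into its own range of bins and calling np.bincount once. This is a sketch under stated assumptions: `m` here is made-up data, it must hold non-negative integers, and the counts array needs rows × (max value + 1) entries, so it is only practical when the maximum value is modest.

```python
import numpy as np

# Hypothetical matrix of non-negative integers.
m = np.array([[1, 2, 2, 3],
              [0, 0, 5, 5],
              [4, 4, 4, 1]])

n_rows = m.shape[0]
n_bins = m.max() + 1
# Shift row i by i * n_bins so each row counts into its own block of bins.
shifted = m + n_bins * np.arange(n_rows)[:, None]
counts = np.bincount(shifted.ravel(), minlength=n_rows * n_bins)
counts = counts.reshape(n_rows, n_bins)
# Per-row argmax; ties resolve to the smallest value, as in the loop version.
most_f = counts.argmax(axis=1)
print(most_f)
```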

I hope this solves your question


