Find the most frequent number in a NumPy array
If your list contains only non-negative ints, you should take a look at numpy.bincount:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.bincount.html
and then probably use np.argmax:
import numpy as np
a = np.array([1,2,3,1,2,1,1,1,3,2,2,1])
counts = np.bincount(a)
print(np.argmax(counts))
For a more complicated list (one that perhaps contains negative numbers or non-integer values), you can use np.histogram in a similar way. Alternatively, if you just want to work in Python without using numpy, collections.Counter is a good way of handling this sort of data.
from collections import Counter
a = [1,2,3,1,2,1,1,1,3,2,2,1]
b = Counter(a)
print(b.most_common(1))
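For values that np.bincount cannot handle, such as negatives or floats, one option is np.unique with return_counts=True. Note that this swaps in np.unique for the np.histogram approach mentioned above. A minimal sketch, with a sample array made up for illustration:

```python
import numpy as np

# Hypothetical sample data: negative and non-integer values
a = np.array([-1.5, 2.0, -1.5, 0.25, 2.0, -1.5])

# values is sorted; counts[i] is how often values[i] occurs
values, counts = np.unique(a, return_counts=True)
most_frequent = values[counts.argmax()]
print(most_frequent)  # -1.5
```

If several values share the top count, argmax picks the first one, which here means the smallest value.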
Most frequent element in NumPy ND array
Given:
>>> import numpy as np
>>> LoL = [[2,4,1,6,3], [2,4,1,8,4], [6,5,4,3,2], [6,5,4,3,4], [1,2,3,4,5]]
>>> matrix=np.array(LoL)
>>> print(matrix)
[[2 4 1 6 3]
 [2 4 1 8 4]
 [6 5 4 3 2]
 [6 5 4 3 4]
 [1 2 3 4 5]]
You can do:
>>> np.argmax(np.bincount(matrix.flat))
4
Or,
u, c = np.unique(matrix, return_counts=True)
u[c.argmax()]
# 4
If you want to count the most frequent entry in a list of lists without numpy or any other import, you can use a dictionary to count each element yielded by a generator that flattens the list of lists:
cnts = {}
for e in (x for sl in LoL for x in sl):
    cnts[e] = cnts.get(e, 0) + 1
Then sort by most frequent:
>>> sorted(cnts.items(), key=lambda t: t[1], reverse=True)
[(4, 7), (2, 4), (3, 4), (1, 3), (6, 3), (5, 3), (8, 1)]
Or just use max if you only want the largest:
>>> max(cnts.items(), key=lambda t: t[1])
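Put together as a standalone script, the dictionary approach can be sketched as follows, using the same LoL as above. The names value and count are only illustrative:

```python
# Same list of lists as in the example above
LoL = [[2, 4, 1, 6, 3], [2, 4, 1, 8, 4], [6, 5, 4, 3, 2], [6, 5, 4, 3, 4], [1, 2, 3, 4, 5]]

# Count every element of the flattened list of lists
cnts = {}
for e in (x for sl in LoL for x in sl):
    cnts[e] = cnts.get(e, 0) + 1

# max returns the first (value, count) pair with the highest count
value, count = max(cnts.items(), key=lambda t: t[1])
print(value, count)  # 4 7
```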
How to find most frequent values in numpy ndarray?
To find the most frequent value of a flat array, use unique, bincount and argmax:
arr = np.array([5, 4, -2, 1, -2, 0, 4, 4, -6, -1])
u, indices = np.unique(arr, return_inverse=True)
u[np.argmax(np.bincount(indices))]
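As a rough sketch, the three calls can be wrapped in a small helper. The function name most_frequent_1d is made up here. Because np.unique returns sorted values and argmax returns the first maximum, ties go to the smallest value:

```python
import numpy as np

def most_frequent_1d(arr):
    """Return the most frequent value of a 1-D array (smallest value wins ties)."""
    u, indices = np.unique(arr, return_inverse=True)
    # indices maps each element to its position in u, so it is non-negative
    # and safe to pass to bincount even when arr holds negative numbers
    return u[np.argmax(np.bincount(indices))]

arr = np.array([5, 4, -2, 1, -2, 0, 4, 4, -6, -1])
result = most_frequent_1d(arr)
print(result)  # 4
```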
To work with a multidimensional array, we don't need to worry about unique, but we do need to use apply_along_axis on bincount:
arr = np.array([[5, 4, -2, 1, -2, 0, 4, 4, -6, -1],
[0, 1, 2, 2, 3, 4, 5, 6, 7, 8]])
axis = 1
u, indices = np.unique(arr, return_inverse=True)
u[np.argmax(np.apply_along_axis(np.bincount, axis, indices.reshape(arr.shape),
None, np.max(indices) + 1), axis=axis)]
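The same idea can be sketched as a reusable function. The name mode_along_axis is hypothetical, and minlength is passed by keyword here instead of positionally as in the snippet above:

```python
import numpy as np

def mode_along_axis(arr, axis):
    """Most frequent value along `axis` (hypothetical helper name)."""
    u, indices = np.unique(arr, return_inverse=True)
    indices = indices.reshape(arr.shape)
    # Count every unique value along the axis; minlength gives all slices
    # the same length so the results stack into one array
    counts = np.apply_along_axis(np.bincount, axis, indices, minlength=u.size)
    return u[np.argmax(counts, axis=axis)]

arr = np.array([[5, 4, -2, 1, -2, 0, 4, 4, -6, -1],
                [0, 1, 2, 2, 3, 4, 5, 6, 7, 8]])
row_modes = mode_along_axis(arr, axis=1)
print(row_modes)  # [4 2]
```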
With your data:
data = np.array([
[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]],
[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]],
[[40, 40, 42, 43, 44],
[45, 46, 47, 48, 49],
[50, 51, 52, 53, 54],
[55, 56, 57, 58, 59]]])
axis = 0
u, indices = np.unique(data, return_inverse=True)
u[np.argmax(np.apply_along_axis(np.bincount, axis, indices.reshape(data.shape),
None, np.max(indices) + 1), axis=axis)]
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]])
NumPy 1.2, really? You can approximate np.unique(return_inverse=True) reasonably efficiently using np.searchsorted (it's an additional O(n log n), so it shouldn't change the performance significantly):
u = np.unique(arr)
indices = np.searchsorted(u, arr.flat)
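On a NumPy version that does support return_inverse, the equivalence can be checked with a short sketch like this one. The sample array is reused from the flat-array example, and arr.ravel() is used in place of arr.flat:

```python
import numpy as np

arr = np.array([5, 4, -2, 1, -2, 0, 4, 4, -6, -1])

u = np.unique(arr)                         # sorted unique values
indices = np.searchsorted(u, arr.ravel())  # position of each element within u

# Compare against the built-in inverse mapping
u_ref, inverse_ref = np.unique(arr, return_inverse=True)
same = np.array_equal(indices, inverse_ref.ravel())
print(same)  # True
```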
Find most common value in numpy 2d array rows, otherwise return maximum
I believe this will solve the problem. You could probably make it into a one-liner with some fancy list comprehension, but I don't think that would be worthwhile.
most_f = []
for n in Nbank:  # iterate over the rows
    counts = np.bincount(n)  # count the occurrences of each value
    # argwhere lists tied values in ascending order, so the last one is the highest
    most_f.append(np.argwhere(counts == np.max(counts))[-1][0])
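To see the tie-breaking in action, here is a small runnable sketch. The contents of Nbank are invented for illustration, since the original array is not shown:

```python
import numpy as np

# Hypothetical stand-in for Nbank: rows of non-negative ints
Nbank = np.array([[1, 1, 2, 2],   # tie between 1 and 2
                  [3, 3, 3, 0],   # clear winner
                  [5, 6, 7, 8]])  # every value occurs once

most_f = []
for n in Nbank:
    counts = np.bincount(n)
    # Indices of all values sharing the top count, ascending; keep the largest
    most_f.append(int(np.argwhere(counts == counts.max())[-1][0]))

print(most_f)  # [2, 3, 8]
```

When every value in a row is unique, all counts tie at 1, so the row maximum is returned.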
Numpy: How to find the most frequent nonzero values in array?
We can use numpy.apply_along_axis and a simple function to solve this. Here, we use numpy.bincount to count the occurrences of each value and then numpy.argmax to get the value with the highest count. If the array contains no values other than exclude, we return exclude.
Code:
def get_freq(array, exclude):
    count = np.bincount(array[array != exclude])
    if count.size == 0:
        return exclude
    else:
        return np.argmax(count)
np.apply_along_axis(lambda x: get_freq(x, 0), axis=2, arr=arr)
Output:
array([[3, 2, 0, 1]])
Please note that it will also return exclude if you pass an empty array.
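For a self-contained run, the function can be tried on a made-up input. The array below is an assumption (the original input is not shown) and holds only non-negative ints, as bincount requires:

```python
import numpy as np

def get_freq(array, exclude):
    # Count only the values that differ from `exclude`
    count = np.bincount(array[array != exclude])
    if count.size == 0:
        return exclude
    return np.argmax(count)

# Hypothetical input of shape (1, 4, 5); the third row holds only the excluded value
arr = np.array([[[0, 3, 0, 3, 0],
                 [0, 0, 2, 3, 2],
                 [0, 0, 0, 0, 0],
                 [2, 1, 0, 1, 0]]])

result = np.apply_along_axis(lambda x: get_freq(x, 0), axis=2, arr=arr)
print(result)  # [[3 2 0 1]]
```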
EDIT:
As Ehsan noted, the above solution will not work for negative values in the given array. For this case, use Counter from collections:
arr = np.array([[[ 0, -3, 0, 3, 0],
[ 0, 0, 2, 3, 2],
[ 0, 0, 0, 0, 0],
[ 2, -5, 0, -5, 0]]])
from collections import Counter
def get_freq(array, exclude):
    count = Counter(array[array != exclude]).most_common(1)
    if not count:
        return exclude
    else:
        return count[0][0]
np.apply_along_axis(lambda x: get_freq(x, 0), axis=2, arr=arr)
Output:
array([[-3, 2, 0, -5]])
most_common(1) returns the most frequent value in the Counter object as a one-element list containing a tuple, whose first element is the value and whose second is its number of occurrences. Because the tuple is wrapped in a list, double indexing is needed. If the list is empty, most_common has not found any occurrences (the input was either empty or contained only exclude).
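The return shape described above can be seen in a short sketch. The sample values are made up:

```python
from collections import Counter

# most_common(1) wraps the (value, count) tuple in a list
top = Counter([-3, 3, -5, -5, 2]).most_common(1)
print(top)        # [(-5, 2)]
print(top[0][0])  # -5

# An empty Counter yields an empty list, which is falsy
empty = Counter([]).most_common(1)
print(empty)      # []
```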
NumPy - Find most common value in array, use largest value in case of a tie
You could apply argmax to the reversed output of bincount, and then adjust to take into account the reversal:
In [73]: x
Out[73]: array([3, 0, 2, 3, 1, 0, 1, 3, 2, 1, 1, 2, 1, 3, 3, 4])
In [74]: b = np.bincount(x)
In [75]: b
Out[75]: array([2, 5, 3, 5, 1])
In [76]: most_common = len(b) - 1 - b[::-1].argmax()
In [77]: most_common
Out[77]: 3
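As a sketch, the reversal trick can be wrapped in a function and compared with a plain argmax. The name most_common_largest is hypothetical:

```python
import numpy as np

def most_common_largest(x):
    """Most frequent value of non-negative ints; the largest value wins ties."""
    b = np.bincount(x)
    # argmax on the reversed counts finds the last maximum;
    # subtracting from len(b) - 1 maps it back to the original index
    return len(b) - 1 - b[::-1].argmax()

x = np.array([3, 0, 2, 3, 1, 0, 1, 3, 2, 1, 1, 2, 1, 3, 3, 4])
largest_tied = most_common_largest(x)
smallest_tied = np.bincount(x).argmax()
print(largest_tied)   # 3
print(smallest_tied)  # 1, since a plain argmax picks the smallest tied value
```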
how to find most frequent string element in numpy ndarray?
If you want a numpy answer, you can use np.unique:
>>> unique, pos = np.unique(A, return_inverse=True)  # find all unique elements and their positions
>>> counts = np.bincount(pos)  # count the occurrences of each unique element
>>> maxpos = counts.argmax()  # find the position of the maximum count
>>> (unique[maxpos],counts[maxpos])
('d', 2)
However, if two elements have equal counts, this will simply take the first one from the unique array.
With this you can also easily sort by element count like so:
>>> maxsort = counts.argsort()[::-1]
>>> (unique[maxsort],counts[maxsort])
(array(['d', 'e', 'c', 'b', 'a'],
dtype='|S1'), array([2, 1, 1, 1, 1]))
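On newer NumPy versions, a shorter variant is possible with return_counts=True, which skips the bincount step. A minimal sketch, assuming a made-up array A, since the original is not shown:

```python
import numpy as np

# Hypothetical input; the original array A is not shown
A = np.array(['a', 'b', 'c', 'd', 'd', 'e'])

# return_counts gives the counts directly, aligned with the sorted unique values
unique, counts = np.unique(A, return_counts=True)
maxpos = counts.argmax()
print(str(unique[maxpos]), int(counts[maxpos]))  # d 2
```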
Numpy - find most common item per row
Supposing m is the name of your matrix:
most_f = np.array([np.bincount(row).argmax() for row in m])
I hope this solves your question
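A runnable sketch of the per-row approach, with an invented matrix m of non-negative ints:

```python
import numpy as np

# Hypothetical matrix of non-negative ints
m = np.array([[1, 2, 2, 3],
              [4, 4, 0, 0],   # tie between 0 and 4
              [7, 5, 5, 5]])

# bincount counts each value in a row; argmax picks the most frequent one
most_f = np.array([np.bincount(row).argmax() for row in m])
print(most_f)  # [2 0 5]
```

Note that on a tie, argmax returns the smallest of the tied values, as in the second row.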