How to Count the Number of Occurrences of an Element in a List

How do I count the occurrences of a list item?

If you only want a single item's count, use the count method:

>>> [1, 2, 3, 4, 1, 4, 1].count(1)
3


Important: this is very slow if you are counting multiple different items

Each count call goes over the entire list of n elements. Calling count in a loop n times means n * n total checks, which can be catastrophic for performance.

If you want to count multiple items, use Counter, which only does n total checks.
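As a minimal sketch of the Counter approach described above (the sample list is reused from the count example; the variable names `items` and `counts` are only illustrative):

```python
from collections import Counter

items = [1, 2, 3, 4, 1, 4, 1]

# A single pass over the list builds a mapping of item -> count.
counts = Counter(items)

print(counts[1])  # 3
print(counts[4])  # 2
print(counts[7])  # 0, missing items report zero instead of raising KeyError
```

After the Counter is built, each lookup is a dictionary access, so asking for many different items does not rescan the list.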

How to count the frequency of the elements in an unordered list?

If the list is sorted, you can use groupby from the itertools module in the standard library. If it isn't, sort it first, although sorting takes O(n log n) time:

from itertools import groupby

a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]
[len(list(group)) for key, group in groupby(sorted(a))]

Output:

[4, 4, 2, 1, 2]
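The list of counts above does not say which value each count belongs to. One possible variation, sketched here with the same input, keeps the key next to its count in a dictionary (the name `freq` is illustrative):

```python
from itertools import groupby

a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]

# sorted() places equal values next to each other, so groupby sees
# each distinct value as exactly one run.
freq = {key: sum(1 for _ in group) for key, group in groupby(sorted(a))}

print(freq)  # {1: 4, 2: 4, 3: 2, 4: 1, 5: 2}
```

Using sum(1 for _ in group) avoids building a temporary list for every group, which may matter for long runs.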

Count occurrences of elements in a list until a different element appears

itertools.groupby() is good for this. It groups consecutive items by a key and returns an iterator of (key, grouper) pairs, where the grouper yields the items in that run. You can convert the grouper to a list and take its length:

from itertools import groupby

lst = [1,1,1,5,3,3,9,3,3,3,3,3]

def counts(lst):
    for k, v in groupby(lst):
        yield k
        yield len(list(v))

list(counts(lst))
# [1, 3, 5, 1, 3, 2, 9, 1, 3, 5]

You can do it as a one-liner with chain too:

from itertools import groupby, chain

lst = [1,1,1,5,3,3,9,3,3,3,3,3]

list(chain.from_iterable((k, len(list(v))) for k, v in groupby(lst)))
# [1, 3, 5, 1, 3, 2, 9, 1, 3, 5]

If, for some reason, you want to do this the hard way, you need to keep track of the current count and the current thing you are counting and append when that thing changes:

lst = [1,1,1,5,3,3,9,3,3,3,3,3]

def makeCounts(lst):
    if len(lst) == 0:
        return lst
    res = []
    # take first element
    cur = lst[0]
    count = 1

    for n in lst[1:]:
        if n == cur:
            # same item: increase count
            count += 1
        else:
            # new item: save previous and reset count
            res.extend([cur, count])
            count = 1
            cur = n

    # don't forget last item
    res.extend([cur, count])

    return res

makeCounts(lst)
# [1, 3, 5, 1, 3, 2, 9, 1, 3, 5]

Fastest way to count number of occurrences in a Python list

a = ['1', '1', '1', '1', '1', '1', '2', '2', '2', '2', '7', '7', '7', '10', '10']
print(a.count("1"))

In CPython, list.count is implemented in C, so a single call is fast even on large lists.

Edit: I randomly generated a large list.

In [8]: len(a)
Out[8]: 6339347

In [9]: %timeit a.count("1")
10 loops, best of 3: 86.4 ms per loop

Edit edit: This could be done with collections.Counter

from collections import Counter

counts = Counter(your_list)
print(counts['1'])

Using the same list as in my previous timing example:

In [17]: %timeit Counter(a)['1']
1 loops, best of 3: 1.52 s per loop

My timing is simplistic and depends on many factors, but it gives a rough idea of the relative performance when you only need one item's count.
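To try a similar comparison outside IPython, here is a rough sketch using the standard timeit module. The list size, the random contents, and the number of runs are all assumptions chosen for illustration, and the absolute numbers will differ between machines and Python versions:

```python
import random
import timeit
from collections import Counter

# Hypothetical test data: 100,000 random digit strings (size is an assumption).
random.seed(0)
data = [str(random.randint(0, 9)) for _ in range(100_000)]

# Time ten runs of a single-item lookup, once with each approach.
t_count = timeit.timeit(lambda: data.count("1"), number=10)
t_counter = timeit.timeit(lambda: Counter(data)["1"], number=10)

print(f"list.count: {t_count:.4f} s for 10 runs")
print(f"Counter:    {t_counter:.4f} s for 10 runs")
```

Both expressions return the same count; the comparison only measures how long each takes to produce it for one item.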

Here is some profiling output:

In [24]: profile.run("a.count('1')")
3 function calls in 0.091 seconds

Ordered by: standard name

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.091 0.091 <string>:1(<module>)
1 0.091 0.091 0.091 0.091 {method 'count' of 'list' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}

In [25]: profile.run("b = Counter(a); b['1']")
6339356 function calls in 2.143 seconds

Ordered by: standard name

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 2.143 2.143 <string>:1(<module>)
2 0.000 0.000 0.000 0.000 _weakrefset.py:68(__contains__)
1 0.000 0.000 0.000 0.000 abc.py:128(__instancecheck__)
1 0.000 0.000 2.143 2.143 collections.py:407(__init__)
1 1.788 1.788 2.143 2.143 collections.py:470(update)
1 0.000 0.000 0.000 0.000 {getattr}
1 0.000 0.000 0.000 0.000 {isinstance}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
6339347 0.356 0.000 0.356 0.000 {method 'get' of 'dict' objects}

Count occurrences of list items in second list in Python

If you just want to count the number of distinct elements that appear in both lists (and you don't need to know how many times they occur in the other list), you can use:

count = len(set(a).intersection(set(b)))

Or identically:

count = len(set(a) & set(b))
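If you do need to know how many times each item of the first list occurs in the second, one option is to build a Counter of the second list once and look each item up. This is only a sketch; the lists `a` and `b` below are made-up sample data and the name `occurrences` is illustrative:

```python
from collections import Counter

a = [1, 2, 3]                # items to look for (sample data)
b = [1, 1, 2, 5, 1, 3, 3]    # list to search in (sample data)

# Count everything in b once, then look up each item of a.
counts_in_b = Counter(b)
occurrences = {item: counts_in_b[item] for item in a}

print(occurrences)  # {1: 3, 2: 1, 3: 2}
```

Items of a that never appear in b get a count of 0, because a Counter returns zero for missing keys.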

