What's the shortest way to count the number of items in a generator/iterator?
Calls to itertools.imap() in Python 2 or map() in Python 3 can be replaced by equivalent generator expressions:
sum(1 for dummy in it)
This also uses a lazy generator, so it avoids materializing a full list of all iterator elements in memory.
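As a small illustration of the one-liner above, here is a minimal sketch. The helper name count_items and the sample generator evens are made up for this example. Note that counting consumes the iterator, so a second count gives 0:

```python
def count_items(iterable):
    """Count items by consuming the iterable one element at a time."""
    return sum(1 for _ in iterable)


# Sample generator (hypothetical data): the even numbers below 10.
evens = (n for n in range(10) if n % 2 == 0)

print(count_items(evens))  # 5
print(count_items(evens))  # 0, the generator is already exhausted
```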
Length of generator output
There isn't one because you can't do it in the general case - what if you have a lazy infinite generator? For example:
def fib():
    a, b = 0, 1
    while True:
        a, b = b, a + b
        yield a
This never terminates but will generate the Fibonacci numbers. You can get as many Fibonacci numbers as you want by calling next().
If you really need to know how many items there are, then a single linear pass over them won't do anyway, so just use a different data structure such as a regular list.
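If a bounded number of values is enough, one possible approach (a sketch, not the only way) is to cut the infinite stream with itertools.islice and store the result in a list, which does have a length. The variable name first_ten is just for illustration:

```python
from itertools import islice


def fib():
    """Infinite Fibonacci generator, same as the one shown above."""
    a, b = 0, 1
    while True:
        a, b = b, a + b
        yield a


# Take a bounded prefix of the infinite stream and materialize it.
first_ten = list(islice(fib(), 10))

print(first_ten)       # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
print(len(first_ten))  # 10
```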
How to len(generator())
Generators have no length; they aren't collections, after all.
Generators are functions with an internal state (and fancy syntax). You can repeatedly call them to get a sequence of values, so you can use them in a loop. But they don't contain any elements, so asking for the length of a generator is like asking for the length of a function.
"If functions in Python are objects, couldn't I assign the length to a variable of this object that would be accessible to the new generator?"
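Functions are objects, and ordinary function objects do accept new attributes. The generator object you get back from calling a generator function does not, though: assigning an arbitrary attribute to it raises AttributeError. The reason is probably to keep such a basic object as efficient as possible.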
You can, however, simply return (generator, length) pairs from your functions or wrap the generator in a simple object like this:
class GeneratorLen:
    def __init__(self, gen, length):
        self.gen = gen
        self.length = length

    def __len__(self):
        # The length is whatever the caller claimed; it is not verified.
        return self.length

    def __iter__(self):
        return self.gen

g = some_generator()
h = GeneratorLen(g, 1)
print(len(h), list(h))
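For a version you can run as-is, here is a minimal sketch in which the producer knows its own size in advance. The squares function is hypothetical and only stands in for some_generator:

```python
class GeneratorLen:
    """Pairs a generator with a length supplied by the caller."""

    def __init__(self, gen, length):
        self.gen = gen
        self.length = length

    def __len__(self):
        return self.length

    def __iter__(self):
        return self.gen


def squares(n):
    """Hypothetical producer that knows up front it will yield n items."""
    return GeneratorLen((i * i for i in range(n)), n)


sq = squares(4)
print(len(sq))   # 4
print(list(sq))  # [0, 1, 4, 9]
```

The wrapper trusts the length it is given, so this only helps when the count is known before the generator runs.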
How to count the items in a generator consumed by other code
Here is another way, using itertools.count(). Example:
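import itertools
import re

def generator():
    for i in range(10):
        yield i

def process(l):
    # Stops early: consumes items only until it sees 5.
    for i in l:
        if i == 5:
            break

def counter_value(counter):
    # repr(counter) looks like 'count(6)'; read the number without advancing it.
    return int(re.search(r'\d+', repr(counter)).group(0))

counter = itertools.count()
# zip() is the Python 3 equivalent of Python 2's itertools.izip(); both are lazy.
process(i for i, v in zip(generator(), counter))
print("Element consumed by process is : %d " % counter_value(counter))
# output: Element consumed by process is : 6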
Hope this was helpful.
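If parsing repr() feels fragile, an alternative (not the approach from the answer above, just a sketch) is a small wrapper that increments a counter every time an item is pulled. The class name CountingIterator is invented for this example:

```python
class CountingIterator:
    """Wrap any iterable and count how many items have been pulled from it."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self.count = 0

    def __iter__(self):
        return self

    def __next__(self):
        item = next(self._it)  # StopIteration propagates when exhausted
        self.count += 1
        return item


def process(items):
    # Same early-exit consumer as in the example above.
    for i in items:
        if i == 5:
            break


counted = CountingIterator(range(10))
process(counted)
print(counted.count)  # 6
```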
Getting number of elements in an iterator in Python
No. It's not possible.
Example:
import random

def gen(n):
    for i in range(n):  # xrange() in Python 2
        if random.randint(0, 1) == 0:
            yield i

iterator = gen(10)
The length of iterator is unknown until you iterate through it.
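To make that concrete, here is a short sketch based on the same gen function. The count differs from run to run, and once you have it, the iterator is spent:

```python
import random


def gen(n):
    for i in range(n):
        if random.randint(0, 1) == 0:
            yield i


iterator = gen(10)

# The only way to learn the length is to run the generator to the end.
count = sum(1 for _ in iterator)
print(count)           # somewhere between 0 and 10, varies per run

# Counting consumed everything, so nothing is left to iterate over.
print(list(iterator))  # []
```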
Need a fast way to count and sum an iterable in a single pass
Thanks for all the great answers, but I decided to use my original count_and_sum function, called as follows:
>>> cc, cs = count_and_sum(c.width for c in cols if not c.hide)
As explained in the edits to my original question, this turned out to be the fastest and most readable solution.
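The body of count_and_sum is not shown here, so the following is only a guess at what such a function might look like: a plain loop that keeps both tallies in one pass. The widths list is made-up sample data:

```python
def count_and_sum(iterable):
    """Return (count, total) for a numeric iterable in a single pass."""
    count = 0
    total = 0
    for value in iterable:
        count += 1
        total += value
    return count, total


widths = [120, 80, 45]  # hypothetical sample data
cc, cs = count_and_sum(w for w in widths if w > 50)
print(cc, cs)  # 2 200
```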
Is there any built-in way to get the length of an iterable in python?
Short of iterating through the iterable and counting the number of iterations, no. That's what makes it an iterable and not a list. This isn't really even a Python-specific problem. Look at the classic linked-list data structure: finding the length is an O(n) operation that involves walking the whole list to count its elements.
As mcrute mentioned above, you can probably reduce your function to:
def count_iterable(i):
    return sum(1 for e in i)
Of course, if you're defining your own iterable object you can always implement __len__ yourself and keep an element count somewhere.
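As a rough sketch of that last idea, here is a tiny linked-list container that keeps its own count so len() is O(1). The class name LinkedStack and its push method are invented for illustration:

```python
class LinkedStack:
    """Hypothetical linked-list container that tracks its own element count."""

    def __init__(self):
        self._head = None  # nodes are (value, next_node) tuples
        self._size = 0

    def push(self, value):
        self._head = (value, self._head)
        self._size += 1    # keep the count in step with every insertion

    def __len__(self):
        return self._size  # O(1), no traversal needed

    def __iter__(self):
        node = self._head
        while node is not None:
            value, node = node
            yield value


stack = LinkedStack()
for item in ("a", "b", "c"):
    stack.push(item)

print(len(stack))   # 3
print(list(stack))  # ['c', 'b', 'a']
```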
Python - Count Elements in Iterator Without Consuming
I have not been able to come up with an exact solution (because iterators may be immutable types), but here are my best attempts. I believe the second should be faster, according to the documentation (final paragraph of itertools.tee).
Option 1
import itertools

def it_count(it):
    # tee() buffers every item that tmp_it reads so new_it can replay it later.
    tmp_it, new_it = itertools.tee(it)
    return sum(1 for _ in tmp_it), new_it
Option 2
def it_count2(it):
    lst = list(it)
    return len(lst), lst
Both options work, but they have the slight annoyance of returning a pair rather than simply the count.
ita = iter([1, 2, 3])
count, ita = it_count(ita)
print(count)
Output: 3
count, ita = it_count2(ita)
print(count)
Output: 3
count, ita = it_count(ita)
print(count)
Output: 3
print(list(ita))
Output: [1, 2, 3]