How to Prevent an Iterator from Getting Exhausted

How to prevent an iterator from getting exhausted?

There's no way to "reset" a generator. However, you can use itertools.tee to "copy" an iterator.

>>> import itertools
>>> a, b = [1, 2, 3], [7, 8, 9]
>>> z = zip(a, b)
>>> zip1, zip2 = itertools.tee(z)
>>> list(zip1)
[(1, 7), (2, 8), (3, 9)]
>>> list(zip2)
[(1, 7), (2, 8), (3, 9)]

This involves caching values, so it only makes sense if you're iterating through both iterables at about the same rate. (In other words, don't use it the way I have here!)
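For instance, here's a sketch of tee used the way it's intended, with both copies consumed in lockstep so the internal cache never holds more than one item at a time:

>>> zip1, zip2 = itertools.tee(zip(a, b))
>>> for pair1, pair2 in zip(zip1, zip2):
...     print(pair1, pair2)
(1, 7) (1, 7)
(2, 8) (2, 8)
(3, 9) (3, 9)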

Another approach is to pass around the generator function, and call it whenever you want to iterate it.

def gen(x):
    for i in range(x):
        yield i ** 2

def make_two_lists(gen):
    return list(gen()), list(gen())

But now you have to bind the arguments to the generator function when you pass it. You can use lambda for that, but a lot of people find lambda ugly. (Not me though! YMMV.)

>>> make_two_lists(lambda: gen(10))
([0, 1, 4, 9, 16, 25, 36, 49, 64, 81], [0, 1, 4, 9, 16, 25, 36, 49, 64, 81])
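If you'd rather avoid lambda, functools.partial does the same job of binding the arguments:

>>> from functools import partial
>>> make_two_lists(partial(gen, 10))
([0, 1, 4, 9, 16, 25, 36, 49, 64, 81], [0, 1, 4, 9, 16, 25, 36, 49, 64, 81])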

I hope it goes without saying that under most circumstances, it's better just to make a list and copy it.
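For example:

>>> values = list(gen(10))
>>> list(values), list(values)
([0, 1, 4, 9, 16, 25, 36, 49, 64, 81], [0, 1, 4, 9, 16, 25, 36, 49, 64, 81])

A list can be iterated any number of times, so there's no hidden state to worry about.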

Also, as a more general way of explaining this behavior, consider this. The point of a generator is to produce a series of values, while maintaining some state between iterations. Now, at times, instead of simply iterating over a generator, you might want to do something like this:

z = zip(a, b)
while some_condition():
    fst = next(z, None)
    snd = next(z, None)
    do_some_things(fst, snd)
    if fst is None and snd is None:
        do_some_other_things()

Let's say this loop may or may not exhaust z. Now we have a generator in an indeterminate state! So it's important, at this point, that the behavior of a generator is constrained in a well-defined way. Although we don't know where the generator is in its output, we know that a) all subsequent accesses will produce later values in the series, and b) once it's "empty", we've gotten all the items in the series exactly once. The more ability we have to manipulate the state of z, the harder it is to reason about it, so it's best that we avoid situations that break those two promises.

Of course, as Joel Cornett points out, it is possible to write a generator that accepts messages via the send method, and it would be possible to write a generator that could be reset using send. But note that in that case, all we can do is send a message. We can't directly manipulate the generator's state, so all changes to that state are well-defined (by the generator itself -- assuming it was written correctly!). send is really for implementing coroutines, so I wouldn't use it for this purpose. Everyday generators almost never do anything with values sent to them -- I think for the very reasons I give above.
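To make that concrete, here's a minimal sketch of such a send-based resettable generator; the "reset" message protocol is invented for this example and isn't any standard convention:

def resettable_counter(limit):
    i = 0
    while i < limit:
        msg = yield i
        if msg == "reset":  # made-up protocol: rewind on request
            i = 0
        else:
            i += 1

g = resettable_counter(3)
print(next(g))          # 0
print(next(g))          # 1
print(g.send("reset"))  # 0 -- the generator rewound itself

Note that the generator itself defines what "reset" means; callers still can't reach in and manipulate its state directly.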

exhausted iterators - what to do about them?

You can convert an iterator to a tuple simply by calling tuple(iterator).

However, I'd rewrite that filter as a list comprehension, which would look something like this:

# original
filtered = filter(lambda x: x is not None and x != 0, c)

# list comp
filtered = [x for x in c if x is not None and x != 0]
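The difference matters here because in Python 3, filter returns a lazy iterator, while the comprehension gives you a real list you can iterate as often as you like:

>>> c = [0, 1, None, 2]
>>> filtered = filter(lambda x: x is not None and x != 0, c)
>>> list(filtered)
[1, 2]
>>> list(filtered)  # already exhausted
[]
>>> [x for x in c if x is not None and x != 0]
[1, 2]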

Continue to other generators once a generator has been exhausted in a list of generators?

This works. I tried to stay close to how your original code works (though I did replace your first loop with a list comprehension for simplicity).

def alternate_all(*args):
    iter_list = [iter(arg) for arg in args]
    while iter_list:
        i = iter_list.pop(0)
        try:
            val = next(i)
        except StopIteration:
            pass
        else:
            yield val
            iter_list.append(i)

The main problem with your code was that your try/except was outside the loop, meaning the first exhausted iterator would break you out of the loop entirely. Instead, you want to catch StopIteration inside the loop so you can keep going, and the loop should continue as long as iter_list still has any iterators in it.
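For example, with some sample inputs:

>>> list(alternate_all([1, 4], [2, 5, 7], [3, 6]))
[1, 2, 3, 4, 5, 6, 7]

Each iterator keeps its turn in the rotation until it's exhausted, at which point it's silently dropped.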

Why does an iterator need to be exhausted and discarded?

An iterator can acquire or create resources in the course of its work. In the case of os.scandir, those resources are file system handles.

It is therefore desirable to exhaust the iterator, because that lets it release those resources correctly and promptly. But exhaustion isn't required: there are situations where you need to stop iterating early. In that case, you should explicitly tell the object that you no longer want to iterate it, so that it can free its resources.

Doing this explicitly, rather than relying on the garbage collector, gives you greater control over program flow and can protect you from unexpected errors.
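With os.scandir this is easy, because the iterator it returns has a close() method and supports the with statement (since Python 3.6). A sketch, assuming a readable current directory:

import os

with os.scandir(".") as entries:
    for entry in entries:
        if entry.name.startswith("."):
            break  # stopping early is fine: the with block closes the iterator

Exhausting the iterator also releases the handle, but the explicit form works whether or not you finish the loop.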

Why exhausted generators raise StopIteration more than once?

Perhaps there's an important use case for calling exhausted generators multiple times and getting StopIteration?

There is: specifically, when you want to perform multiple loops over the same iterator. Here's an example from the itertools docs that relies on this behavior:

from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)
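Because args repeats the same iterator n times, zip_longest keeps calling next() on it to pad out the final chunk even after it has raised StopIteration once, and it relies on getting StopIteration again on every later call:

>>> list(grouper('ABCDEFG', 3, 'x'))
[('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'x', 'x')]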

