Return in generator together with yield
This is a new feature in Python 3.3 (as a comment notes, it doesn't even work in 3.2). Much like return in a generator has long been equivalent to raise StopIteration(), return <something> in a generator is now equivalent to raise StopIteration(<something>). For that reason, the exception you're seeing should be printed as StopIteration: 3, and the value is accessible through the attribute value on the exception object. If the generator is delegated to using the (also new) yield from syntax, it is the result. See PEP 380 for details.
def f():
    return 1
    yield 2

def g():
    x = yield from f()
    print(x)

# g is still a generator so we need to iterate to run it:
for _ in g():
    pass
This prints 1, but not 2.
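If you need that value without using yield from, you can also drive the generator manually and catch the StopIteration yourself; a minimal sketch (re-defining f from above):

```python
def f():
    return 1
    yield 2  # never reached, but makes f a generator function

gen = f()
try:
    while True:
        next(gen)
except StopIteration as exc:
    # The returned value rides along on the exception object.
    print(exc.value)  # prints 1
```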
Best way of getting both the yield'ed output and return'ed value of a generator in Python without wrapping it inside another class?
Maybe with just a function instead?
Version 1:
def output_and_return(it):
    def with_result():
        yield (yield from it)
    *elements, result = with_result()
    return elements, result
Version 2:
def output_and_return(it):
    result = None
    def get_result():
        nonlocal result
        result = yield from it
    return list(get_result()), result
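Either helper can be exercised like this; a sketch bundling Version 2 with a toy generator:

```python
def output_and_return(it):
    # Version 2 from above: yield from captures the return value.
    result = None
    def get_result():
        nonlocal result
        result = yield from it
    return list(get_result()), result

def gen():
    yield 'a'
    yield 'b'
    return 'done'

elements, result = output_and_return(gen())
print(elements)  # ['a', 'b']
print(result)    # done
```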
Mixing yield and return. `yield [cand]; return` vs `return [[cand]]`. Why do they lead to different output?
In a generator function, return just defines the value associated with the StopIteration exception implicitly raised to indicate an iterator is exhausted. It's not produced during iteration, and most iterating constructs (e.g. for loops) intentionally ignore the StopIteration exception (it means the loop is over; you don't care if someone attached random garbage to a message that just means "we're done").
For example, try:
>>> def foo():
...     yield 'onlyvalue'  # Existence of yield keyword makes this a generator
...     return 'returnvalue'
...
>>> f = foo() # Makes a generator object, stores it in f
>>> next(f) # Pull one value from generator
'onlyvalue'
>>> next(f) # There is no other yielded value, so this hits the return; iteration over
--------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
...
StopIteration: 'returnvalue'
As you can see, your return value does get "returned" in a sense (it's not completely discarded), but it's never seen by anything iterating normally, so it's largely useless. Outside of rare cases involving using generators as coroutines (where you're using .send() and .throw() on instances of the generator and manually advancing it with next(genobj)), the return value of a generator won't be seen.
In short, you have to pick one:

- Use yield anywhere in a function, and it's a generator (whether or not the code path of a particular call ever reaches a yield), and return just ends generation (while maybe hiding some data in the StopIteration exception). No matter what you do, calling the generator function "returns" a new generator object (which you can loop over until exhausted); it can never return a raw value computed inside the generator function (which doesn't even begin running until you loop over it at least once).
- Don't use yield, and return works as expected (because it's not a generator function).
As an example to explain what happens to the return value in normal looping constructs, this is what for x in gen(): effectively expands to a C optimized version of:
__unnamed_iterator = iter(gen())
while True:
    try:
        x = next(__unnamed_iterator)
    except StopIteration:  # StopIteration caught here without inspecting it
        break  # Loop ends; StopIteration cleaned even from sys.exc_info() to avoid possible reference cycles
    # body of loop goes here
# Outside of loop, there is no StopIteration object left
As you can see, the expanded form of the for loop has to look for a StopIteration to indicate the loop is over, but it doesn't use it. And for anything that's not a generator, the StopIteration never has any associated values; the for loop has no way to report them even if it did (it has to end the loop when it's told iteration is over, and the arguments to StopIteration are explicitly not part of the values iterated anyway). Anything else that consumes the generator (e.g. calling list on it) is doing roughly the same thing as the for loop, ignoring the StopIteration in the same way; nothing except code that specifically expects generators (as opposed to more generalized iterables and iterators) will ever bother to inspect the StopIteration object. (At the C layer, there are optimizations so that StopIteration objects aren't even produced by most iterators; they return NULL and leave no exception set, which everything using the iterator protocol knows is equivalent to returning NULL and setting a StopIteration, so for anything but a generator, there isn't even an exception to inspect much of the time.)
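A quick sketch of how the value stays invisible to normal consumers but shows up on the exception when you look for it:

```python
def gen():
    yield 1
    yield 2
    return 'hidden'

# list() consumes the generator and silently discards the return value:
print(list(gen()))  # [1, 2]

# Only code that inspects the StopIteration directly ever sees it:
g = gen()
next(g)
next(g)
try:
    next(g)
except StopIteration as e:
    print(e.value)  # hidden
```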
Python `yield from`, or return a generator?
The difference is that your first mymap is just a usual function, in this case a factory which returns a generator. Everything inside the body gets executed as soon as you call the function.
def gen_factory(func, seq):
    """Generator factory returning a generator."""
    # do stuff ... immediately when factory gets called
    print("build generator & return")
    return (func(*args) for args in seq)
The second mymap is also a factory, but it's also a generator itself, yielding from a self-built sub-generator inside. Because it is a generator itself, execution of the body does not start until the first invocation of next(generator).
def gen_generator(func, seq):
    """Generator yielding from sub-generator inside."""
    # do stuff ... first time when 'next' gets called
    print("build generator & yield")
    yield from (func(*args) for args in seq)
I think the following example will make it clearer.
We define data packages which shall be processed with functions,
bundled up in jobs we pass to the generators.
def add(a, b):
    return a + b

def sqrt(a):
    return a ** 0.5

data1 = [*zip(range(1, 5))]  # [(1,), (2,), (3,), (4,)]
data2 = [(2, 1), (3, 1), (4, 1), (5, 1)]
job1 = (sqrt, data1)
job2 = (add, data2)
Now we run the following code inside an interactive shell like IPython to see the different behavior. gen_factory immediately prints out, while gen_generator only does so after next() is called.
gen_fac = gen_factory(*job1)
# build generator & return <-- printed immediately
next(gen_fac) # start
# Out: 1.0
[*gen_fac] # deplete rest of generator
# Out: [1.4142135623730951, 1.7320508075688772, 2.0]
gen_gen = gen_generator(*job1)
next(gen_gen) # start
# build generator & yield <-- printed with first next()
# Out: 1.0
[*gen_gen] # deplete rest of generator
# Out: [1.4142135623730951, 1.7320508075688772, 2.0]
To give you a more reasonable use case example for a construct like gen_generator, we'll extend it a little and make a coroutine out of it by assigning yield to variables, so we can inject jobs into the running generator with send(). Additionally we create a helper function which will run all tasks inside a job and ask us for a new one upon completion.
def gen_coroutine():
    """Generator coroutine yielding from sub-generator inside."""
    # do stuff ... first time when 'next' gets called
    print("receive job, build generator & yield, loop")
    while True:
        try:
            func, seq = yield "send me work ... or I quit with next next()"
        except TypeError:
            return "no job left"
        else:
            yield from (func(*args) for args in seq)
def do_job(gen, job):
    """Run all tasks in job."""
    print(gen.send(job))
    while True:
        result = next(gen)
        print(result)
        if result == "send me work ... or I quit with next next()":
            break
Now we run gen_coroutine with our helper function do_job and two jobs.
gen_co = gen_coroutine()
next(gen_co)  # start
# receive job, build generator & yield, loop <-- printed with first next()
# Out: 'send me work ... or I quit with next next()'

do_job(gen_co, job1)  # prints out all results from job
# 1.0
# 1.4142135623730951
# 1.7320508075688772
# 2.0
# send me work ... or I quit with next next()

do_job(gen_co, job2)  # send another job into generator
# 3
# 4
# 5
# 6
# send me work ... or I quit with next next()

next(gen_co)
# Traceback ...
# StopIteration: no job left
To come back to your question of which version is the better approach in general: IMO something like gen_factory only makes sense if you need the same thing done for multiple generators you are going to create, or in cases where your construction process for generators is complicated enough to justify using a factory instead of building individual generators in place with a generator expression.
Note:
The description above for the gen_generator function (second mymap) states "it is a generator itself". That is a bit vague and technically not really correct, but it facilitates reasoning about the differences between the functions in this tricky setup, where gen_factory also returns a generator, namely the one built by the generator expression inside.

In fact any function (not only those from this question with generator expressions inside!) with a yield inside, upon invocation, just returns a generator object which gets constructed out of the function body.

type(gen_coroutine)  # function
gen_co = gen_coroutine(); type(gen_co)  # generator

So the whole action we observed above for gen_generator and gen_coroutine takes place within these generator objects, which the functions with yield inside produced beforehand.
Return or yield from a function that calls a generator?
Generators are lazily evaluated, so return or yield will behave differently when you're debugging your code or if an exception is thrown. With return, any exception that happens in generator won't know anything about generate_all; that's because by the time generator is actually executed you have already left the generate_all function. With yield in there, it will have generate_all in the traceback.
def generator(some_list):
    for i in some_list:
        raise Exception('exception happened :-)')
        yield i

def generate_all():
    some_list = [1, 2, 3]
    return generator(some_list)

for item in generate_all():
    ...
Exception Traceback (most recent call last)
<ipython-input-3-b19085eab3e1> in <module>
8 return generator(some_list)
9
---> 10 for item in generate_all():
11 ...
<ipython-input-3-b19085eab3e1> in generator(some_list)
1 def generator(some_list):
2 for i in some_list:
----> 3 raise Exception('exception happened :-)')
4 yield i
5
Exception: exception happened :-)
And if it's using yield from:

def generate_all():
    some_list = [1, 2, 3]
    yield from generator(some_list)

for item in generate_all():
    ...
Exception Traceback (most recent call last)
<ipython-input-4-be322887df35> in <module>
8 yield from generator(some_list)
9
---> 10 for item in generate_all():
11 ...
<ipython-input-4-be322887df35> in generate_all()
6 def generate_all():
7 some_list = [1,2,3]
----> 8 yield from generator(some_list)
9
10 for item in generate_all():
<ipython-input-4-be322887df35> in generator(some_list)
1 def generator(some_list):
2 for i in some_list:
----> 3 raise Exception('exception happened :-)')
4 yield i
5
Exception: exception happened :-)
However this comes at the cost of performance. The additional generator layer does have some overhead, so return will generally be a bit faster than yield from ... (or for item in ...: yield item). In most cases this won't matter much, because whatever you do in the generator typically dominates the run-time, so the additional layer won't be noticeable.
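To illustrate that, performance aside, both forms hand the same items to the caller, here is a small sketch (the function names are made up for the example):

```python
def inner():
    yield from range(3)

def via_return():
    # Plain function: just hands back the generator object.
    return inner()

def via_yield_from():
    # Generator function: adds one delegation layer.
    yield from inner()

print(list(via_return()))      # [0, 1, 2]
print(list(via_yield_from()))  # [0, 1, 2]
```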
However yield has some additional advantages: you aren't restricted to a single iterable, and you can also easily yield additional items:

def generator(some_list):
    for i in some_list:
        yield i

def generate_all():
    some_list = [1, 2, 3]
    yield 'start'
    yield from generator(some_list)
    yield 'end'

for item in generate_all():
    print(item)
start
1
2
3
end
In your case the operations are quite simple and I don't know if it's even necessary to create multiple functions for this; one could easily just use the built-in map or a generator expression instead:

map(do_something, get_the_list())          # map
(do_something(i) for i in get_the_list())  # generator expression

Both should be identical to use (except for some differences when exceptions happen). And if they need a more descriptive name, then you could still wrap them in one function.
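Such a descriptively named wrapper could look like this sketch (do_something and get_the_list are stand-ins for the question's unspecified functions):

```python
def do_something(i):
    # Stand-in for the question's per-item operation.
    return i * 2

def get_the_list():
    # Stand-in for the question's list producer.
    return [1, 2, 3]

def processed_items():
    """Descriptively named wrapper around the generator expression."""
    return (do_something(i) for i in get_the_list())

print(list(processed_items()))  # [2, 4, 6]
```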
There are multiple built-in helpers that wrap very common operations on iterables, and further ones can be found in the itertools module. In such simple cases I would simply resort to these, and write your own generators only for non-trivial cases.
But I assume your real code is more complicated so that may not be applicable but I thought it wouldn't be a complete answer without mentioning alternatives.
Generator with return statement
The presence of yield in a function body turns it into a generator function instead of a normal function. And in a generator function, using return is a way of saying "The generator has ended; there are no more elements." By having the first statement of a generator function be return str_in, you are guaranteed to have a generator that yields no elements.

As a comment mentions, the return value is used as an argument to the StopIteration exception that gets raised when the generator has ended. See:
>>> gen = simple_gen_function("hello", "foo")
>>> next(gen)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration: hello
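The question's simple_gen_function isn't shown here; a hypothetical reconstruction matching the answer's description (first statement return str_in, with a yield somewhere later) reproduces that behavior:

```python
def simple_gen_function(str_in, str_out):
    # Hypothetical reconstruction: the first statement is a return,
    # so the generator ends immediately; the yield below is never reached
    # (but its mere presence makes this a generator function).
    return str_in
    yield str_out

print(list(simple_gen_function("hello", "foo")))  # [] -- no elements yielded

gen = simple_gen_function("hello", "foo")
try:
    next(gen)
except StopIteration as e:
    print(e.value)  # hello
```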
If there's a yield anywhere in your def, it's a generator!

In the comments, the asker mentions they thought the function turned into a generator dynamically, when the yield statement is executed. But this is not how it works! The decision is made before the code is ever executed. If Python finds a yield anywhere at all under your def, it turns that def into a generator function.
See this ultra-condensed example:
>>> def foo():
...     if False:
...         yield "bar"
...     return "baz"
>>> foo()
<generator object foo at ...>
>>> # The return value "baz" is only exposed via StopIteration
>>> # You probably shouldn't use this behavior.
>>> next(foo())
Traceback (most recent call last):
...
StopIteration: baz
>>> # Nothing is ever yielded from the generator, so it generates no values.
>>> list(foo())
[]
Return and yield in the same function
Yes, it's still a generator. The return is (almost) equivalent to raising StopIteration.
PEP 255 spells it out:

Specification: Return

A generator function can also contain return statements of the form:

    "return"

Note that an expression_list is not allowed on return statements in the body of a generator (although, of course, they may appear in the bodies of non-generator functions nested within the generator).

When a return statement is encountered, control proceeds as in any function return, executing the appropriate finally clauses (if any exist). Then a StopIteration exception is raised, signalling that the iterator is exhausted. A StopIteration exception is also raised if control flows off the end of the generator without an explicit return.

Note that return means "I'm done, and have nothing interesting to return", for both generator functions and non-generator functions.

Note that return isn't always equivalent to raising StopIteration: the difference lies in how enclosing try/except constructs are treated. For example,

    >>> def f1():
    ...     try:
    ...         return
    ...     except:
    ...         yield 1
    >>> print list(f1())
    []

because, as in any function, return simply exits, but

    >>> def f2():
    ...     try:
    ...         raise StopIteration
    ...     except:
    ...         yield 42
    >>> print list(f2())
    [42]

because StopIteration is captured by a bare "except", as is any exception.
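One caveat worth adding to the PEP 255 quote: since PEP 479 (the default behavior from Python 3.7 on), a StopIteration that escapes a generator's body uncaught is converted into a RuntimeError instead of silently ending iteration. The f2 example above still works, because its StopIteration is caught inside the generator; but this variant, where the exception propagates out, now fails:

```python
def g():
    raise StopIteration  # escapes the generator frame uncaught
    yield                # the yield still makes this a generator function

try:
    list(g())
except RuntimeError as e:
    # PEP 479: "generator raised StopIteration"
    print(type(e).__name__)
```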