Converting "Yield From" Statement to Python 2.7 Code

Converting yield from statement to Python 2.7 code

If you don't use the results of your yields,* you can always turn this:

yield from foo

… into this:

for bar in foo:
    yield bar

There might be a performance cost,** but there is never a semantic difference.


Are the entries from the two halves (upper and lower) appended to each other into one list, so that the parent recursive function with the yield from call can use both halves together?

No! The whole point of iterators and generators is that you don't build actual lists and append them together.

But the effect is similar: you just yield from one, then yield from another.

If you think of the upper half and the lower half as "lazy lists", then yes, you can think of this as a "lazy append" that creates a larger "lazy list". And if you call list on the result of the parent function, you of course will get an actual list that's equivalent to appending together the two lists you would have gotten if you'd done yield list(…) instead of yield from ….

But I think it's easier to think of it the other way around: what it does is exactly the same thing the for loops do.

If you saved the two iterators into variables, and looped over itertools.chain(upper, lower), that would be the same as looping over the first and then looping over the second, right? No difference here. In fact, you could implement chain as just:

def chain(*args):
    for arg in args:
        yield from arg
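
For instance, here is a quick sketch (the two generator expressions are made up for illustration) showing that chaining two "lazy lists" is the same as looping over one and then the other:

>>> upper = (c.upper() for c in 'ab')   # made-up "upper half"
>>> lower = (c.lower() for c in 'CD')   # made-up "lower half"
>>> list(chain(upper, lower))
['A', 'B', 'c', 'd']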

* Not the values the generator yields to its caller, but the values of the yield expressions themselves, within the generator (which come from the caller using the send method), as described in PEP 342. You're not using these in your examples. And I'm willing to bet you're not in your real code. But coroutine-style code often uses the value of a yield from expression—see PEP 3156 for examples. Such code usually depends on other features of Python 3.3 generators—in particular, the new StopIteration.value from the same PEP 380 that introduced yield from—so it will have to be rewritten. But if not, you can use the complete horrid messy equivalent that the PEP also shows you, and you can of course pare down the parts you don't care about. And if you don't use the value of the expression, it pares down to the two lines above.

** Not a huge one, and there's nothing you can do about it short of using Python 3.3 or completely restructuring your code. It's exactly the same case as translating list comprehensions to Python 1.5 loops, or any other case when there's a new optimization in version X.Y and you need to use an older version.

'yield from' substitute in Python 2

You still need to loop. It doesn't matter that you have recursion here.

You need to loop over the generator produced by the recursive call and yield the results:

def foo(obj):
    for ele in obj:
        if isinstance(ele, list):
            for res in foo(ele):
                yield res
        else:
            yield ele

Your recursive call produces a generator, and you need to pass the results of the generator onwards. You do so by looping over the generator and yielding the individual values.

There are no better options, other than upgrading to Python 3.

yield from essentially passes on the responsibility to loop over to the caller, and passes back any generator.send() and generator.throw() calls to the delegated generator. You don't have any need to pass on .send() or .throw(), so what remains is taking responsibility to do the looping yourself.

Demo:

>>> import sys
>>> sys.version_info
sys.version_info(major=2, minor=7, micro=14, releaselevel='final', serial=0)
>>> def foo(obj):
...     for ele in obj:
...         if isinstance(ele, list):
...             for res in foo(ele):
...                 yield res
...         else:
...             yield ele
...
>>> l = [1, [2, 3, [4,5]]]
>>> list(foo(l))
[1, 2, 3, 4, 5]

yield from was introduced in PEP 380 -- Syntax for Delegating to a Subgenerator (not PEP 342), specifically because a loop over the sub-generator would not delegate generator.throw() and generator.send() information.

The PEP explicitly states:

If yielding of values is the only concern, this can be performed without much difficulty using a loop such as

for v in g:
    yield v

The Formal Semantics section of the PEP has an equivalent Python implementation that may look intimidating at first, but you can still pick out that it loops (with while 1:; looping ends when an exception is raised or StopIteration is handled; new values are retrieved with next() or generator.send(..)) and that it yields the results (with yield _y).
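
Pared down to the value-only case (no .send() or .throw() forwarding), that expansion reduces to a sketch like the following, which runs fine on Python 2.7; EXPR is the placeholder name the PEP itself uses for the delegated iterable:

# Minimal sketch of "yield from EXPR" when only the yielded values matter
_i = iter(EXPR)
while 1:
    try:
        _y = next(_i)
    except StopIteration:
        break
    yield _y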

How to convert version 3.x yield from to something compatible in version 2.7?

Convert yield from into a for-loop with plain yield.

Convert class Node: into class Node(object): to ensure you get a new-style class.

The code now works in Python 2.7.

class Node(object):
    def __init__(self, value):
        self._value = value
        self._children = []

    def __repr__(self):
        return 'Node({!r})'.format(self._value)

    def add_child(self, node):
        self._children.append(node)

    def __iter__(self):
        return iter(self._children)

    def depth_first(self):
        yield self
        for c in self:
            for n in c.depth_first():
                yield n

# Example
if __name__ == '__main__':
    root = Node(0)
    child1 = Node(1)
    child2 = Node(2)
    root.add_child(child1)
    root.add_child(child2)
    child1.add_child(Node(3))
    child1.add_child(Node(4))
    child2.add_child(Node(5))
    for ch in root.depth_first():
        print(ch)
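
Run as a script, the traversal visits each node before its children, so the example should print:

Node(0)
Node(1)
Node(3)
Node(4)
Node(2)
Node(5)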

What does yield do in Python 2.7?

OK, you know about generators, so the yield part needs no explanation. Fine.

So what does that line actually do? Not very much:

It concatenates padding_zeros and number_string and then encodes the result to ASCII, which in Python 2.7 is a no-op because the string is ASCII to begin with (it consists only of ASCII digits, by definition).

In Python 3, it would be different: there the .encode() would convert the string to a bytes object. But in Python 2, it doesn't make any sense.
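
To illustrate, here is a quick sketch with a hypothetical '0042' standing in for the concatenated string (the actual values aren't shown in the question):

# Python 2.7 -- encoding an ASCII str is a no-op; you get an equal str back:
>>> '0042'.encode('ascii')
'0042'

# Python 3 -- the same call converts the str to a bytes object:
>>> '0042'.encode('ascii')
b'0042'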

python 2.7 - is there a more succinct way to do this series of yield statements (in Python 3, yield from would help)

If as you say you can get away with returning None, then I'd leave the code as it was in the first place:

def maybe(x):
    """Only keep odd values; returns either the element or None."""
    result = 11 * x
    if result & 1: return result

def do_stuff():
    yield maybe(1)
    yield maybe(6)
    yield maybe(5)

but use a wrapped version instead which tosses the Nones, like:

def do_stuff_use():
    return (x for x in do_stuff() if x is not None)
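
Using the wrapper then gives the filtered values directly:

>>> list(do_stuff_use())
[11, 55]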

You could even wrap the whole thing up in a decorator, if you wanted:

import functools

def yield_not_None(f):
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        return (x for x in f(*args, **kwargs) if x is not None)
    return wrapper

@yield_not_None
def do_stuff():
    yield maybe(1)
    yield maybe(6)
    yield maybe(5)

after which

>>> list(do_stuff())
[11, 55]

What does the yield keyword do?

To understand what yield does, you must understand what generators are. And before you can understand generators, you must understand iterables.

Iterables

When you create a list, you can read its items one by one. Reading its items one by one is called iteration:

>>> mylist = [1, 2, 3]
>>> for i in mylist:
...     print(i)
1
2
3

mylist is an iterable. When you use a list comprehension, you create a list, and so an iterable:

>>> mylist = [x*x for x in range(3)]
>>> for i in mylist:
...     print(i)
0
1
4

Everything you can use "for... in..." on is an iterable: lists, strings, files...

These iterables are handy because you can read them as much as you wish, but you store all the values in memory and this is not always what you want when you have a lot of values.

Generators

Generators are iterators, a kind of iterable you can only iterate over once. Generators do not store all the values in memory, they generate the values on the fly:

>>> mygenerator = (x*x for x in range(3))
>>> for i in mygenerator:
...     print(i)
0
1
4

It is just the same except you used () instead of []. BUT, you cannot perform for i in mygenerator a second time since generators can only be used once: they calculate 0, then forget about it and calculate 1, and end calculating 4, one by one.
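
You can see the exhaustion for yourself:

>>> mygenerator = (x*x for x in range(3))
>>> list(mygenerator)
[0, 1, 4]
>>> list(mygenerator)  # already exhausted: nothing left to yield
[]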

Yield

yield is a keyword that is used like return, except the function will return a generator.

>>> def create_generator():
...     mylist = range(3)
...     for i in mylist:
...         yield i*i
...
>>> mygenerator = create_generator() # create a generator
>>> print(mygenerator) # mygenerator is an object!
<generator object create_generator at 0xb7555c34>
>>> for i in mygenerator:
...     print(i)
0
1
4

Here it's a useless example, but it's handy when you know your function will return a huge set of values that you will only need to read once.

To master yield, you must understand that when you call the function, the code you have written in the function body does not run. The function only returns the generator object; this is a bit tricky.

Then, your code will continue from where it left off each time for uses the generator.

Now the hard part:

The first time the for calls the generator object created from your function, it will run the code in your function from the beginning until it hits yield, then it'll return the first value of the loop. Then, each subsequent call will run another iteration of the loop you have written in the function and return the next value. This will continue until the generator is considered empty, which happens when the function runs without hitting yield. That can be because the loop has come to an end, or because you no longer satisfy an "if/else".
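
A small made-up example makes this visible; the print calls only run as the generator is advanced:

>>> def demo():
...     print('start')
...     yield 1
...     print('resumed')
...     yield 2
...
>>> g = demo()   # nothing printed yet: the body has not run
>>> next(g)      # runs the body up to the first yield
start
1
>>> next(g)      # resumes after the first yield, up to the second
resumed
2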



Your code explained

Generator:

# Here you create the method of the node object that will return the generator
def _get_child_candidates(self, distance, min_dist, max_dist):

    # Here is the code that will be called each time you use the generator object:

    # If there is still a child of the node object on its left
    # AND if the distance is ok, return the next child
    if self._leftchild and distance - max_dist < self._median:
        yield self._leftchild

    # If there is still a child of the node object on its right
    # AND if the distance is ok, return the next child
    if self._rightchild and distance + max_dist >= self._median:
        yield self._rightchild

    # If the function arrives here, the generator will be considered empty
    # there are no more than two values: the left and the right children

Caller:

# Create an empty list and a list with the current object reference
result, candidates = list(), [self]

# Loop on candidates (they contain only one element at the beginning)
while candidates:

    # Get the last candidate and remove it from the list
    node = candidates.pop()

    # Get the distance between obj and the candidate
    distance = node._get_dist(obj)

    # If the distance is ok, then you can fill in the result
    if distance <= max_dist and distance >= min_dist:
        result.extend(node._values)

    # Add the children of the candidate to the candidates list
    # so the loop will keep running until it has looked
    # at all the children of the children of the children, etc. of the candidate
    candidates.extend(node._get_child_candidates(distance, min_dist, max_dist))

return result

This code contains several smart parts:

  • The loop iterates on a list, but the list expands while the loop is being iterated. It's a concise way to go through all these nested data even if it's a bit dangerous since you can end up with an infinite loop. In this case, candidates.extend(node._get_child_candidates(distance, min_dist, max_dist)) exhausts all the values of the generator, but while keeps creating new generator objects which will produce different values from the previous ones since it's not applied on the same node.

  • The extend() method is a list object method that expects an iterable and adds its values to the list.

Usually, we pass a list to it:

>>> a = [1, 2]
>>> b = [3, 4]
>>> a.extend(b)
>>> print(a)
[1, 2, 3, 4]

But in your code, it gets a generator, which is good because:

  1. You don't need to read the values twice.
  2. You may have a lot of children and you don't want them all stored in memory.

And it works because Python does not care if the argument of a method is a list or not. Python expects iterables so it will work with strings, lists, tuples, and generators! This is called duck typing and is one of the reasons why Python is so cool. But this is another story, for another question...
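
For example, extend() happily consumes a generator expression:

>>> a = [1, 2]
>>> a.extend(x*x for x in range(3))  # a generator works just like a list here
>>> a
[1, 2, 0, 1, 4]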

You can stop here, or read a little bit more to see an advanced use of a generator:

Controlling a generator exhaustion

>>> class Bank(): # Let's create a bank, building ATMs
...     crisis = False
...     def create_atm(self):
...         while not self.crisis:
...             yield "$100"
...
>>> hsbc = Bank() # When everything's ok the ATM gives you as much as you want
>>> corner_street_atm = hsbc.create_atm()
>>> print(corner_street_atm.next())
$100
>>> print(corner_street_atm.next())
$100
>>> print([corner_street_atm.next() for cash in range(5)])
['$100', '$100', '$100', '$100', '$100']
>>> hsbc.crisis = True # Crisis is coming, no more money!
>>> print(corner_street_atm.next())
<type 'exceptions.StopIteration'>
>>> wall_street_atm = hsbc.create_atm() # It's even true for new ATMs
>>> print(wall_street_atm.next())
<type 'exceptions.StopIteration'>
>>> hsbc.crisis = False # The trouble is, even post-crisis the ATM remains empty
>>> print(corner_street_atm.next())
<type 'exceptions.StopIteration'>
>>> brand_new_atm = hsbc.create_atm() # Build a new one to get back in business
>>> for cash in brand_new_atm:
...     print cash
$100
$100
$100
$100
$100
$100
$100
$100
$100
...

Note: for Python 3, use print(corner_street_atm.__next__()) or print(next(corner_street_atm)).

It can be useful for various things like controlling access to a resource.

Itertools, your best friend

The itertools module contains special functions to manipulate iterables. Ever wish to duplicate a generator?
Chain two generators? Group values in a nested list with a one-liner? Map / Zip without creating another list?

Then just import itertools.

An example? Let's see the possible orders of arrival for a four-horse race:

>>> import itertools
>>> horses = [1, 2, 3, 4]
>>> races = itertools.permutations(horses)
>>> print(races)
<itertools.permutations object at 0xb754f1dc>
>>> print(list(itertools.permutations(horses)))
[(1, 2, 3, 4),
(1, 2, 4, 3),
(1, 3, 2, 4),
(1, 3, 4, 2),
(1, 4, 2, 3),
(1, 4, 3, 2),
(2, 1, 3, 4),
(2, 1, 4, 3),
(2, 3, 1, 4),
(2, 3, 4, 1),
(2, 4, 1, 3),
(2, 4, 3, 1),
(3, 1, 2, 4),
(3, 1, 4, 2),
(3, 2, 1, 4),
(3, 2, 4, 1),
(3, 4, 1, 2),
(3, 4, 2, 1),
(4, 1, 2, 3),
(4, 1, 3, 2),
(4, 2, 1, 3),
(4, 2, 3, 1),
(4, 3, 1, 2),
(4, 3, 2, 1)]
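
And a couple of quick sketches of the other tricks mentioned above, using itertools.tee to "duplicate" a generator and itertools.chain to glue two together:

>>> g = (x*x for x in range(4))
>>> g1, g2 = itertools.tee(g)        # two independent copies of the stream
>>> list(itertools.chain(g1, g2))    # consume one, then the other
[0, 1, 4, 9, 0, 1, 4, 9]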

Understanding the inner mechanisms of iteration

Iteration is a process implying iterables (implementing the __iter__() method) and iterators (implementing the __next__() method).
Iterables are any objects you can get an iterator from. Iterators are objects that let you iterate on iterables.
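
In code, that relationship looks like this:

>>> mylist = [1, 2, 3]
>>> iterator = iter(mylist)   # the iterable hands out an iterator
>>> next(iterator)            # the iterator produces values one at a time
1
>>> next(iterator)
2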

There is more about it in this article about how for loops work.

Is there any shorthand for 'yield all the output from a generator'?

In Python 3.3+, you can use yield from. For example,

>>> def get_squares():
...     yield from (num ** 2 for num in range(10))
...
>>> list(get_squares())
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

It can actually be used with any iterable. For example,

>>> def get_numbers():
...     yield from range(10)
...
>>> list(get_numbers())
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> def get_squares():
...     yield from [num ** 2 for num in range(10)]
...
>>> list(get_squares())
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Unfortunately, Python 2.7 has no equivalent construct :'(
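
There is no single replacement construct, but as covered earlier on this page, the value-only case can be spelled as a plain loop; for example, the first snippet above becomes:

def get_squares():
    for square in (num ** 2 for num in range(10)):
        yield square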

Converting Ruby loop iteration with a yield statement to Python

Then there are the following two functions as part of the Grid class.

These aren't functions. They are methods.

def each_row
  @grid.each do |row|
    yield row
  end
end

def each_cell
  each_row do |row|
    row.each do |cell|
      yield cell if cell
    end
  end
end

What are the last two functions here actually doing?

The each_row method takes a block as a parameter and successively yields all elements of the @grid array. @grid is structured as an array of arrays, representing rows of cells. In other words, each_row successively yields each row of the grid, i.e. it is an iterator method for rows.

The each_cell method takes a block as a parameter and successively yields the elements of each of those row arrays. In other words, each_cell successively yields each cell of the grid, provided the cell exists, i.e. it is an iterator method for cells.

The literal translation to Python would be something like this (untested):

def each_row(self, f):
    for row in self.grid:
        f(row)

def each_cell(self, f):
    self.each_row(lambda row: [f(cell) for cell in row if cell])

But, it just doesn't make sense to translate code from one language to another this way. Using lambdas for iteration in Python is non-idiomatic. Python uses iterators for iteration. So, instead of having each_row and each_cell iterator methods, you would rather have row_iterator and cell_iterator getters which return iterator objects for the rows and cells, so that you can then do something like:

for cell in grid.cell_iterator():

instead of

grid.each_cell(lambda cell: …)

Something like this (also untested):

def row_iterator(self):
    for row in self.grid:
        yield row

def cell_iterator(self):
    for row in self.row_iterator():
        for cell in row:
            if cell:
                yield cell
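
For example, assuming a hypothetical Grid class whose self.grid attribute is a list of row lists (with None marking empty cells), usage would look like:

grid = Grid([[1, None, 2],
             [None, 3, 4]])
for cell in grid.cell_iterator():
    print(cell)   # prints 1, 2, 3, 4 -- empty (None) cells are skipped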

When you "translate" code from one language to another, you cannot just translate it line-by-line, statement-by-statement, expression-by-expression, subroutine-by-subroutine, class-by-class, etc. You need to re-design it from the ground up, using the patterns, practices, and idioms of the community and the types, classes, subroutines, modules, etc. from the language and its core and standard libraries.

Otherwise, you could just use a compiler. A compiler is literally defined as "a program that translates a program from one language to another language". If that is all you want to do, use a compiler. But, if you want the translated code to be readable, understandable, maintainable, and idiomatic, it can only be done by a human.


