Richard Jones' Log: Python's Deficient Generators

Mon, 21 Dec 2009

People have tried to explain to me before why Python's generators are deficient, and the light bulb has finally gone on. To flatten an arbitrarily deep (though stack-limited) nested iterable structure, I wrote:

>>> def flatten(iterable):
...     for elem in iterable:
...         if hasattr(elem, '__iter__'):
...             for sub in flatten(elem):
...                 yield sub
...         else:
...             yield elem
... 
>>> list(flatten([[1,2,3],[[2,3],4],[4,5,6]]))
[1, 2, 3, 2, 3, 4, 4, 5, 6]

The key failing is that inner "for" loop: every value yielded by the recursive call has to be re-yielded at each level of the generator stack. David Beazley describes the workaround technique as "trampolining". If I could somehow yield straight to the top of the generator stack, that loop would be unnecessary. With PEP 380 it'd look something like:

>>> def flatten(iterable):
...     for elem in iterable:
...         if hasattr(elem, '__iter__'):
...             yield from flatten(elem)
...         else:
...             yield elem
... 
>>> list(flatten([[1,2,3],[[2,3],4],[4,5,6]]))
[1, 2, 3, 2, 3, 4, 4, 5, 6]

Too bad there's a moratorium on changes :)

Updated to include the PEP, thanks ulrik!

Comment by ulrik on Tue, 22 Dec 2009

The PEP for delegating to subgenerators is in PEP 380, and the syntax is 'yield from flatten(..)'

http://www.python.org/dev/peps/pep-0380/

Comment by Peter Shinners on Tue, 22 Dec 2009

My generators are always calling through to other generators. I always feel dirty with "inner yield loops", but never felt the feature was broken. I would love it to get fixed though.

Comment by Brian Harring on Tue, 22 Dec 2009

The solution I use when I expect the generator to go fairly deep is to shift the recursion into a stack of iterables. The specifics are at the blog that kicked this off. As mentioned there, last time I profiled it, the stack approach didn't seem cost-effective most of the time unless you had extensions in use. That said, it's a nice trick if you've got the equivalent of expandable_chain lying around in your project.
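The stack-of-iterables idea Brian describes could be sketched roughly like this (an illustrative version only, not his actual expandable_chain code; the function name is made up):

```python
def flatten_iterative(iterable):
    """Flatten nested iterables using an explicit stack of iterators,
    avoiding both recursion and the per-level re-yield loop."""
    stack = [iter(iterable)]
    while stack:
        for elem in stack[-1]:
            if hasattr(elem, '__iter__'):
                # Descend: push the nested iterator and restart the
                # inner loop from the new top of the stack.
                stack.append(iter(elem))
                break
            yield elem
        else:
            # The top iterator is exhausted; pop back to its parent,
            # which resumes exactly where it left off.
            stack.pop()
```

Because the nesting depth lives in a Python list rather than on the call stack, this version isn't limited by the recursion limit, at the cost of somewhat more bookkeeping per element.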

Comment by Eric O. LEBIGOT (EOL) on Tue, 22 Dec 2009

Interesting! Thank you for sharing!