Richard Jones' Log: I take back all I said about __iter__ators

Tue, 15 Feb 2005

Some time ago, I questioned the wisdom of adding the new iterator protocol to Python. Now I'm wiser, and I understand :)

Roundup 0.8 includes per-item access controls - for example, you can specify that users may only view / edit certain issues, or perhaps certain messages attached to issues. The HTML templating system now automatically filters out inaccessible items from listings. In one situation, it does so using an iterator:

class MultilinkIterator:
    def __init__(self, classname, client, values):
        self.classname = classname
        self.client = client
        self.values = values
        self.id = -1
    def next(self):
        '''Return the next item, but skip inaccessible items.'''
        check = self.client.db.security.hasPermission
        userid = self.client.userid
        while 1:
            self.id += 1
            if self.id >= len(self.values):
                raise StopIteration
            value = self.values[self.id]
            if check('View', userid, self.classname, itemid=value):
                return HTMLItem(self.client, self.classname, value)
    def __iter__(self):
        return self

Doing this using an old-style __getitem__ "iterator" would be much more difficult and messy.

Update: inspired by Bob's comment, I re-wrote it as a generator (my second ever ;)

def multilinkGenerator(classname, client, values):
    id = -1
    check = client.db.security.hasPermission
    userid = client.userid
    while 1:
        id += 1
        if id >= len(values):
            raise StopIteration
        value = values[id]
        if check('View', userid, classname, itemid=value):
            yield HTMLItem(client, classname, value)

I'm going to have so much fun playing with Python 2.3+ stuff* :)

*: Roundup's been holding me back ... until today's 0.8 release the minimum requirement was Python 2.1.

Comment by Bob Ippolito on Wed, 16 Feb 2005

And doing it with a generator is even easier :)

Comment by toby on Wed, 16 Feb 2005

def genfilter(f, s):
  i = iter(s)
  try:
    while 1:
      o = i.next()
      if f(o): yield o
  except StopIteration:
    return

genfilter(lambda x: client.db.security.hasPermission('View', userid, classname, itemid=x), values)

IMO, this is the way builtins like filter and map should work in any case.

Comment by toby on Wed, 16 Feb 2005

... at least when handed generators/iterators to work on...

Comment by Steffen on Wed, 16 Feb 2005

I am new to Python and wonder whether the use of generators is not only easier but also nicer, since it does not require wading through all the data up to the matching point on every call of next()?

Comment by Bob Ippolito on Wed, 16 Feb 2005

toby - "genfilter" is called itertools.ifilter :) Here is the generator implementation I would write:

def multilinkGenerator(classname, client, values):
    check = client.db.security.hasPermission
    userid = client.userid
    for value in values:
        if check('View', userid, classname, itemid=value):
            yield HTMLItem(client, classname, value)

Comment by Richard on Wed, 16 Feb 2005

Ah, thanks Bob. I've still a way to go before I'm truly in the head-space of 2.3 :)

Comment by Fredrik on Thu, 17 Feb 2005

"Doing this using an old-style __getitem__ "iterator" would be much more difficult and messy"

Umm. Get rid of __iter__, add a dummy argument to next and rename it to __getitem__, and replace StopIteration with IndexError, and you get an old-school forward-only iterator in no time at all...
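A rough sketch of the recipe Fredrik describes, written for current Python 3 rather than the Python of the post. The class name OldStyleFilter and the check callable are made up for illustration; check stands in for Roundup's hasPermission call, and the sketch returns plain values instead of HTMLItem objects.

```python
class OldStyleFilter:
    """Forward-only "iterator" driven by the legacy __getitem__ protocol."""

    def __init__(self, values, check):
        self.values = values
        self.check = check  # hypothetical stand-in for hasPermission
        self.pos = -1

    def __getitem__(self, index):
        # The for loop passes 0, 1, 2, ... as index, but it is ignored:
        # the object tracks its own position, exactly like next() did.
        while True:
            self.pos += 1
            if self.pos >= len(self.values):
                raise IndexError  # plays the role of StopIteration
            value = self.values[self.pos]
            if self.check(value):
                return value


# Item '2' is treated as inaccessible in this made-up example.
visible = [v for v in OldStyleFilter(['1', '2', '3', '4'], lambda v: v != '2')]
```

Because the index argument is ignored, an object like this can only be walked once, front to back; indexing it directly would give surprising results.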

Comment by Fredrik on Thu, 17 Feb 2005

btw, raising StopIteration in a generator is a waste of effort -- just make sure you reach the end of the generator function, and Python will take care of the rest.

(in this case, this means that you can replace the while/test/access stuff with a plain for loop)
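For the case where you do want to keep the explicit position counter, here is a minimal sketch of ending a generator with a plain return. The names visible_items and check are hypothetical stand-ins, not Roundup code. It is written for Python 3, where this matters more than it did in 2005: since PEP 479 (the default behaviour from Python 3.7), a StopIteration raised inside a generator body is turned into a RuntimeError.

```python
def visible_items(values, check):
    """Yield the values that pass check, keeping an explicit position counter."""
    pos = -1
    while True:
        pos += 1
        if pos >= len(values):
            return  # ends the generator; no explicit StopIteration needed
        value = values[pos]
        if check(value):
            yield value


# Hypothetical check: only even numbers are "accessible".
evens = list(visible_items([1, 2, 3, 4], lambda v: v % 2 == 0))
```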

Comment by Fredrik on Thu, 17 Feb 2005

"it does not require wading through all the data up to the matching point on every call of next()?"

steffen, that's why the first example uses an instance variable to keep track of the current position.

(in Richard's generator example, he's using a local variable instead. and in Bob's simplified example, the position counter is hidden somewhere inside the for-loop machinery).
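A small sketch that makes the point visible by recording which values get checked on each call to next(). Everything here (the checked list, check, visible) is invented for illustration, and it is written for Python 3, where next(gen) replaces gen.next().

```python
checked = []  # records every value the permission check was asked about


def check(value):
    # Hypothetical stand-in for hasPermission: even values are accessible.
    checked.append(value)
    return value % 2 == 0


def visible(values, check):
    for value in values:
        if check(value):
            yield value


gen = visible([1, 2, 3, 4], check)

first = next(gen)             # checks 1 and 2, then pauses at the yield
after_first = list(checked)

second = next(gen)            # resumes after 2: checks only 3 and 4
after_second = list(checked)
```

The second call picks up where the first one stopped, so no value is checked twice.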

Comment by Matthew Good on Thu, 17 Feb 2005

Yes, itertools is your friend. You can try something like this:

from itertools import imap, ifilter

def multilink(clss, client, values):
    check = lambda v: client.db.security.hasPermission('View', client.userid, clss, itemid=v)
    html = lambda v: HTMLItem(client, clss, v)
    return imap(html, ifilter(check, values))

Also, check out the Xoltar toolkit for some nice functions for doing functional-style programming. The curry implementation is quite useful (and could be used instead of those lambdas above). Check out the links to the IBM developerWorks articles mentioned on that page for some good tips on using it.
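For readers on Python 3, a hedged sketch of the same idea as Matthew's snippet: itertools.imap and itertools.ifilter no longer exist there, because the builtin map and filter are already lazy, and functools.partial covers the curry use case. The has_permission and make_item functions are made-up stand-ins for Roundup's hasPermission and HTMLItem, so the signatures are assumptions.

```python
from functools import partial


def has_permission(permission, userid, classname, itemid=None):
    # Hypothetical stand-in: item '2' is treated as inaccessible.
    return itemid != '2'


def make_item(classname, itemid):
    # Hypothetical stand-in for HTMLItem.
    return (classname, itemid)


def multilink(classname, userid, values):
    # itemid is passed by keyword, so partial cannot bind the leading
    # arguments and leave it open positionally; a lambda stays simplest here.
    check = lambda v: has_permission('View', userid, classname, itemid=v)
    # The item constructor takes the value last, so partial fits.
    html = partial(make_item, classname)
    return map(html, filter(check, values))  # both lazy in Python 3


items = list(multilink('issue', '1', ['1', '2', '3']))
```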