Mon, 21 Jul 2003
PyPI has had some much-needed attention

The Python Package Index (PyPI) uses the very cool SQLite engine to store the database of package information. SQLite is cool because it's so simple to use and self-contained. Unfortunately, it's not a multi-user database: it locks the whole database while any one request is accessing it, so other requests arriving at the same time can be turned away. This caused PyPI some problems because ... well, PyPI is much more popular than I'd anticipated :)

After some brief analysis, I found:

  1. The RSS feed gets hit about every 30 seconds or so (on average)
  2. Some other PyPI page is hit at around the same frequency
  3. About one in three of those other hits is to the browse code, and the browse code was slow - taking up to 30 seconds to complete a request

Of course, this is all using averages, so during times of peak requests (i.e. lunchtime in the US ;) the rates are higher. And the combination of many requests and slow code results in users seeing "sorry, the database is locked".

To remedy this, I've:

  1. Cached the RSS feed, so it only rarely has to hit the database
  2. Significantly improved the speed (and accuracy while I was at it) of the browsing code
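
For the curious, the RSS caching idea can be sketched roughly like this. It is a minimal illustration and not the actual PyPI code: the `CachedFeed` class, the `build` callable and the 300 second lifetime are all names and numbers made up for the example.

```python
import time


class CachedFeed:
    """Keep a rendered feed in memory and rebuild it at most once per `ttl` seconds."""

    def __init__(self, build, ttl=300.0, clock=time.monotonic):
        # `build` is a hypothetical callable that queries the database
        # and returns the rendered feed as a string.
        self._build = build
        self._ttl = ttl
        self._clock = clock
        self._body = None
        self._expires = 0.0

    def get(self):
        now = self._clock()
        # Only touch the database when there is no cached copy yet
        # or the cached copy has reached its expiry time.
        if self._body is None or now >= self._expires:
            self._body = self._build()
            self._expires = now + self._ttl
        return self._body
```

With a hit every 30 seconds or so and an assumed lifetime of 300 seconds, only about one request in ten would reach the database; the rest are served from memory and never take the lock.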

So hopefully things will run much smoother. Please, go kick the tyres and let me know if I've broken anything :)

