Richard Jones' Log: PyPI Categories

Fri, 18 Jun 2004

What could be done to make the PyPI categories better? An argument has been put forward that the current PyPI categories are useless. I happen to believe otherwise. Historically, the full list derives from a combination of the category lists used by SourceForge and Freshmeat. I'm willing to concede that they're not perfect. Witness the multiple X11 window managers named individually when a single "window managers" category would do. Even given that, I believe they're a damn good start.

Some PyPI statistics on the occurrence of classifiers:

                       Number of packages   % of total   % of those with classifiers
All                    493 (wow!)
Any classifiers        373                  75%
Development Status     349                  70%          93%
Programming Language   297 (hmm ;)          60%          79%
Intended Audience      282                  57%          75%
Natural Language       134                  27%          35%

I reckon that's pretty good...
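
For anyone wanting to check the arithmetic, the percentages above can be reproduced from the raw counts. This is only a sketch: the helper name `pct` is made up for illustration, and it assumes the figures were truncated rather than rounded (which is what the numbers in the table are consistent with).

```python
# Raw counts taken from the table above.
TOTAL_PACKAGES = 493
WITH_ANY_CLASSIFIER = 373

COUNTS = {
    'Development Status': 349,
    'Programming Language': 297,
    'Intended Audience': 282,
    'Natural Language': 134,
}


def pct(part, whole):
    """Return part/whole as a whole-number percentage, truncated (assumption)."""
    return part * 100 // whole


def summarise():
    """Build (name, count, % of total, % of those with classifiers) rows."""
    rows = []
    for name, count in COUNTS.items():
        rows.append((name, count,
                     pct(count, TOTAL_PACKAGES),
                     pct(count, WITH_ANY_CLASSIFIER)))
    return rows


if __name__ == '__main__':
    print('Any classifiers: %d%%' % pct(WITH_ANY_CLASSIFIER, TOTAL_PACKAGES))
    for name, count, of_total, of_classified in summarise():
        print('%-22s %4d %3d%% %3d%%' % (name, count, of_total, of_classified))
```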

If you have any suggestions about potential changes to the categories, I'd be happy to hear them. I'd prefer that the suggestions be made on the catalog SIG mailing list though, or perhaps comments on this log entry, rather than directly to me, please ;) Questions to answer:

  1. What's missing?
  2. If the classifiers aren't missing, then what's the impediment to using them?
  3. Should PyPI require specification of at least one of each of the top-level categories?
  4. Somewhat related: Is the PyPI browse interface at all useful?

OTOH, wow! Almost 500 packages indexed in about 6 months! :)

Comment by anthony baxter on Wed, 23 Jun 2004

My main problem with the categories is that they're a pain in the arse to enter and get right.

I'd much rather see something like
['Category', 'Subcategory', 'Subsubcategory' ] than the current magic string format.

Comment by Richard on Wed, 23 Jun 2004

There's no reason why the distutils register command couldn't convert ['Category', 'Subcategory', 'Subsubcategory'] into 'Category :: Subcategory :: Subsubcategory' before submission.
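
Roughly, the conversion could look like the following. This is a minimal sketch, not anything that exists in distutils: the function names `join_classifier` and `normalise_classifiers` are hypothetical, and it assumes a setup script may mix list-style classifiers with the existing string form.

```python
def join_classifier(parts):
    """Join ['Category', 'Subcategory'] into 'Category :: Subcategory'."""
    return ' :: '.join(part.strip() for part in parts)


def normalise_classifiers(classifiers):
    """Accept a mix of plain strings and lists; return only strings.

    Strings are passed through untouched so existing setup scripts
    keep working; lists or tuples are joined with ' :: '.
    """
    result = []
    for item in classifiers:
        if isinstance(item, str):
            result.append(item)
        else:
            result.append(join_classifier(item))
    return result


if __name__ == '__main__':
    mixed = [
        ['Development Status', '4 - Beta'],
        'Programming Language :: Python',
    ]
    for classifier in normalise_classifiers(mixed):
        print(classifier)
```

A conversion like this would happen just before submission, so the index itself would still only ever see the 'Category :: Subcategory' strings.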

I guess I kinda assumed people would cut-n-paste from the list (which is why I added the --list-classifiers argument).