Richard Jones' Log: PyPI Categories
What could be done to make the PyPI categories better? An argument has been put forward that the current PyPI categories are useless. I happen to believe otherwise. Historically, the full list derives from a combination of the SourceForge and Freshmeat category lists. I'm willing to concede that they're not perfect: witness the multiple X11 window managers listed by name when a single "window managers" category would do. Even given that, I believe they're a damn good start.
Some PyPI statistics about the occurrence of classifiers:
Classifier | Number of packages | % of total | % of those with classifiers |
---|---|---|---|
All | 493 (wow!) | | |
Any classifiers | 373 | 75% | |
Development Status | 349 | 70% | 93% |
Topic | 312 | 63% | 83% |
License | 312 | 62% | 82% |
Programming Language | 297 (hmm ;) | 60% | 79% |
Intended Audience | 282 | 57% | 75% |
Environment | 178 | 36% | 47% |
Natural Language | 134 | 27% | 35% |
I reckon that's pretty good...
If you have any suggestions about potential changes to the categories, I'd be happy to hear them. I'd prefer that suggestions be made on the catalog SIG mailing list, or perhaps as comments on this log entry, rather than directly to me, please ;) Questions to answer:
- What's missing?
- If the classifiers aren't missing, then what's the impediment to using them?
- Should PyPI require specification of at least one of each of the top-level categories?
- Somewhat related: Is the PyPI browse interface at all useful?
OTOH, wow! Almost 500 packages indexed in about 6 months! :)
There's no reason why the distutils register command couldn't convert ['Category', 'Subcategory', 'Subsubcategory'] into 'Category :: Subcategory :: Subsubcategory' before submission.
I guess I kinda assumed people would cut-n-paste from the list (which is why I added the --list-classifiers argument).
My main problem with the categories is that they're a pain in the arse to enter and get right.
I'd much rather see something like ['Category', 'Subcategory', 'Subsubcategory'] than the current magic string format.
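For what it's worth, the list-to-string conversion discussed in these comments is only a few lines of Python. Here's a rough sketch; `to_classifier` and `normalize_classifiers` are made-up names for illustration, not anything that exists in distutils, and where such a hook would actually live in the register command is left open.

```python
def to_classifier(parts):
    """Join category components into one ' :: '-separated classifier string.

    Hypothetical helper, not part of distutils.
    """
    cleaned = [part.strip() for part in parts]
    # Reject empty lists and blank components instead of emitting a bad string.
    if not cleaned or any(not part for part in cleaned):
        raise ValueError("a classifier needs at least one non-empty component")
    return " :: ".join(cleaned)


def normalize_classifiers(classifiers):
    """Accept a mix of ready-made strings and component lists (assumed input shape)."""
    return [c if isinstance(c, str) else to_classifier(c) for c in classifiers]


if __name__ == "__main__":
    print(to_classifier(["Category", "Subcategory", "Subsubcategory"]))
```

Accepting both forms means existing setup scripts that already use the string format would keep working unchanged.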