Hello all! I would like to address the recent controversy raised publicly by Amber Brown.
Specifically, there is a popular opinion in the community that packages are added to the Standard Library only to decay there.
The common practice is to include an existing package in the standard library, or to introduce it there in the first place, and then maintain it there, syncing its release cycle with CPython's.
It looks like it could be beneficial to change that and adopt a policy under which the standard library is just a snapshot of selected external packages that keep evolving independently and can be upgraded to more recent versions separately.
This would keep a healthy, competitive space on PyPI, and would also allow installing a minimal core Python setup with only the packages an application really needs.
Clearly, there is a lot of work to be done and a lot of decisions to be made to achieve that. However, it looks like the Python community demands it and would benefit from it. Maybe it is a nice thing to target for Python 4.0 ;)
Disclaimer: this is a thought, not a strong opinion I hold. I would like to hear what others think about this and where the Steering Council could drive the community on this topic.
Historically, this is wrong. Many packages have improved in the standard library over the years. I can immediately mention three I’ve been involved with: pickle, multiprocessing and ssl.
However, this opinion can also be a self-fulfilling prophecy. If the opinion becomes widespread enough, people start believing it’s not possible to improve stdlib modules and therefore don’t even try.
I see the similarity there, thanks!
Nevertheless, it is worth bringing this up once more. I think a policy itself, and a long-term plan to do something like this, is crucial to the successful evolution of Python nowadays.
I agree with @pitrou. It is true that packages that get included in the stdlib become locked into the stdlib release cycle, and that isn’t always appropriate for a project. But if stdlib code is being “left to die”, then it’s because the original maintainers of that code have stopped working on it - and code in the stdlib is safer in that situation, because the python core devs will take over maintenance. How could that possibly be worse than an unmaintained PyPI package?
Of course, one way it could be worse is if the maintainers stopped supporting the package because it was added to the stdlib. But that’s where the idea of “packages go to the stdlib to die” is not merely wrong, but actively harmful, if it’s encouraging people to drop support for their projects. But we don’t propose packages get added to the stdlib without the support of the maintainer - so I can’t see how that would happen unless the maintainer viewed moving to the stdlib as a way for them to dump support responsibility on someone else (the core devs), which seems pretty irresponsible.
Well, there are examples on the other side: datetime, json, dataclasses, even asyncio.
In my opinion there should be a distinction between basic functionality and the tools around it. The standard library should provide a framework for tools. Why is json in the stdlib but not yaml and/or toml? It would be nice to have protocols defined in the stdlib, like db_api, serialization_api, concurrency_api, async_api. Then third-party packages could use them. Say multiprocessing and threading use concurrency_api, while json and toml use serialization_api. Something like that.
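For illustration, here is a minimal sketch (assuming Python 3.8+ and typing.Protocol) of what such a hypothetical serialization_api could look like; the name SerializationAPI and the dumps/loads shape are my assumptions, not an existing stdlib module:

```python
# Hypothetical sketch of a stdlib-defined "serialization_api" protocol.
# SerializationAPI is an assumed name, not an existing module.
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class SerializationAPI(Protocol):
    def dumps(self, obj: Any) -> str: ...
    def loads(self, data: str) -> Any: ...


import json

# The json module already matches this shape, so a third-party yaml or toml
# package could advertise the same interface and be used interchangeably.
assert isinstance(json, SerializationAPI)
```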
I see that there are a lot of historical reasons behind many things in the stdlib. I just think that there should be some kind of strategic plan for Python’s development.
Usually, there is an alternative or a fork on PyPI. When something is left unmaintained on PyPI, it is just a matter of switching to an alternative. When something is unmaintained and locked into Python, it just becomes ballast.
I’ve also mentioned the option of continuing to ship a package with Python but keeping its development cycle separate.
Think of it like a Linux distribution: Fedora doesn’t fork GNOME; the latter is just shipped with the former. Each major Python release could include major updates of stdlib packages and keep updating them to newer minor versions in minor Python updates.
That’s definitely something that’s been covered endlessly in other threads. I’m not going to repeat my objections here, but any such proposal would need to do the background research and address the questions raised in the past.
any such proposal would need to do the background research and address the questions raised in the past
I’d like to emphasize one of my main points here. I strongly believe this is the work the Steering Council should be doing. It is clear there are different opinions and suggestions out there. It is clear that there are some struggles with the Python Standard Library in the community. So, I would just like to see this strategic work handled by the Steering Council, and ideally to see progress and some results on these kinds of topics.
These are examples of modules that have better alternatives and wouldn’t be widely adopted in the community if they weren’t in the Standard Library.
For example, I know of a bug in datetime (I really do, and I’m going to file it soon). I would like to fix it and send an MR. I know what to expect with a project on PyPI/GitHub. However, I have no idea (hypothetically, though this is often seen in real life) where to file a bug against a Python standard module. Moreover, even if my MR is accepted, it is unclear when I will see the result upstream and whether that version of the Python distribution will be compatible with my application.
I just want that fixed version of datetime to exist on PyPI so that I can upgrade to it alone.
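For what it’s worth, this model already exists informally for a few modules: a PyPI package tracks a stdlib module but releases on its own schedule. A minimal sketch, assuming the importlib_metadata backport is installed from PyPI:

```python
# Prefer the independently released PyPI backport when available,
# falling back to the snapshot shipped with the interpreter.
try:
    from importlib_metadata import version  # PyPI release, updated independently
except ImportError:
    from importlib.metadata import version  # stdlib module (Python 3.8+)

print(version("pip"))  # e.g. "23.2.1", whatever pip is installed
```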
Also, a package hosted on PyPI could ensure that it supports all popular Python implementations and behaves identically on every one of them. This is not the case for every stdlib module, but it could be for many of them.
If you think there are better alternatives on PyPI then feel free to use them. But you don’t seem to be proposing anything reasonable and concrete here, sorry.
My key point is that I think the links provided by @encukou are where the discussion is best focused. While I appreciate the enthusiasm, @lig, I don’t think other topics are a better place to participate. Plus there’s a lot of background to read up on in order to participate effectively in the conversation.
This tone is a bit aggressive. It’s an opinion that some hold, but it isn’t universally held, so I don’t think the community is “demanding” anything. It’s a contentious issue because there are multiple valid viewpoints with no clear winner for the vast majority of users.
datetime predates a lot of things, so I’m not sure which packages that predate it you think it is overshadowing.
json was actually brought into the stdlib by the creator of simplejson, and the other alternative implementations are newer than json.
dataclasses is the closest you come to something coming in that shadows a public package which pre-existed it.
asyncio exists to help standardize the APIs and to provide an implementation. The latter part probably could have been left out in the end, but Guido disagreed about removing it due to the risk of breaking code that relied on it being in the stdlib (which is a valid reason to keep something and thus why there’s a thread about how to make removal easier).
Because someone contributed json, no one has done the work of trying to contribute yaml, and toml is not at version 1 yet.
I’d like to point out that those doubts are a matter of process and habit. Developers know what to expect from projects on GitHub, probably because they have contributed to a bunch of them in the past. They probably read through a project’s “CONTRIBUTING” file or something like that, or they may have read enough of them to know what to expect.
Similarly, Python and its stdlib also have a documented process they can look at to learn how to do all of those things. Maybe we could improve this process. Maybe there are ways to make contributors feel more welcome. The recent migration to GitHub will probably help with that.