PEP 594, take 2: Removing dead batteries from the standard library

That wasn’t at all what I suggested. “You could make a snapshot of cgi and cgitb, toss them up on GitHub” was meant to imply that an interested user could decide to maintain the code, take a snapshot of it, and host it as a separate repo on GitHub. I did not intend to imply the code would live on somehow connected to the python/cpython repo. To wit:

My apologies for not searching for the bsddb185 code first. As it turns out that was long before CPython was hosted on GitHub. I doubt I created a repo anywhere, just snagged the code and uploaded it to PyPI.

Maybe there’s some way to fork just the modules of interest from python/cpython. I’m not at all a git wizard. If that’s possible, perhaps part of the dead batteries PEP should show people how to do that so they don’t lose the dead batteries’ histories when they decide to tilt at windmills.

I know this wasn’t the intent, and nobody else who suggested it earlier intended it either. It’s just the practicalities of how it works out. If “we” put it on GitHub, it’ll be attached to someone’s account, and it’s hard to disown at that point. If “we” put it on PyPI, it’ll be attached to someone’s account - same as bsddb185 - and only that person can update it until it gets transferred.

Getting it out of the repo is just git checkout 3.10 (or whichever branch it is last in) and then copying the files. Or browse to an earlier branch in GitHub and download the file directly. It’s really no less obvious than trying to find another repository somewhere else, and it’s far more obvious what you get out of it (i.e. the file, and not somewhere to file issues, or someone to contact about it).
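
For the “download the file directly” route, a minimal Python sketch might look like the following. It assumes the raw.githubusercontent.com URL layout and that the file still exists on the chosen branch; the helper names `raw_url` and `fetch_module` are made up for illustration.

```python
from urllib.request import urlopen

# Assumed URL layout for raw files on GitHub; this is an observed
# convention, not something the thread or the PEP specifies.
RAW_BASE = "https://raw.githubusercontent.com/python/cpython"


def raw_url(branch: str, path: str) -> str:
    """Build the raw-download URL for a file on a given CPython branch."""
    return f"{RAW_BASE}/{branch}/{path.lstrip('/')}"


def fetch_module(branch: str, path: str, dest: str) -> None:
    """Download one file from the given branch and save it to dest."""
    with urlopen(raw_url(branch, path)) as response:
        data = response.read()
    with open(dest, "wb") as out:
        out.write(data)


# Example (needs network access, so it is left commented out):
# fetch_module("3.10", "Lib/cgi.py", "cgi.py")
```

This only grabs the file as it stood on that branch; it does not carry over any Git history.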

If you personally feel passionate enough about it, then you can be the person who does it. There’s nothing wrong with that, and nobody will stop you. But we decided not to do it “officially” for those reasons.

Who do you mean by “we”? I meant the guy who wants to keep cgi and cgitb alive.

I’m not sure I understand the response here. It sounds like @smontanaro was just suggesting to @jmr how to pull the code out into their own personal repository, publish it themselves to PyPI, and maintain it independently, which it seems @jmr had in fact already done on his own. Looking at the package name and description, both on PyPI and GitHub, prominent mention is made of the fact that it is a fork, and the standard library version is stated to be deprecated and slated for removal. Indeed, per the Wikipedia definition of “fork”:

In software engineering, a project fork happens when developers take a copy of source code from one software package and start independent development on it, creating a distinct and separate piece of software.

Therefore, given the meaning of “fork” is quite clear, I don’t see any real risk of user confusion as to whether the CPython core dev team is responsible for the forked version.

As to

it appears @smontanaro was specifically asking about ways to do so while preserving the Git history of the existing project in the new one, which can often be very useful when maintaining the code. Unfortunately, while I am aware of a few possible ways to do it (and have done it on occasion), all the ways I know of involve a lot of time, effort and Git black magic, particularly for a repo as large and long-lived as CPython, so I don’t think it’s really in scope for the PEP (but could be brought up elsewhere, such as this thread).

You may be right. I noticed it had already been done, which made Skip’s suggestion seem more generic (like the past ones had been) rather than specifically intended for Jack.

Sure, git filter-branch is a bit black magic, with scary
warnings in its manpage. The alternative it suggests though, is
actually pretty great and not all that hard to use:
GitHub - newren/git-filter-repo: Quickly rewrite git repository history (filter-branch replacement) (bonus points, it’s
written in Python).


That looks like a great solution, thanks; it might come in handy in the future. The last time I had to do this (a couple of years or more ago), those warnings weren’t there and git-filter-repo had been effectively unmaintained for many years, so I had to make do with git filter-branch, BFG and manual patching, which were all terribly suited to what I was trying to do, a task git-filter-repo is explicitly designed for.

Shameless plug: I’ve updated the test for os.sendfile that also used asyncore (GH-31876), a review is needed.

I’m posting here because that PR was spamming rebase notifications over the last few days while I was hunting down problems on the Ubuntu runner, so I suspect that its move out of draft could pass unnoticed.


The commands I used to start jackrosenthal/python-cgi on GitHub (a fork of the standard library cgi and cgitb modules, which are being deprecated in PEP 594):

git clone
cd cpython
git filter-repo --force --path Lib/ --path Lib/ --path LICENSE --path Doc/library/cgitb.rst --path Doc/library/cgi.rst
git mv Lib/
git mv Lib/
mkdir docs
git mv Doc/library/cgi.rst docs/cgi.rst
git mv Doc/library/cgitb.rst docs/cgitb.rst
git commit -m "Move files from their cpython paths"

Then the rest was throwing in a pyproject.toml, README.rst, and publishing a package.

So yeah … the git history was preserved :slight_smile:


I am thinking about making a Python package that would maintain these dead batteries as a pip-installable package, but with no mention of Python, the PSF or anything Python-related, no trademarks, and no claim of or direction to interact with any PSF members. I plan on calling it dead batteries. Is there anything I haven’t thought of, or anything else I should do?

I would strongly urge you to consider making the modules you need separate (distribution) packages, one for each module, and preferably separate source repositories (perhaps under a single dead-batteries GH org), perhaps with some form of automation for generating and maintaining the packaging/deployment infrastructure. They are a collection of otherwise mostly-unrelated code with unrelated purposes, and conflating them doesn’t seem to have much benefit aside from a modest reduction in the initial overhead of creating separate repos with boilerplate infra. On the other hand:

  • Any given user is likely to only want/need ≈one specific module, and having to install all of them just to get one is rather undesirable, since it not only consumes many times more resources, but also pulls in a substantial amount of old, unmaintained, vulnerable code (much of it security-relevant) in the process.
  • This means that many modules and/or packages will get installed with a single distribution package, which (particularly the former, which is very rare) is uncommon, generally discouraged and can result in unintuitive, unexpected or unintended behaviors when creating, installing and using the package. Furthermore, it means that all of them will be exposed as top-level import packages, instead of only the one the user actually needs.
  • Whenever any of them is updated, you’ll have to release an update with all of them, which is inefficient, leads to higher update churn and can bottleneck improvements getting out to users, since it means you can’t release specific subcomponents separately; furthermore, this leads to the version number becoming less meaningful wrt changes in a specific module.
  • There are additional difficulties on the source repo side, as you can’t delegate access to specific maintainers/contributors to be responsible for only specific modules, it’s more difficult to apply different coding standards, documentation and packaging methods to each one, and you cannot easily drop maintenance of specific modules without removing them from the codebase and distribution.
  • Contributors are likely to only be interested in a specific module, so this increases the size and complexity of the codebase, as well as the overall overhead with single-module changes.

So as mentioned, my recommendation is to pull out the modules you want to actively maintain into separate GitHub/GitLab repos under a common organization with a common boilerplate template, and then release them as separate PyPI (distribution) packages. You can use tools like All-Repos, cookietemple and cruft to easily automate common boilerplate changes, minimizing any extra overhead this incurs past initial creation (which can be done relatively quickly with tools like hub and gh).


I’m agreeing with the bulk of what you said, but wanted to raise this point:

  • This means that many modules and/or packages will get installed with
    a single distribution package, which (particularly the former, which
    is very rare) is uncommon, generally discouraged

I don’t think this specific point is true, and hope it’s not the consensus.

Python distributions have always been allowed to contain combinations of
zero or more modules, zero or more packages (with optional package data
files), zero or more scripts, and let’s ignore data files here :wink:

It’s true that it is very common to have one distribution install one
top-level package, with code neatly organized in sub-modules, and a good
chance to avoid module naming clashes (helped when the package name is
the same as the distribution project name, but even then not
guaranteed), but it should be fine to install more than one package, or
even a few modules if that’s the organization that makes sense for the
project or the author. I don’t think it should be discouraged (but it’s
good if tutorials show the typical thing, and that’s enough IMO).


Only thing you really have to make sure to do is to keep the license with the code.

Yeah, I think disparate modules shouldn’t be shipped together, but not restrict the general size of anything (else Django has issues :wink:). The middle ground is to pull modules together based on their grouping at The Python Standard Library — Python 3.8.13 documentation (I picked 3.8 because the docs are going to be listing the deprecations and thus not grouped in 3.9 and newer). Otherwise it’s open source and if people want a different grouping they can do the work to group it separately.

With the cgi import warning in 3.11.0b1 now, it’s become apparent that pip is relying on cgi.parse_header() in a couple of places in order to parameterize HTTP header values in responses. I see that http.client has a parse_headers() function but it needs to operate on a bytestream of the raw response and doesn’t actually chop up the params from the header values anyway. I had hoped requests was a way out, but the bits which looked promising to me there aren’t considered part of its public API.

Is there a recommended alternative to cgi.parse_header(), or is it better to just forklift the code from that module directly into pip?

Sorry, I should have kept digging. I found an answer in the old PEP 594 thread which suggests using email.message.Message objects.

Edit: link here in case anyone else gets stumped like I was… PEP 594: Removing dead batteries from the standard library - #14 by mjpieters

This is also mentioned in the PEP itself: PEP 594 – Removing dead batteries from the standard library

Yes, thanks! I’m clearly just going blind. I’m sure I read straight
past that several times over the last year and completely forgot it
was in there, then later failed to even expect something so specific
would be covered within the text of the PEP itself.

Anyway, to wrap it up, this does seem to satisfy pip’s use case for
the cgi module quite nicely.
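
For anyone else landing here with the same question, the approach can be sketched roughly as follows. This is a minimal illustration rather than pip's actual code; the helper name `parse_header` is chosen here only to mirror the cgi function, the sample header value is made up, and the result may not match cgi.parse_header() in every edge case.

```python
from email.message import Message


def parse_header(value):
    """Split a header value into its main value and a dict of parameters."""
    msg = Message()
    # Store the value under Content-Type so get_params() will parse it.
    msg["content-type"] = value
    params = msg.get_params()
    # The first entry is the main value paired with an empty string;
    # the remaining entries are (name, value) parameter pairs.
    return params[0][0], dict(params[1:])


# Hypothetical header value, for illustration only.
main, options = parse_header('text/html; charset="utf-8"')
print(main, options)
```

Surrounding quotes on parameter values are stripped because get_params() unquotes by default.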

Should a reference to those be added to the Replacement column of Table 1?

There’s also @jmr’s legacy-cgi · PyPI fork (see upthread).

PEPs, once accepted, are historical documents. I would rather not keep updating that table to list alternatives as that’s potentially unbounded.


Questions about the fate of, or a replacement for, cgi.parse_header() (and several other cgi utility functions) seem to have been, by a large margin, the most asked-about item deprecated by this PEP across various threads. While the PEP contains a good chunk of useful information that helps address this, it is evidently not that easy to find deep in the body text, particularly for the important case of users coming from the cgi module docs, which (unlike the PEP, as you note) are the canonical, up-to-date documentation once the PEP is accepted.

Right now, the only mention in the docs of the deprecation, much less potential replacements, is a note to see the PEP for details, with the link pointing to the top of the PEP. Therefore, users are going to be scrolling through the PEP to find more information on what they should do about it or replace it with, and the first thing they will come across mentioning cgi is the table, which indicates there is no replacement and does not link to the cgi section several pages further down that contains this information.

Therefore, the docs should directly link to the relevant sections in the PEP and any stdlib alternatives, and the PEP table should link the respective subsections for accessible navigation. I’ve opened issue python/cpython#92611 and PR python/cpython#92612 to do the former.