Formalize the concept of "soft deprecation" (don't schedule removal) in PEP 387 "Backwards Compatibility Policy"

DanielNoord · June 30, 2023, 9:25am

Not sure if this would be the right topic to continue the discussion, but I do see value in linters being able to distinguish between different kinds of “you need to take action” deprecations.
Making that available in a somewhat parseable way would still be valuable, I think.

vstinner · June 30, 2023, 9:44am

I submitted PEP 387 change adding Soft Deprecation to the Steering Council: Update PEP 387 Backwards Compatibility Policy: Add Soft Deprecation · Issue #199 · python/steering-council · GitHub

The ongoing discussion about exporting the list of deprecated APIs is not directly related to the idea of formalizing the concept of soft deprecation.

CAM-Gerlach · June 30, 2023, 10:21am

Ah, good point. To handle this, aside from linters parsing the JSONs for multiple Python versions, we could just copy the JSON for any removed APIs to a checked-in removed.json (combining them at docs build time with the generated deprecated.json in a single file, if desired). A fairly simple local script (with a make target and invokable via pre-commit) and CI check could ensure that anything removed from the generated deprecated.json (checked against the latest upstream version for the branch) automatically gets added to removed.json. So adds a small amount of (mostly-automated) overhead, but less than maintaining a whole separate module, I’d think.
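A minimal sketch of what such a local script could look like. The file layout is an assumption: it treats both JSON files as a single object keyed by API name, which the proposal does not specify, and the function names are hypothetical.

```python
import json
from pathlib import Path


def sync_removed(previous: dict, current: dict, removed: dict) -> dict:
    """Return a copy of `removed` that also holds every entry present in
    `previous` (the last upstream deprecated.json) but missing from
    `current` (the freshly generated deprecated.json)."""
    updated = dict(removed)
    for api_name, entry in previous.items():
        if api_name not in current:
            # setdefault keeps an entry that was already recorded earlier.
            updated.setdefault(api_name, entry)
    return updated


def main(previous_path: str, current_path: str, removed_path: str) -> None:
    """Read the three files, update removed.json in place."""
    previous = json.loads(Path(previous_path).read_text(encoding="utf-8"))
    current = json.loads(Path(current_path).read_text(encoding="utf-8"))
    removed_file = Path(removed_path)
    removed = {}
    if removed_file.exists():
        removed = json.loads(removed_file.read_text(encoding="utf-8"))
    updated = sync_removed(previous, current, removed)
    removed_file.write_text(
        json.dumps(updated, indent=2, sort_keys=True) + "\n", encoding="utf-8"
    )
```

A CI check could run the same comparison and fail if `sync_removed` would change the checked-in file.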

Yeah, that occurred to me as well. Some applications may prefer the formatted version (e.g. if used within our own docs, other reST files, etc.), but for those who want plain text, given that existing deprecation messages are typically short and don’t include much complex formatting besides roles, italic, bold and literals, we should be able to fairly reliably provide a “plain text” version with this stripped with a few lines of postprocessing.
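As a rough illustration of that postprocessing, here is a sketch that strips only the four constructs named above (roles, bold, italic, literals). The function name is hypothetical, and it is regex-based rather than a real reST parser, so nested or escaped markup is out of scope.

```python
import re

# :role:`content`, optionally with a domain prefix such as :py:func:`...`
_ROLE = re.compile(r":(?:[\w-]+:)?[\w-]+:`([^`]+)`")
_LITERAL = re.compile(r"``([^`]+)``")
_STRONG = re.compile(r"\*\*(.+?)\*\*")
_EMPHASIS = re.compile(r"\*(.+?)\*")


def _role_text(match: re.Match) -> str:
    content = match.group(1)
    # "display text <target>" form: keep only the display text.
    explicit = re.fullmatch(r"(.*?)\s*<[^>]+>", content)
    if explicit:
        return explicit.group(1)
    # "~pkg.mod.name" displays only the last component.
    if content.startswith("~"):
        return content[1:].rsplit(".", 1)[-1]
    # "!name" only suppresses the link; drop the marker.
    return content.lstrip("!")


def to_plain_text(message: str) -> str:
    """Strip roles, literals, bold and italic from a short reST message."""
    text = _ROLE.sub(_role_text, message)
    text = _LITERAL.sub(r"\1", text)
    text = _STRONG.sub(r"\1", text)  # must run before single-asterisk emphasis
    text = _EMPHASIS.sub(r"\1", text)
    return text
```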

Yeah; inspired by Greg’s comment, an optional :reason: tag with one of a pre-defined set of reasons (e.g. alias, insecure, unsafe, obsolete, superseded, notdeveloped, etc), each with a standardized description displayed on hover in the rendered docs and available to linters in the JSON, is part of my proposal, alongside a standardized, explicit categorical indication of current removal plans.

It’s somewhat tangential, yeah, though IMO determining how we’re going to communicate soft deprecations to users in the docs and elsewhere, and how users are expected to discover them, seems to be an important part of the justification and proposed implementation of soft deprecations (and was the original motivation for each of our proposals).

sinoroc · June 30, 2023, 6:31pm

I am not sure I understand. Would 3rd party libraries be able to generate such a JSON file to advertise their own “soft deprecations”?

Pierre-Sassoulas · June 30, 2023, 8:08pm

To add to what Daniel said, as a pylint maintainer I also prefer data file over an API.

I’d be glad to provide better, more fine-grained information and to be able to populate our internal data structure with a script directly from a source of truth: it would be less error-prone than reading the release notes and transcribing what we understood. It would also let us avoid raising a warning for a deprecation without removal, like optparse in pip, while still being warned. But linters can be disabled and won’t leak into downstream libraries. I think that pending removal should be a warning in both Python and linters, though.

Also, I welcome any upstream clarifications on deprecation pending removal, deprecation with no pending removal / soft deprecation / obsolete / obsolescence. (It seems that for some, “deprecation” meant what “soft deprecation” meant for others. Is there a difference between deprecation without pending removal and soft deprecation? The definitions are still unclear to me after reading the thread.) Right now in pylint everything is labelled “deprecation” (which, until reading this discussion, implied pending removal for me).

CAM-Gerlach · June 30, 2023, 10:59pm

The present proposal focuses on CPython, but conceptually there isn’t any reason they couldn’t, either by generating it themselves and hosting it, e.g. at the root of their docs, or by adopting our deprecated-removed directive in place of the existing deprecated, if we spin it out as a third-party Sphinx extension or it is accepted for inclusion as a built-in extension under sphinx.contrib.

Then, linters could either include the data for popular tools, someone could set up a simple central registry they could query/use, and/or linters could offer a config option where users could add the root docs URLs for desired projects that offer this for the linter to query at runtime, sort of like how Intersphinx works for Sphinx docs. But that’s probably best left as a followup discussion.

Yup, the extensions to the existing directive, and the resulting user-rendered output and JSON file would explicitly specify both the reason (by category and also in prose for UI text) for the deprecation, and also the removal plans, if any. This could allow linters or their users to easily filter all deprecations by urgency as well as by type.

Conceptually yes, as the former warn at runtime and the latter do not, and these could be reflected in the new values (and their associated descriptive hover text) that I propose to allow in the second (removal version) arg of deprecated-removed: notplanned (there is no current plan to remove the API anytime soon) vs. notscheduled (removal is planned at some point, but a specific removal version is not currently set, and per the deprecation policy would need to be a minimum of three years/feature releases after the current post-alpha version).
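To make the filtering idea concrete, here is a sketch of how a linter might model such entries. The notplanned and notscheduled values and the reason categories come from the proposal above; the record fields, the class name and the sample entries are assumptions made for illustration only.

```python
from dataclasses import dataclass

# The removal field holds either a version string such as "3.14"
# or one of the two keywords proposed in the thread.
NOT_PLANNED = "notplanned"      # no current plan to remove the API
NOT_SCHEDULED = "notscheduled"  # removal intended, version not yet set


@dataclass(frozen=True)
class Deprecation:
    name: str           # hypothetical field: dotted API name
    deprecated_in: str  # hypothetical field: version that deprecated it
    removal: str        # version string, NOT_PLANNED or NOT_SCHEDULED
    reason: str         # category such as "alias", "unsafe", "superseded"


def urgency(dep: Deprecation) -> int:
    """0 = removal version set, 1 = removal unscheduled, 2 = not planned."""
    if dep.removal == NOT_PLANNED:
        return 2
    if dep.removal == NOT_SCHEDULED:
        return 1
    return 0


def filter_deprecations(deps, max_urgency=0, reasons=None):
    """Keep entries at or below max_urgency, optionally limited by reason."""
    return [
        d for d in deps
        if urgency(d) <= max_urgency and (reasons is None or d.reason in reasons)
    ]


# Hypothetical sample data, not taken from any real deprecated.json.
SAMPLE = [
    Deprecation("pkg.old_alias", "3.9", NOT_PLANNED, "alias"),
    Deprecation("pkg.risky_call", "3.10", "3.14", "unsafe"),
    Deprecation("pkg.old_helper", "3.11", NOT_SCHEDULED, "superseded"),
]
```

With the sample data, `filter_deprecations(SAMPLE)` keeps only the entry with a concrete removal version, while `max_urgency=2` keeps everything.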

kknechtel · July 1, 2023, 1:02pm

Maybe I’m missing something here with the optparse example. Suppose it were removed from the Python standard library. Why exactly do the devs need to be warned particularly far in advance about that happening? Rather than having to plan ahead and make a schedule for converting everything to argparse, couldn’t they just… clone the last version of optparse known to work, and include it in the project as a vendored dependency?

vstinner · July 1, 2023, 3:32pm

I would suggest converting any deprecation which has no removal deadline to soft deprecations, especially the ones which don’t emit DeprecationWarning. I don’t see the point of deprecating an API if there is no plan to remove it: it’s just a way to annoy everybody with DeprecationWarning and create confusion.

If you want to deprecate an API but the code must not be removed: just “soft deprecate” it, so it will be clear that there is no plan to emit a warning and no plan to remove the API.
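The difference can be sketched in a few lines. The function names are hypothetical; the only claim taken from the discussion is that a soft-deprecated API emits no warning and has no planned removal, while a regular deprecation warns at runtime.

```python
import warnings


def replacement(x):
    """Hypothetical recommended API."""
    return x * 2


def hard_deprecated(x):
    """Deprecated with a planned removal: warns on every call."""
    warnings.warn(
        "hard_deprecated() is deprecated; use replacement() instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return replacement(x)


def soft_deprecated(x):
    """Soft deprecated: avoid in new code, use replacement() instead.

    No warning is emitted and no removal is planned; the status lives
    only in the documentation.
    """
    return replacement(x)
```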

I dislike the current status quo: some deprecations clearly communicate on the associated planned removal, whereas some deprecations just emit a DeprecationWarning, and some don’t emit a warning and don’t have a scheduled removal. So “a deprecation” has no clear definition: “it depends”.

Previously, there was even the concept of the Sword of Damocles. Ok, for now, you are safe, but BE WARNED: suddenly, one day, as soon as the Python community decides to invoke the HEAVY HAMMER “Python 4.0”, it will punch you hard in the face, and everything that you used will suddenly break (scream, cry, fear). I fixed all “pending Python 4.0 incompatible changes” to either schedule them soon, or unschedule these changes.

I dislike the concept of “Python 4.0 must be as painful as possible”. For me, it must be the opposite: migrating to Python 4.0 must be as smooth as possible. As smooth as migrating from any Python 3.x to Python 3.x+1.

kknechtel · July 1, 2023, 4:32pm

… But in this case, which version do we call 4.0? And why?

steve.dower · July 1, 2023, 5:54pm

You’re the first to ever suggest this concept. All we’ve tried to do is defer certain changes until we decide it’s worth breaking the world. In effect, they were a soft deprecation with an unscheduled “scheduled” removal.

Instead, moving to 3.(n+1) is more painful, because those changes were brought up to a sooner release. Perhaps it was worth hurting our users like that for the benefits of each individual change? But it was definitely a move from a soft deprecation into a hard deprecation.

There’s no need to denigrate the implementation of soft deprecations we were already using while you are in the process of proposing exactly the same thing.

encukou · July 3, 2023, 9:21am

Lately I’ve been thinking about effects of our policies on users – specifically, them needing to litter their code with if sys.version_info ≶…: or ifdef Py_HEX_VERSION ≶… to avoid warnings for their users.
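For readers who have not written this kind of boilerplate, a small illustration of its shape. The choice of typing.List versus the built-in generic list[int] (available from Python 3.9) is only an example picked because it runs on any supported version; typing.List does not itself emit a runtime warning.

```python
import sys

if sys.version_info >= (3, 9):
    # Built-in generics are available, so the older alias is not needed.
    IntList = list[int]
else:
    # Older interpreters only offer the typing alias.
    from typing import List

    IntList = List[int]


def total(values: IntList) -> int:
    """Sum a list of integers; the annotation works on either branch."""
    return sum(values)
```

Every such branch has to be found and deleted again once the older versions are dropped, which is the maintenance cost being described.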
If we don’t want to make them do that, but also want to remove old API eventually, and don’t want a massively breaking “4.0”, the process for API we’re in no rush to get rid of could be:

Leave old API be until the last version without a replacement goes EOL.
Then, turn on DeprecationWarning and schedule removal – perhaps in another decade or so.

We’d need to keep a list. We kinda already do, in What’s New, but something more directly usable by linters would be better.

That’s Soft deprecated. R also has superseded, which sounds pretty useful:

A superseded function has a known better alternative, but the function itself is not going away. A superseded function will not emit a warning (since there’s no risk if you keep using it), but the documentation will tell you what we recommend instead.

(Ideally there’d be an easy way for linters to flag these in new code…)

And defunct, which also sounds useful:

Defunct comes after deprecated. In most cases, a deprecated function will eventually just be deleted. For very important functions, we’ll instead make the function defunct, which means that function continues to exist but the deprecation warning turns into an error. This is more user-friendly than just removing the function because users will get a clear error message explaining why their code no longer works and how they can fix it.

I imagine the API still has docs entry, so old URLs and cross-references work, but the description has been replaced by porting instructions.
I love that idea.
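In Python terms, the defunct stage could be sketched with a decorator like the one below. This is not an existing stdlib facility; `defunct`, `DefunctError` and `legacy_parse` are hypothetical names used only to show the idea of keeping the name importable while replacing its behaviour with porting instructions.

```python
import functools


class DefunctError(RuntimeError):
    """Hypothetical error raised when a defunct API is called."""


def defunct(instructions: str):
    """Keep a function importable, but make every call fail with
    an explanation of how to port the calling code."""

    def decorator(func):
        @functools.wraps(func)  # keep name and docstring for docs and help()
        def wrapper(*args, **kwargs):
            raise DefunctError(
                f"{func.__qualname__}() is defunct. {instructions}"
            )

        return wrapper

    return decorator


@defunct("Call modern_parse(text) instead.")
def legacy_parse(text):
    """Old entry point, kept only so that callers get a clear error."""
```

Calling `legacy_parse("x")` then raises `DefunctError` with the porting hint, instead of the `AttributeError` or `ImportError` a plain removal would produce.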

Speaking of docs, I also like MDN icons (see e.g. the ToC sidebar for Document), as a compact way to flag the status in overviews:

[icon] = deprecated
[icon] = non-standard/implementation-specific
[icon] = experimental/unstable

oscarbenjamin · July 3, 2023, 12:11pm

The list is pretty good but most directly useful for downstream maintainers would be a very clear explanation of how to update the code. Taking a random example:

zipimport: Remove find_loader() and find_module() methods, deprecated in Python 3.10: use the find_spec() method instead. See PEP 451 for the rationale. (Contributed in gh-94379.)

There are links here for more information which is good. The statement that find_spec should be used is clear. Most likely though I have some code using find_module and I just want to know what is the equivalent code using find_spec and I want to know whether it is precisely equivalent. Here I’m faced with needing to go read docs, issues, possibly even a PEP to understand how to make the change but probably the change needed is fairly mechanical and usually any nontrivial cases are things that I just don’t care about in context. Since I have never looked at the zipimport module before it will take me some time to learn about it just to establish that I can make a mechanical change without breaking anything.

It is important to remember that often the maintainers/contributors who will make updates in downstream codebases are not the original authors of the code that needs to be updated. They might not initially know anything about the functionality that is being deprecated or even the particular downstream code that uses it. This is the situation in which making these updates is most troublesome and time consuming because it takes time to learn about some upstream/downstream modules/features that you didn’t previously know about (and possibly don’t really want to know much about). Notes about deprecations tend to be written with the presumption that the reader already knows something about the code/feature that is being deprecated but this is often not the case.

Ideally I would like to have something like this:

The old_func function is deprecated because it is unsafe. If code previously used old_func(a, b) then an exact equivalent would be new_func(b, a, unsafe=True). A better approximate equivalent would be new_func(b, a) since that is not subject to the vulnerability yyy that was the reason for deprecating old_func. If xxx feature of old_func is not needed then there is no reason to pass unsafe=True and the code should simply be updated to new_func(b, a).

The things that make it a lot easier (even just to review a PR that includes a fix) are clear statements to the effect of:

Most likely this is precisely how the code would be updated (old_func(a, b) -> new_func(b, a)).
This new code is (or is not) precisely equivalent to that old code.
Changing the code does (or does not) affect these aspects of its functionality/behaviour.
The differences that you might (or might not) care about are X, Y and Z.
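Using the hypothetical old_func and new_func from the example note above, the same information can be carried by the code and the warning text. The function bodies here are invented placeholders; only the call shapes old_func(a, b), new_func(b, a) and new_func(b, a, unsafe=True) come from the example.

```python
import warnings


def new_func(b, a, unsafe=False):
    """Hypothetical replacement; `unsafe=True` reproduces the old behaviour."""
    mode = "unsafe" if unsafe else "safe"
    return (mode, a, b)


def old_func(a, b):
    """Deprecated because it is unsafe.

    Exact equivalent:  new_func(b, a, unsafe=True)
    Recommended:       new_func(b, a)
    """
    warnings.warn(
        "old_func(a, b) is deprecated because it is unsafe; "
        "use new_func(b, a), or new_func(b, a, unsafe=True) "
        "for precisely equivalent behaviour",
        DeprecationWarning,
        stacklevel=2,
    )
    # Delegating makes the "exact equivalent" claim true by construction.
    return new_func(b, a, unsafe=True)
```

Because the old function delegates to the new one, a reviewer can check the equivalence claim by reading three lines rather than a PEP.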

effigies · July 3, 2023, 12:58pm

Strong +1. Maintaining libraries that support Pythons until EOL, this would make it much easier to resist calls to adopt things before the upgrade path is absolutely trivial, which would reduce social/mental overhead in addition to if sys.version_info ... boilerplate.

Edit: I would even request an extra minor version of breathing space. 3.7 just went EOL, 3.12 will come out this year. It would be nice if things deprecated in 3.8 weren’t removed until 3.13 so that cycling 3.7 out and 3.12 into our CIs can be separate from migrating APIs.

barry · July 3, 2023, 6:57pm

One consideration that’s missing is maintenance of the old, deprecated, defunct, eventually-to-be-removed APIs. What if there are bugs [1] in the old APIs, but nobody wants to fix them, or it’s too difficult? Do we just let old APIs rot unmaintained, and does that do a good service to the downstreams that are using those APIs? Perhaps we need a strong “Unmaintained” label (in the docs, code, warnings or whatever)?

[1] I’m assuming that any security vulnerabilities in “unmaintained” APIs will get fixed with a higher priority, but even that may not be the case.

vstinner · July 3, 2023, 7:35pm

The Steering Council approved my PEP 387 change. I merged my change: PEP 387: Add Soft Deprecation section (#3182) · python/peps@57b1d94 · GitHub

“The SC agrees with the proposal and accepts the PEP update”: Update PEP 387 Backwards Compatibility Policy: Add Soft Deprecation · Issue #199 · python/steering-council · GitHub

This discussion here was very productive. It’s good to have feedback. Obviously, we have to define some trade-offs and limits when we deprecate APIs, especially when we plan to remove them.

Soft deprecation may be a way to communicate that a module is no longer maintained and should be avoided.

gpshead · July 3, 2023, 7:51pm

Unmaintained is hard to define. It is best to assume that everything is unmaintained. Until it isn’t. This is even true for non open source software.

We rarely have what I’d personally consider official maintenance of specific things within Python. We don’t offer Service Level Obligations of any form in terms of how soon a bug in anything will be triaged to determine its relevance let alone fixed or a PR claiming to fix something for us will be reviewed let alone decided upon. (And implied: we can’t offer SLOs)

So if you wanted to define Unmaintained… I’d start by defining what Maintained means. I doubt these are terms we would ever collectively agree on the specifics of.

methane · July 3, 2023, 7:52pm

If old API is not so bad (e.g. typing.List), I’m +1. We need to track “deprecated, but not emit DeprecationWarning for now.”

But some deprecated APIs have a stronger reason to be deprecated than just that there is a clean new one. For example:

Not correct (e.g. doesn’t follow RFC), but can not be corrected for backward compatibility.
Unsafe. And it can not be fixed because of its API design.
Even if the API itself is not unsafe, its design makes it easy to introduce bugs/vulnerabilities.
Inefficient.

Not emitting DeprecationWarning makes users tend to keep using such old bad APIs even in new code.
We suppress DeprecationWarning for end users already. I don’t want to be more silent about bad APIs even if we do not rush to remove it.
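For context, the default warning filters ignore DeprecationWarning unless it is triggered by code running directly in `__main__`, so library users normally see nothing. A sketch of how a test suite can opt in and turn these warnings into errors; `uses_old_api` and `call_strictly` are hypothetical names.

```python
import warnings


def uses_old_api():
    """Stand-in for library code that calls a deprecated API."""
    warnings.warn("old API is deprecated", DeprecationWarning, stacklevel=2)
    return "result"


def call_strictly(func):
    """Call func with DeprecationWarning escalated to an exception.

    catch_warnings restores the previous filters on exit, so the
    stricter setting does not leak into the rest of the program.
    """
    with warnings.catch_warnings():
        warnings.simplefilter("error", DeprecationWarning)
        return func()
```

The same effect is available for a whole process with `python -W error::DeprecationWarning`.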

encukou · July 3, 2023, 8:12pm

Yes and yes.
Too many times, “bugs” are in rare edge cases that many users don’t really care about. And if they start caring, because circumstances changed or their library suddenly got popular, or someone benchmarked it, then good deprecation docs can tell them what to do.

I am talking specifically about the benign changes – renames, things with a two-line replacement, things that became no-op.
If an API is wrong/dangerous for most of its users, that’s another thing.

charliermarsh · July 4, 2023, 12:03am

Speaking on behalf of Ruff: for what it’s worth, I agree with @Pierre-Sassoulas and @DanielNoord in that I’d be perfectly happy with (and have a slight preference for) a JSON file, though the difference for me is minor enough that I’d probably just vote to run with whatever is easier for the maintainers. Either way, we’re likely going to preprocess the data into a format that fits our needs (via a Python script), so it’s a minor difference to me as to whether we grab a JSON file or query from the standard library in some way.

brettcannon · July 4, 2023, 4:01am

To be clear, this random example is one which can’t have a 1:1 replacement because the old API didn’t fit into the import APIs well and everyone used the old API in varying ways.