Deprecating importlib.resources legacy API

In 2018, users reported a substantial deficiency in importlib.resources API that prevented a common use-case previously supported by pkgutil and pkg_resources - namely, allowing for a directory (or tree) of resources.

The solution to this issue could not be reached in an API-compatible way and required a difficult redesign of the API (the so-called “traversable” or “files()” API), which landed in importlib_resources 1.3 and Python 3.9. This new API provided a simpler design and completely superseded the previous functions and was reviewed by one or two Steering Council members.

In March of 2020, the project announced the intention to deprecate the legacy functions and in May of this year began work to make that possible.

Originally, I’d hoped the deprecation would land in Python 3.10, but time passed and Python 3.10 beta was released before the deprecation could be introduced, so the plan was modified to introduce the deprecation in Python 3.11, with the removal to occur (per policy) in Python 3.13.

This week, the importlib_resources project released 5.3 introducing the deprecation warnings.

Some have argued that:

mass changes are being made again with aggressive deprecation breaking interfaces

I dispute this claim. The changes aren’t aggressive, but are matter-of-fact. The changes are being made slowly and deliberately. The fact of the matter is that users won’t take action until the DeprecationWarnings are raised, which is why I sought to raise these warnings soon to signal to the community that the changes are coming so they can take action.

the new apis are not better than the existing apis

This claim is flat-out wrong. This new API unlocks crucial functionality that the old API could not and does so with a significantly simpler and more intuitive implementation (re-using pathlib semantics).

the new apis … are creating a lot of churn for developers

Acknowledged. The only alternatives are either to not make changes at all or to move more slowly and create more churn for more projects by allowing the legacy implementation to replicate.

Because the importlib_resources project provides a backport of the functionality, it allows users to have more control over the end experience than another stdlib project. For example, instead of silencing the DeprecationWarnings (an acceptable short-term workaround), libraries also have the option to pin to importlib_resources<5.3 to avoid the deprecation. Moreover, because of the backport, the traversable API is available on Python 2.7+, so libraries and applications supporting older Pythons need only require importlib_resources>=1.3 on python_version < "3.9" in order to have compatible interfaces for all supported and recent-sunset Pythons. It’s a lot of work to maintain this backport, keeping changes in sync across the two projects, straddling the gap between pytest and unittest, de-duplicating documentation, testing everything multiple times, and more.
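
As a minimal sketch of the short-term workaround, assuming a hypothetical legacy_call() that stands in for any function emitting a DeprecationWarning (it is not part of importlib_resources), the silencing can be scoped to a single call site rather than applied process-wide:

```python
import warnings


def legacy_call():
    # Hypothetical stand-in for a legacy function that now warns.
    warnings.warn("legacy API is deprecated", DeprecationWarning, stacklevel=2)
    return "data"


def call_quietly():
    # Suppress DeprecationWarning only while the legacy call runs;
    # the previous filter state is restored when the block exits.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        return legacy_call()
```

Scoping the filter this way keeps unrelated deprecations visible, which matters if the silencing is meant to be temporary.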

there is no good migration pathway without reintroducing backport packages which is unacceptable complexity.

In this particular case, the change does expand the Pythons that require backports (from Python < 3.7 to Python < 3.9). It only causes one to reintroduce backport packages where support for Python < 3.7 was already dropped, but most packages still support Python 3.6, so the complexity of the migration is small and basically boils down to:

# setup.cfg
- install_requires = importlib_resources>1.3; python_version < "3.7"
+ install_requires = importlib_resources>1.3; python_version < "3.9"

# Python source
- if sys.version_info < (3, 7):
+ if sys.version_info < (3, 9):
    import importlib_resources as resources
else:
    from importlib import resources

And then replace the usage as described in the migration guide.

That hardly seems like unacceptable complexity.

If a project wishes not to re-introduce the backport for the intermediate Python versions, that’s possible (though admittedly more complex), but I wouldn’t recommend it. The backport is there to help users avoid that complexity. If a project wishes to refuse the backport, that’s their prerogative and their burden.

Unfortunately, short of waiting until there are no users of Python < 3.9, making this migration is going to cause churn, and delaying the deprecation will only encourage more users to adopt the deprecated behavior.

Most libraries will have had early exposure to the deprecation and future compatibility before Python 3.11 even hits beta. As a result, almost no users are even likely to encounter the deprecation warning in CPython because they will have addressed it in importlib_resources earlier, where there exists a great deal more flexibility for managing the incompatibility. It’s for this reason that I’d even argue that the two-release window is overly conservative for a library like importlib.resources that has an actively-maintained backport (though I’m not making that request here).

As with incompatible changes introduced in setuptools and importlib_metadata and others, I aggressively work to help users work through these important breaking changes to make the transition as smooth as possible. I’m happy to back out deprecations or breaking changes if a suitable workaround cannot be identified.

I’m reluctant to introduce PendingDeprecationWarnings and to introduce multi-year delays in the deprecation process if all that’s likely to do is delay the inevitable churn, increase adoption of deprecated functionality, and introduce more steps in the process.

I hope this post helps clarify the thought process on this recent change and for other backport-supported functionality. I believe the project is honoring the spirit and the letter of the stdlib policy, and I welcome feedback and discussion.

I think the main issue, which I admittedly missed when I first reviewed this proposal, is that downstreams that want to support a wide variety of environments (like flake8) will need to both account for the Python version, and the version of the backport.
This makes the usage of the backport no longer trivial as they will need to keep both usages, files() and the legacy API. Alternatively, they can silence the deprecation warning, but that is also intrusive and painful.

It is unclear to me how many projects this will actually affect. I would guess not that many, but it is still a big pain point and something we should consider.

I acknowledge that adding a multi-year deprecation timeline is painful to us, but as importlib.resources was not marked as provisional, I think this, or something in between, might be the most reasonable option.

Speaking as a bystander, I think the key piece of information I’m missing is that I can’t tell how important it is to get rid of the legacy APIs. Can you elaborate on that? If we kept the legacy APIs around indefinitely, would that add a bunch of maintenance burden, or would it be pretty easy to do?

From a quick skim of importlib_resources/_legacy.py (on the main branch of python/importlib_resources on GitHub), it looks like all the legacy functions are just 1-2 line wrappers around the new API?
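
To illustrate what such thin wrappers could look like, here is a sketch of legacy-style helpers expressed via files(). It is not copied from the project’s _legacy.py; the signatures are simplified assumptions, and it needs Python 3.9+ (or the backport, which exposes the same files() name):

```python
from importlib.resources import files  # Python 3.9+


def read_text(package, resource, encoding="utf-8"):
    # Legacy-style helper: one file directly inside the package, as text.
    return (files(package) / resource).read_text(encoding=encoding)


def read_binary(package, resource):
    # Same lookup, returning raw bytes.
    return (files(package) / resource).read_bytes()


def is_resource(package, name):
    # True only for files; sub-directories are not counted here.
    return (files(package) / name).is_file()


def contents(package):
    # Names of the entries directly inside the package.
    return [entry.name for entry in files(package).iterdir()]
```

For example, read_text("json", "__init__.py") would return the source of the stdlib json package’s __init__ module.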

Speaking as someone who just (as in today) implemented a migration to the new API, I agree the new one is clearly better and should be the preferred interface moving forward. I do however feel the deprecation is kind of rushed, and arguably even unnecessary (at least for now). Python 3.7 is supported until June 2023 as per PEP 537, which means that when 3.11 is released, library authors will have an awkward time maintaining the support matrix and be forced to use the backport, which kind of defeats the purpose of the module being in the stdlib in the first place. In fact, the entire importlib.resources just moves way too quickly for a (non-provisional!) stdlib module; 3.6 is still supported, and the backport we’re pulling in is already generating deprecation warnings for code written to cover 3.7 (not to mention 3.8). Geez. But sorry I got ahead of myself. Again, I think the new API is great, but stdlib needs to be more stable than this, otherwise we might as well not have that module in the first place.

how important is it to get rid of the legacy APIs?

From an implementation perspective, not at all important. The important step is to deprecate those APIs to stem further adoption and provide the (presumed inevitable) signal that these will go away. You’ll notice that nobody cared about this deprecation until the deprecation warnings came into play. There is a lot of value in removing the tests for these functions, as the tests are complicated and messy, so it would be nice to freeze the implementation to prevent regressions.

I used to be of the opinion that it makes sense to delay a deprecation until a replacement has matured. I’ve since learned that a replacement often doesn’t mature until the functionality it replaces is marked as deprecated, creating the incentive for adoption. As a result, my default action is to deprecate a behavior as soon as it’s known to be deprecated and there exists a presumed-suitable replacement, as long as one is prepared to back out the deprecation or iterate quickly to correct for any shortcomings in the migration to the replacement.

In the case of importlib.resources the implementation has had time to mature, being almost two years old and present in two Python versions.

The CPython backward-incompatible guidance policy doesn’t give any direction regarding the maturity or pace of changes other than for the “two minor release” window.

If library authors are concerned about the support matrix, they have an easy option: just rely on importlib_resources>=1.3. It’s simple and straightforward, and when they get to a world where only Python 3.9 and later is supported, they can simply switch to the stdlib.

Agreed. It would have been preferable, knowing all that we know now, to leave importlib.resources out of the stdlib until the traversable API was present. Perhaps the module should have been declared as provisional. Unfortunately, no one recognized the crucial deficiency (and other more minor deficiencies) of that implementation until it was too late. I don’t blame anybody, but am merely working within the constraints given.

It sounds like there’s widespread consensus that this project should hold itself to a standard higher than that given in the policy. In particular, I believe this is what is proposed for this change:

The backward-incompatible change should not be made until the replacement is available in all supported Python versions, and the DeprecationWarning should not be introduced until as late as possible (two minor releases prior to the removal).

Should this policy be applied to all backward-incompatible changes in the standard library? If not, under what circumstances should this more conservative approach apply? Should it also apply to backports of modules in the standard library?

Separately, the guide does provide for a PendingDeprecationWarning for cases where it is not yet known when the deprecated behavior will be removed. That could be employed here. Unfortunately, the policy doesn’t give guidance on when that warning should be employed.

There’s some guidance in this thread, but it basically boils down to don’t use PendingDeprecationWarning. Although it does look as if the proposed _deprecate decorator would issue PendingDeprecation until N-1 and only issue Deprecation on N-1, somewhat in conflict with PEP 387.

Thanks to the flexibility of third-party packaging, there’s quite a bit of control over versions. Third-party packages can iterate quickly and empower libraries and environments to choose their path (silence, pin, migrate, …). Admittedly, it’s harder for libraries that want to avoid conflicts with other libraries that may have different notions (e.g. flake8 couldn’t pin to importlib_resources==5.2 but probably could pin to importlib_resources<5.3, at least until some crucial feature was added to 5.4+).

Based on the feedback so far, I’m all but convinced to move forward with the proposal to back out the DeprecationWarnings and replace them with PendingDeprecationWarnings as proposed here. I’d like to get some feedback from the SC on my questions above.
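
For illustration only, a hypothetical decorator (the names deprecated and legacy_read_text are assumptions, not the project’s code) shows how small that swap can be, since the warning category is just a parameter:

```python
import functools
import warnings


def deprecated(category=PendingDeprecationWarning):
    # Hypothetical helper: wrap a function so each call emits `category`.
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated; use files() instead.",
                category,
                stacklevel=2,  # attribute the warning to the caller
            )
            return func(*args, **kwargs)

        return wrapper

    return decorate


@deprecated()  # pass DeprecationWarning later to escalate
def legacy_read_text(package, resource):
    # Placeholder body; a real wrapper would delegate to files().
    return f"{package}/{resource}"
```

Python’s default warning filters ignore PendingDeprecationWarning, so under this sketch most users would see nothing unless they opt in (for example with -W default or in a test runner that enables warnings).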

It’s gone unsaid, so I’ll say it. The advantage of having the module in the standard library is that at some point, the pace of change should slow and the stdlib can become the primary/only use.

If it’s not important to get rid of the legacy API, then I’d suggest just… not getting rid of it, at all?

Make the new API the one that gets highlighted in the docs and tell people that it’s better, with the old API relegated to a section called “Legacy interface” or something. That accomplishes the goal of getting people to use the new API in new code, without forcing churn for churn’s sake.

I’m not going to try to convince you since it’s quite clear you’ve made your decision and are not going to accept rejections either way. So may I know when this slowing down will happen? Because I’m not going to use that module and will actively oppose people using it until then.

Python 3.10 is when the module stopped being provisional.

I’m sorry to hear you’re resigned on the matter. I did not intend to bully or intimidate or pull rank. I was intending to share my perspective and convince you of the position I hold, but I’m open to change my position as well, and based on the feedback I’ve received thus far, I’ve already shifted substantially (as I mentioned, I’m already “all but convinced” not to deprecate this behavior).

My main reluctance with simply adopting the approach above is that it implies constraints on development that are far more conservative than the documented constraints. If backward-incompatible changes must be delayed until the functionality they replace is in all supported versions, that should be the policy for all libraries and not just those under active development with backports.

After all, if this library did not have a backport, it would not have had such substantial early adoption, and the issues would not have been identified as soon, and the usefulness as a replacement for pkg_resources would take many years of evolution.

My opinion is the churn is inevitable (unless you know someone who can volunteer to implement the perfect solution the first time), so let’s provide as much freedom as we can downstream to control the speed of evolution while advancing the state of the art at the policy-prescribed pace.

That’s true for importlib.metadata, but importlib.resources, which is the focus of this topic, was never provisional (AFAIK). I get these modules mixed up all the time.

It’s not about what you intend to do, it’s that you’re not changing the decision anyway, so I don’t feel I need to say anything more. You are free to do what you want, and users will adapt because we have to.

The more important question on this topic to me, as a user, remains unanswered, however. When can I expect a reasonably stable API from importlib.resources? (Not the third-party importlib-resources; for that I know I can depend on the version number.) Is it when the module can fully replace all resource access functionalities of pkgutil and pkg_resources, or are there other criteria? Does importlib.resources cover all functionalities of the two legacy libraries now, or what is still left uncovered? How can we be sure this churn does not happen over and over again?

I guess my main problem is that you’re admitting importlib.resources is not yet “stdlib level” stable, but did not say when users can expect that stability, nor display efforts to work toward that in this thread. So as a user, my only choice here, if I want stability, is to pretend the module does not exist in the stdlib right now. If that’s the case, I would like to know when I can stop pretending and start making use of that module.

here’s the code necessary to be warning-clean with the introduced warnings to load a file’s contents “f” from package “p” with minimum dependencies (not introducing a backport package when the stdlib module exists) and maximum compatibility for the currently supported python versions (python3.6+)

dependencies:

importlib-metadata;python_version<"3.8"
importlib-resources;python_version<"3.7"

code:

import sys

if sys.version_info >= (3, 9):
    from importlib.resources import files
    s = (files('p') / 'f').read_text()
elif sys.version_info >= (3, 7):
    from importlib.resources import read_text
    s = read_text('p', 'f')
else:
    from importlib_metadata import version

    if version('importlib-resources').split('.') >= ['1', '3']:
        from importlib_resources import files
        s = (files('p') / 'f').read_text()
    else:
        from importlib_resources import read_text
        s = read_text('p', 'f')

(and even this has some potential bugs due to unstructured version comparison)
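
A small sketch of that pitfall, using a hypothetical helper parse_release that is not part of any library mentioned here: comparing lists of strings is lexicographic, so '10' sorts before '3', whereas comparing tuples of integers orders releases as intended.

```python
def parse_release(version):
    """Turn a plain dotted release string such as '1.10.0' into (1, 10, 0)."""
    return tuple(int(part) for part in version.split("."))


# String-wise: '10' < '3' because '1' < '3', so 1.10 looks older than 1.3.
naive = "1.10".split(".") >= ["1", "3"]

# Integer-wise: 10 >= 3, so 1.10 is correctly treated as newer than 1.3.
structured = parse_release("1.10") >= (1, 3)
```

Note that parse_release only handles purely numeric versions; a string like "1.3rc1" would raise ValueError. Where the packaging library is available, packaging.version.Version is one option that handles such cases.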

if the warnings are removed and the legacy apis are allowed to live until 3.9 is EOL this is much simpler to do in a warning-clean way:

importlib-resources;python_version<"3.7"

import sys

if sys.version_info >= (3, 7):
    import importlib.resources as importlib_resources
else:
    import importlib_resources

s = importlib_resources.read_text('p', 'f')

I may be wrong here¹ but isn’t the problem more that importlib.resources is sticking to the letter of the deprecation policy, but maybe not to the spirit? The majority of stdlib modules don’t change much, so they aren’t pushing the limits of what the policy allows. That sets expectations of a higher level of stability than the rules actually mandate, and what we’re seeing is a violation of those expectations.

I feel like the stdlib deprecation policy is fair as a baseline, but it does not really take into account the realities for people who maintain libraries across many Python releases. I don’t think it would be unreasonable for Python users to expect the deprecation policy to say that it should always be possible to write code that works across all supported Python versions, without having to gate large chunks of code behind a version check, and without needing an external backport module. But that’s not the current policy, and it may be too strict to allow the level of change core Python and the stdlib needs to continue growing. That’s a larger question, though, and should be one for the SC and python-dev at least, and honestly IMO it should be something that the wider Python community has a say in as well.

The existence of a backports module is also a complex question. Ultimately, if this functionality is available on PyPI, why does it need to be in the stdlib at all? But that same question applies to all stdlib modules, and is essentially the “should we debundle the stdlib” question. I’m firmly in the “keep the stdlib” camp on that one, largely because I don’t think dependency management in Python is good enough that we can simply dismiss the idea that people might have trouble with using a backports module. It’s also worth remembering that on python-ideas, we routinely block proposals for new stdlib functionality with the argument “develop it on PyPI first, and when it’s stable, propose it for the stdlib then”. The debate we’re having here definitely feels a bit like double standards, in that context 🙁

As I said, I don’t have a strong opinion on what importlib.resources should do. But now that importlib.metadata is no longer provisional, I definitely would be speaking up if this same issue arose with that module - so my sympathies here lie with the people asking for more stability.

¹ I don’t use importlib.resources much, so this issue hasn’t hit me directly, but I do use importlib.metadata. The latter is also subject to a lot of API change, although until 3.10 it was provisional, so the two situations aren’t that comparable.

I think you could make the required code simpler by making the dependency be on a version of importlib-resources that provides the new API. Then you wouldn’t have the second check at all, and you could do something like (untested)

try:
    from importlib.resources import files
except ImportError:
    from importlib_resources import files

I’m sure you’ve done a lot more research into this than I have, but can you clarify why this doesn’t work? I feel as if people (like me) who aren’t quite understanding why using the backport is such a problem, may be missing something here.

reintroducing the backport, especially with constraints makes it difficult to integrate with other libraries (especially with conflicts being forbidden by pip).

I think “baseline” can imply deviations in both directions. But the policy is a minimum: it’s the fastest path you can take, save “extreme situations” formalized as a Steering Council exception.
I don’t think PEP 387’s fast path is right for most changes. It’s taken frequently, because volunteers are motivated to see their changes as soon as possible. But I think users would be better off if it wasn’t taken as often.

Sorry, I meant “baseline” in the sense of “minimum” so yes, I agree with your comments.