RFC: PEP 608: Coordinated Python release

Hi,

Let me try to reply to everybody.

First of all, this PEP doesn’t propose drastic changes: most items of the PEP were already done before the Python 3.8.0 final release, but it seems like most people didn’t notice :slight_smile:

  • Most if not all selected projects were compatible with Python 3.8 when Python 3.8.0 was released. This happened because different people worked hard to make it happen.
  • Python core developers and maintainers of these projects are testing various projects on the next Python, especially during the beta phase.
  • Bugs and regressions are reported to Python upstream.
  • Incompatible changes are discussed on python-dev, the bug tracker, Twitter, on project bug trackers, etc. Some incompatible changes have been reverted during the Python 3.8 beta phase.
  • Some bugs were considered serious enough to get the “release blocker” priority, which blocks a release according to PEP 101. As far as I know, all these release blocker issues have been fixed.
  • There are already multiple CIs running different projects on the next Python. For example, at Red Hat, we rebuilt Fedora on the Python 3.8 beta: it’s not just a “few projects”, but the whole operating system: more than 200 Python packages (I don’t know the exact number, maybe it’s 500+, Miro knows that better than me; maybe 200 was the number of packages broken by Python 3.8).
  • The release manager is already free to change a release blocker issue to deferred release blocker, or even “reject” the release blocker priority if they consider that it’s not severe enough.

The Python 3.8 beta phase discovered the PyCode_New() C API change: it was decided to revert this change and add a new function instead.

This PEP is not about doing new things; it’s more about communicating better around this work which is already being done, and coordinating better.

IMO the counter-productive part is more that each Python release breaks tons of Python projects :slight_smile: This PEP proposes a concrete solution to make Python releases less painful for users.

Many users remember the pain of upgrading from Python 2 to Python 3. We are close to having Python 2 behind us. It’s time to look at how to prevent such a situation from happening again.

It seems like the PEP doesn’t clearly explain when the compatibility is tested and who is going to pay attention to this.

I don’t expect that a single person, the release manager, will handle all issues. It’s the opposite: I would like to involve all Python core developers and all maintainers of all selected projects in the process. It’s more a human issue than a technical issue. That’s why the PEP title is “Coordinated Python release”: we need more interactions and coordination than what we have currently.

I don’t expect that selected projects will only be checked the day before the expected final release. No. My intent is to check these projects every day using a CI: see the “Distributed CI” section, where CI stands for Continuous Integration :wink: The formal part is that the release manager is expected to send a report at each beta release. I expect that the release manager will point to the projects which need the most help to be updated.

One practical issue is that project dependencies can evolve more quickly than the PEP will be updated. So I chose to only select the dependencies which are the most likely to be broken, but also the dependencies which are the most commonly used. For example, urllib3 is the most downloaded library on PyPI. If urllib3 is broken, you should expect a huge number of projects to be broken on the next Python. On the other hand, I chose to ignore distlib, which is very specific to pip and packaging.

Obviously, if distlib is broken by Python 3.9, pip tests will fail, and so Python 3.9 will be indirectly blocked by distlib, even if it’s not explicitly listed in the selected projects.

I tried to explain that, but very shortly, in the “How projects are selected” section: “Some dependencies are excluded to reduce the list length.”

About “(most notably requests)”: requests is explicitly listed as a “project” in the PEP list. I’m aware of pip vendored dependencies (src/pip/_vendor/).

Obviously, I’m open to discuss the exact list of selected projects :wink: It’s a Request For Comments and a draft PEP :slight_smile:

With my Red Hat :wink:, the “compatibility” check basically means that building a Fedora package does not fail, knowing that tests are run as part of the build process. But this definition may be too specific to Fedora, since Fedora uses specific versions, which can be different from the versions chosen by the upstream project for its CI.

If a project has a CI running on the next Python, the CI is expected to pass: it must be possible to install dependencies, to build the package (ex: build C extension), and its test suite must pass.

The CI can be run by the project directly, or it can be a CI run by Python core developers, or both. That’s why I use the term “Distributed CI”, instead of requiring a single CI.

For example, when I discussed with numpy developers, they told me that they like to control how dependencies are installed: which version, which OS, etc. For example, the OpenBLAS version.

Obviously, having multiple CIs testing different configurations is not counter-productive: they can detect more bugs. But we will have to decide at some point which CI is the one used to validate a Python version :wink: Maybe this choice should be delegated to each selected project? I guess that the natural choice will be the upstream CI run by the project.

If Django decides to not support Python 3.9, maybe it’s a strong signal to Python core developers that something has gone wrong and that we have to discuss to understand why Django doesn’t want to upgrade. Maybe we are introducing too many incompatible changes and it’s time to slow down this trend?

Maybe Python core developers and other volunteers can offer their help to actually port Django. This happens very commonly: it’s common that core developers who introduce incompatible changes directly propose pull requests to update projects broken by their change.

It happened for the new incompatible types.CodeType constructor: Pablo (and others) proposed different pull requests. It also convinced me to introduce the new method CodeType.replace(): projects using it will no longer be broken if the CodeType constructor changes again (gets a new mandatory parameter). I proposed pull requests to different projects to use it.
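
For illustration, here is the kind of code that replace() makes robust; the helper and names below are just an example, not one of the actual pull requests:

```python
import types  # CodeType.replace() is available since Python 3.8


def rename_code(func, new_name):
    """Return a copy of func's code object with a different co_name.

    Before replace(), this required calling types.CodeType() with every
    positional argument, so any new mandatory parameter broke the call.
    """
    assert isinstance(func.__code__, types.CodeType)
    return func.__code__.replace(co_name=new_name)


def greet():
    return "hello"


print(rename_code(greet, "salute").co_name)  # salute
```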

If Django doesn’t want to support Python 3.9, doesn’t want to accept pull requests, or pull requests cannot be written, well, the Python release manager should be able to exclude Django from the selected projects. I expect that such a decision will be a group decision.

Maybe ignoring Django is fine. But what about pip or urllib3? What if Python 3.9 is released even though we know that pip or urllib3 cannot or do not want to support Python 3.9? Is Python 3.9 without pip/urllib3 still relevant? That’s also the question asked indirectly by the PEP.

For the specific case of Django, maybe the Django code base is too big and the Django release cycle too slow to include Django in the selected projects. I’m fine with dropping it from the PEP if you prefer. But it would be nice to have clear rules for whether or not to include a project.

If you consider that the selected projects list is too long, we can make it way shorter. Maybe Python 3.9 should start only with pip and nothing else?

Ok, it’s now time for me to introduce you to a very experimental project that I started a few weeks ago: https://github.com/vstinner/pythonci

This project is supposed to be a way to test the selected projects on a custom Python binary with custom Python options. For example, using -X dev or -Werror (passed as command line arguments or as environment variables).
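
To give an idea of what that means in practice, a pythonci-style task boils down to something like the sketch below; the paths, the pytest invocation and the run_tests() helper are assumptions for illustration, not the actual pythonci code:

```python
import subprocess


def run_tests(python, project_dir,
              options=("-X", "dev", "-W", "error::DeprecationWarning")):
    """Install a project and run its test suite with a custom Python binary."""
    # Install the project (and build its C extensions) with the custom Python.
    subprocess.run([python, "-m", "pip", "install", "-e", "."],
                   cwd=project_dir, check=True)
    # Run the test suite with the requested command line options.
    subprocess.run([python, *options, "-m", "pytest"],
                   cwd=project_dir, check=True)


run_tests("/opt/python3.9/bin/python3.9", "/home/user/urllib3")
```

The same options can also be set through environment variables: PYTHONDEVMODE=1 for -X dev and PYTHONWARNINGS=error::DeprecationWarning for the -W option.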

I consider the project as experimental because I have different issues:

  • First, I chose to hardcode the commands used to install dependencies and to run the test suite of a project. I’m not sure that this approach is going to scale. I was scared when I saw the complexity of the tox.ini file of the coverage project.
  • I wrote a task for coverage which uses tox, but I failed to run coverage with the specific custom Python binary (the task ignores the custom Python and uses “python3.7” instead).
  • Python 3.9 already introduced incompatible changes which cause the job to fail early, while installing dependencies. In short, pip is currently somewhat broken in Python 3.9. In fact, pip has been fixed (the bundled html5lib no longer uses the collections ABC aliases but collections.abc), but pythonci runs an old pip version which isn’t compatible with Python 3.9… I’m not sure why, I should investigate.
  • All jobs fail very early using -Werror because even pip emits many warnings (not only DeprecationWarning). My plan is to experiment with treating only DeprecationWarning as an error… But pip also fails with -W error::DeprecationWarning, again because pythonci picks an outdated pip version.
  • I wrote pythonci for different use cases: test the master branch of Python with a patch, test a project with -X dev, test a project with -Werror.

By the way, pythonci includes patches for pip and setuptools to fix a few warnings, to be able to experiment with -Werror.

In short, I would say that right now, Python 3.9 is in a bad shape: it’s barely usable, and the most basic functionality, like pip, is broken… Maybe I’m wrong and it will be fine in practice.

All these issues also convinced me to propose this PEP.

I don’t think that a single CI can answer all open questions. Some jobs may only be relevant to Python core developers.

For example, I would like to drop the “U” mode of the open() function: https://bugs.python.org/issue37330 But I have no idea how many projects would be broken by this change… 4 months ago, when I tried, even building Python was broken… because of Sphinx… because docutils had been ignoring the DeprecationWarning since Python 3.4. Moreover, when I reported the issue to docutils with a patch… I was told that docutils was already fixed, but there was no release yet! (A new docutils version with the fix has been released in the meantime.)
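
For context, the “U” mode is redundant in Python 3: universal newlines are already the default in text mode, so affected code only needs a one-character change (the filename here is just an example):

```python
# Deprecated spelling: emits a DeprecationWarning and would break if the
# "U" mode were removed.
with open("setup.cfg", "rU") as f:
    data = f.read()

# Equivalent spelling in Python 3: universal newlines are the default
# behaviour in text mode.
with open("setup.cfg", "r") as f:
    data = f.read()
```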

It would be great to have a tool (like pythonci?) to check that the selected projects still run fine while working on an incompatible change: run the tool manually before merging a PR.

This is not a theoretical issue: pip was broken by the removal of the collections ABC aliases. It was an issue in html5lib which has been fixed: a new compatible pip version has been released in the meantime.
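
The incompatible change itself is tiny. The usual fix or compatibility shim looks something like this (a generic sketch, not the actual html5lib patch):

```python
try:
    # Old spelling: the alias re-exported from the collections package,
    # deprecated since Python 3.3 and removed in newer Python versions.
    from collections import Mapping
except ImportError:
    # New spelling: import the ABC from its real home.
    from collections.abc import Mapping
```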

About DeprecationWarning, one reason why developers ignore them may be that it’s not easy to separate warnings emitted by the stdlib from warnings emitted by third-party code.

At least for Python core developers, it would help to run selected projects with DeprecationWarning treated as errors, but only for warnings emitted by the stdlib.

Would it be possible to develop a special warnings filter for that?
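
As a starting point, here is a rough sketch of such a filter, written as a wrapper around warnings.warn() that escalates a DeprecationWarning only when the code issuing it lives in the standard library. This is my own sketch under assumptions (it does not catch warnings raised directly from C code, and the site-packages check is naive), not an existing warnings feature:

```python
import sys
import sysconfig
import warnings

STDLIB_DIR = sysconfig.get_path("stdlib")
_original_warn = warnings.warn


def warn(message, category=UserWarning, stacklevel=1, source=None):
    """Treat DeprecationWarning as an error, but only when it is issued
    by a module living in the standard library directory."""
    if isinstance(message, Warning):
        category = type(message)
    if category is not None and issubclass(category, DeprecationWarning):
        emitter = sys._getframe(1).f_code.co_filename
        # site-packages lives under the stdlib prefix, so exclude it.
        if emitter.startswith(STDLIB_DIR) and "site-packages" not in emitter:
            if not isinstance(message, Warning):
                message = category(message)
            raise message
    # +1 on stacklevel to compensate for this extra wrapper frame.
    return _original_warn(message, category, stacklevel + 1, source=source)


warnings.warn = warn
```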


(Reply to Paul Moore’s email on python-dev)

On Fri, 25 Oct 2019 at 17:37, Paul Moore p.f.moore@gmail.com wrote:

It’s also worth noting that even with pre-release testing, there’s
still the non-trivial problem of getting any necessary fixes
implemented and co-ordinated. That’s a people-management issue,
though, and IMO needs handling flexibly and in a way that’s
sympathetic to the fact that most of the projects involved are
volunteer-based and resource starved.

It seems like you understood the deep roots of the PEP :slight_smile:

Many Python projects are understaffed, whereas Python core developers are putting incompatible changes into Python. It’s done without any kind of coordination. Once Python is released, it’s too late to revert changes.

Incompatible changes should be reduced to ensure that understaffed projects are able to handle them.

As I wrote in my previous comments, in practice, core developers are already helping to update many core Python projects for incompatible changes.

If selected projects fail to be updated to the next Python, we will be in a very bad situation which smells more and more like the Python 2 => Python 3 failure.

The question is more why Python is breaking projects at each release, and if it’s worth it.

One of my biggest reservations about Victor’s proposal is that it’s
basically removing flexibility and demanding extra testing, with
little or no explanation of how we’d make that sustainable.

If we stop putting incompatible changes in the next Python, PEP 608 becomes basically free :slight_smile:

We have to find the right balance for incompatible changes.

About the “extra testing”, I replied in previous comments.


I would like to propose PyPA & PyCQA projects & celery to be considered as well :slight_smile:

I would think the PEP is over-optimistic about what can be achieved for Red Hat/Mac/Windows.

Limiting the focus to the common ultra-basics would already be great:

  • zero constraints on Python delivery; just get community focus/efforts on the bottlenecks so that the whole ecosystem is ready sooner (the bottlenecks can vary at each version or beta cycle, depending on the PEPs),
  • suspected “always a bottleneck” projects: CI chains + Cython + NumPy + Jupyter Notebook,
  • ugly patches are ok in alpha/beta to keep the bottlenecks out of the way of the rest of the community,

and maybe some environments should be dropped, at least at the alpha/beta stage, as they increase the effort for an ever-shrinking effect:

  • Windows 7/8 and 32-bit Windows (I see 32-bit mostly dying with Windows 7),
  • Old Mac/Linux versions.

I think this is an excellent idea in general, even if it will still take some time to hammer out the details. Testing some core packages for compatibility with the new release is very good for letting libraries know what to adapt to (generally, projects that are actively maintained would surely be willing to work on removing deprecated methods, or at the very least accept pull requests to that effect), and for having a wider range of projects available at release time.

Most projects (even those as big as pandas) do not test against Python master, and this is something that could conceivably be done by CPython itself.

This mode of development is also not unprecedented: for example, there’s the community build for Scala or the crater runs for Rust.

IMO, a minimal list to start with could also include Sphinx, particularly since we depend upon it for building Python’s documentation.

Also, I think it would be helpful to have more than one category of projects that the list would include, such as a “suggested release blocking” section that includes only a few stable projects and a “non-blocking” section that includes significantly more packages. This would allow us to expand upon the list further over time, without increasing the maintenance cost as much for each addition. It could potentially be only two sections or perhaps many sections with increasing compatibility priorities.

I would propose that new packages added to the list could start in a lower compatibility priority section, moving up or down depending on whether they prove to be stable and responsive to fixes over time. Regardless of the priority level of any package, we would of course maintain the discretion for issues to be a release blocker or not (as the PEP suggests).

These different sections could help give us an idea of where to focus our efforts, and give us a better understanding of compatibility issues. It’s far more meaningful if a known “stable” package suddenly breaks, in comparison to one that is new to the list or commonly has compatibility issues. In a way, it’s not entirely different from how we think of the buildbots.

My concerns about the ambiguity here remain. As a pip core developer, my question to you in that case would be, precisely what commitments are you expecting from the pip developers if this proposal were to be accepted? What would change for pip?

  • Are you expecting us to add the dev version of Python to our CI? Is “whatever Travis/Appveyor/Azure happen to have available” sufficient, or will we be expected to add a CPython build to our CI?
  • Are you expecting us to re-run our CI when a new Python release occurs (at the moment we only run CI when a pip commit happens)?
  • Are you expecting us to run CI repeatedly on our release tags and/or master? At the moment we run CI on commit but we don’t run CI again unless another commit occurs.
  • Are you expecting us to let the CPython project know if our tests fail? Do we need to triage such failures to confirm if they are caused by a CPython change or something else (environmental issues, pip relying on undocumented behaviour, etc)?
  • Who do you propose communicates these issues to our vendored dependencies? Are you OK with our policy that we generally don’t patch vendored dependencies, but wait for a downstream fix to be released?
  • Do you have any expectation that we prioritise fixing failures that are triggered by CPython changes? What if our test failures are delaying a release?

I could go on - there are many questions that would need answering here, if this were to become a formal part of any process. Of course, if it’s simply a general “good intention”, that we try to ensure that pip works when a new Python version is released, then (a) that wouldn’t involve much extra work from the pip developers, but (b) it’s what we already do… And general “good intentions” don’t need a PEP :wink:

But as it stands, I’d personally be against pip agreeing to be part of this PEP. (The other pip developers may have different opinions - we’d have to discuss this internally and come up with a project response if it came to the point of signing this PEP off).


Hi Paul,

Thanks for these interesting questions. It seems like the PEP needs some clarification :slight_smile:

No.

It’s better if a selected project has a job to test the project on the next Python (the job doesn’t have to be mandatory), but it’s not required by the PEP.

My plan is that Python core developers (at least me) will add a CI for projects which are not tested on next Python yet. Maybe maintainers of selected projects will be kind enough to help us on this task :wink:

No.

For the short term, my plan is more to have a script similar to my experimental https://github.com/vstinner/pythonci project which would be run manually. Only the Python release manager is expected to do such a manual check.

But for the long term, it would be better to have a CI, so the checks run continuously.

No.

For projects less active than CPython (ex: if the CI is not run daily), we can do manual checks and/or have a separate CI.

IMHO we will need a separate CI anyway, especially to test Python changes on selected projects before merging a change. I’m thinking of changes known to be incompatible, like my https://github.com/python/cpython/pull/16959

No.

The Python release manager will have to check the status of the selected projects at each beta, rc and final release.

I hope that project maintainers of selected projects will be more proactive to report issues upstream, but it’s not a PEP requirement.

Honestly, I don’t want to go so far in terms of organizational detail in the PEP. I prefer to let developers organize themselves :slight_smile: Most developers are kind enough to report issues directly to the proper project.

The strict minimum required by the PEP is that the release manager detects that pip is broken. That would be an enhancement compared to the current blurry status quo.

In my experience, the issue is first reported to pip, and then it is reported to the downstream project. For example, I was somewhat involved in reporting the deprecated collections ABC issue to html5lib and getting an html5lib release. You may notice that Python core developers already coordinated with pip to wait until pip was fixed before removing the collections ABC aliases :wink: Such coordination is not something new.

Let me try to explain that differently. The PEP doesn’t require pip developers to do more work. The PEP gives a trigger to pip developers to force core developers to revert a Python change if something goes wrong.

The PEP makes the assumption that all open source projects are understaffed and that CPython has a larger team than other selected projects. Even if it’s not written down, my expectation is that core developers will help to fix the broken selected projects, because these projects would block the final Python release.

My expectation is that if pip developers are simply too busy to fix an issue, CPython must be fixed to not bother pip developers. Or the CPython release is blocked until pip is fixed, if it’s better to let pip developers fix the issue themselves. pip and CPython are not exclusive: there are Python core developers working on pip :wink:

But sometimes, it’s just a matter of weeks or even of days. That’s why the PEP gives the freedom to the release manager to ignore a broken project for a release.

IMHO it’s fine if Cython is broken for the first Python beta releases, but it’s not ok for the first rc release. And it must block a final release.

Let’s say that Cython is badly broken on Python 3.9 because of a last minute bugfix between two rc releases. Cython is fixed, but the Cython release manager is traveling. The Python release manager can try to coordinate the Python and Cython releases, or take into account that Cython is going to be released soon with a fix, and release Python anyway.

The coordination can be relaxed when a project is already fixed, but I would prefer to not release Python 3.9 final knowing that pip is badly broken, that no fix is available and no one is available to fix it. The PEP is a process to reduce the risk of ending in such situation, by making sure that bugs are detected before the final release.

If the PEP is modified to not require anything, IMHO the whole PEP becomes useless.

The “DeprecationWarning is being ignored” section of the PEP is a concrete example of such an issue. Developers are “expected” to take DeprecationWarning warnings into account, but they don’t. Result? Python frequently breaks random projects when removing a feature, even one that was deprecated for 10 years.

The PEP must be strict on some specific points. I expect you to help me to clarify what should be strict and what doesn’t have to be strict :slight_smile:

Technically, Python doc can be built even if Sphinx is not compatible with the latest version of Python: you can use an older Python version. No?

But Python without pip is barely usable.

I’m thinking aloud about the strict minimum set for selected projects. Adding any project means adding indirectly many other dependencies and Sphinx has many dependencies.

The PEP gives many new tasks to the release manager. I would prefer to not add confusion with “non blocking projects”.

I would prefer to stick to the minimum requirements for a Python release in the PEP. Obviously, everything helping to get a better release can be done without a PEP and is welcome (as already written in the PEP!) :slight_smile: You are free to add tons of CIs. You don’t need the Python core developers’ approval, nor the approval of the selected projects’ maintainers :wink:


I heard that such “world rebuild” is also done in CPAN before a Perl release.

But there’s a big difference between trying to do this and requiring we do this. So it’s great what you want to accomplish with the PEP, but that’s best-effort compared to requiring we do it. And if you mean to make this a goal and not a rule then this should either be an informational or process PEP or instead be in the devguide and not be a PEP.


What about if this were describing a project that the PSF could offer a grant for?

Resourcing it is the big challenge - everyone should agree this is a good thing to achieve, but that doesn’t magically produce the ability to do it.

But perhaps if this PEP could become a statement of work, we’d be able to pay someone to do it.

@ewa.jodlowska - any thoughts?


What part of “it” would it pay for? Even if you pay someone to test all listed projects and file issues and PRs for the detected regressions, the PRs must still be reviewed and approved by a core developer of each of those projects.

As for this PEP, I agree with others: this is much too big a hammer to solve the problem at hand.

Also, as the developer of a third-party project (not listed in the PEP, though), what would help us most for testing a new Python release is to have conda-forge or Anaconda binaries for it in time. Right now 3.8 is available from neither.

We discussed this proposal at the Steering Council meeting this week, and our key conclusion was that we don’t think a PEP is the right vehicle for pursuing this idea.

There’s no “CPython Buildbots” PEP for example, there’s just a note in PEP 101 to check the status of the stable buildbots, and then assorted bits of documentation on managing the Buildbot fleet.

(I’ll put my own suggestions for how to proceed in a separate post from the SC level feedback)

Rather than pursuing this as a release process change, I think it makes more sense to pursue this as a CI enhancement, similar to the refleak hunting build, or the speed.python.org performance benchmarking.

That way the exact set of projects tested can be reasonably fluid (rather than being inflexibly locked down in an approved PEP) and chosen both to exercise various aspects of the standard library and to assess core ecosystem compatibility implications for a change.

If there are any PEP level changes at all, it would just be in the form of an additional note in PEP 101, and even that might not be needed if the standard practice is to file release blocker bug reports when the compatibility testing finds problems.


Sorry for jumping into this a bit late, I was told about this discussion a few days ago by @steve.dower.

I have a script on my home computer that I run a few times a week: it builds me a virtual environment with the master branch of {cpython, cython, numpy, scipy, matplotlib, pandas, ipython/jupyter, …} (basically the whole scientific Python ecosystem through to my day-job code), and I do most of my day-to-day development on that machine in that environment. It is a terrible brute-force bash script that has things like where I have various projects checked out hard-coded, but it does have the right build order baked in. I’m happy to share if people are interested.
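
To give a flavor of the idea (not the actual script), the core of it is just building CPython master into a private prefix and then installing each project checkout against it, dependencies first. Here is a rough Python equivalent, where the paths, the project list and the -j8 value are assumptions:

```python
import os
import subprocess

CPYTHON = os.path.expanduser("~/src/cpython")   # assumed checkout location
PREFIX = os.path.expanduser("~/py-master")      # private install prefix
VENV = os.path.expanduser("~/venv-master")
# Master-branch checkouts, listed in dependency (build) order.
PROJECTS = ["cython", "numpy", "scipy", "matplotlib", "pandas", "ipython"]


def run(cmd, cwd):
    subprocess.run(cmd, cwd=cwd, check=True)


# Build and install CPython master into the private prefix.
run(["./configure", f"--prefix={PREFIX}"], CPYTHON)
run(["make", "-j8"], CPYTHON)
run(["make", "install"], CPYTHON)

# Create a virtual environment based on the freshly built interpreter.
run([os.path.join(PREFIX, "bin", "python3"), "-m", "venv", VENV], CPYTHON)
pip = os.path.join(VENV, "bin", "pip")

# Install every project from its master-branch checkout, dependencies first;
# --no-build-isolation reuses the build dependencies already installed.
for name in PROJECTS:
    run([pip, "install", "--no-build-isolation", "."],
        cwd=os.path.expanduser(f"~/src/{name}"))
```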

I think it makes more sense to pursue this as a CI enhancement, similar to the refleak hunting build, or the speed.python.org performance benchmarking.

This makes a lot of sense to me. I think a CI running master-branch Python against the latest stable releases of projects (installed from PyPI!) would be a very good thing. Given that there will not be wheels yet, it may be worth using some sort of staged build.

I agree with @pf_moore’s concerns about this putting more burden on under-resourced projects, but we are going to see these failures one way or the other when Python is released so getting notified earlier would be better.

Running the master cpython - master project combination on CI is probably less valuable (but easier to get going as most of the CI services have a ‘nightly’ option that many of us use already).

…if Django, why not matplotlib, Flask…

As the lead Matplotlib developer I am :+1: on that suggestion!


Would you mind sending me your script at vstinner@python.org? Thanks.

Responded to @vstinner via email.

Another thought that is worth surfacing publicly is that it may be worth talking to the conda-forge folks. They have a correct machine-readable dependency graph, know where the stable sources are, and have build scripts for everything. I wonder if a pruned version of the graph they are using to run the py38 rebuild could be re-purposed for this?
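
To make “pruned” a bit more concrete, here is a toy sketch of what that could look like, assuming the graph is available as (package, dependency) edges and using networkx; the package names and edges are made up for illustration:

```python
import networkx as nx

# Hypothetical (package, dependency) edges extracted from such a graph.
edges = [
    ("pip", "setuptools"),
    ("numpy", "cython"),
    ("pandas", "numpy"),
    ("matplotlib", "numpy"),
    ("sphinx", "docutils"),
]
graph = nx.DiGraph(edges)

# Keep only the selected projects plus everything they transitively depend on.
selected = {"pip", "pandas"}
keep = set(selected)
for project in selected:
    keep |= nx.descendants(graph, project)
pruned = graph.subgraph(keep)

# Rebuild in reverse topological order, so dependencies come first.
print(list(reversed(list(nx.topological_sort(pruned)))))
```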
