Disappearing macOS packages on Python.org

Hello CPython devs!

I have a question about CPython release artifacts. I maintain cibuildwheel, and part of that project downloads and installs Python distributions from Python.org. For reproducibility, we hardcode the URLs that we’re downloading CPython from.

Earlier today we got a few messages from users that their builds were failing, because cibuildwheel couldn’t find a CPython download. The macOS 3.10.0 release used to be available at:

https://www.python.org/ftp/python/3.10.0/python-3.10.0-macos11.pkg

But that gives a 404 now. According to the downloads API and the download button on Python.org, the correct URL is now:

https://www.python.org/ftp/python/3.10.0/python-3.10.0post2-macos11.pkg

I was wondering if this was a deliberate change? I don’t think I’ve seen a release file disappear before (though I have certainly yanked a package from PyPI, I’ve been there!). I suppose I had hoped that release artefacts wouldn’t disappear like this - or maybe this is a very unusual case?

cc @pablogsal

I concur. We’ve been seeing this in the PyArrow CI as well (example).

It would be very nice if download URLs for Python binaries could remain regular and predictable.

I’m sorry for the disruptions here. This was a result of providing an updated 3.10.0 macOS installer to address a critical problem for IDLE and tkinter users who have updated to macOS 12 Monterey; see the discussion here for more details. An announcement about it (other than in the bug issue) will be made with the 3.9.8 and 3.11.0a2 announcement, which was supposed to happen today but has been delayed a day. I take responsibility for persuading Pablo and the rest of the release team that doing a stealth update of the installer file was a bad idea, based on my apparently mistaken idea that few people, if any, make assumptions about the format of the URL. We really don’t make any guarantees about that format, it does change now and then, and this was an exceptional situation. Despite my misgivings about stealth updates, I have now made the updated installer also available under the original download URL. Again, sorry for the disruption, though I’m glad to see people are depending on what we provide.

Hi Ned, thank you for getting back to me, and for maintaining these macOS binaries!

Absolutely, I understand. Pleased to hear that this was an unusual situation.

By the way, we don’t make any assumptions about the format of the URL in cibuildwheel, we get the URL from the downloads API. But we do bake a specific download URL into each release (for reproducibility), so if a previously working URL goes 404, that gives us a problem.

Thanks @joerick for the ping!

As @nad mentioned, this was unfortunately a very unusual situation, so it won’t happen on a regular basis. From now on we will make sure that we don’t invalidate URLs if we can avoid it (and if we do, we will send very noisy announcements).

Thanks for your patience and understanding :)

In my opinion the hash of a URL to an installer should never change. Doesn’t cibuildwheel validate downloads against an off-site checksum?

One way to approach overriding packages would be for the PSF to maintain a manifest of the releases, like in this CPython file from GitHub Actions or this PyPy one maintained by the PyPy team (me). The PyPy one has a latest_pypy field to indicate that, if the user specifies a Python version for which there can be many choices, this is the one to use. The checksums are available on a separate page (to reduce the risk of a bad actor changing links), and should not change once a package is released.

Ideally the download should be easily done on a bash or cmd.exe command line using just the desired version number. If that involves parsing an intermediate file, hopefully some snippets can be published that people can just copy and paste.

I’ve been intending to add this. The fact that the download can just change like this and it still works is worrisome, IMO. We check hashes for PyPI, just not the Python downloads (yet). I’d also like a manifest solution, something where 3.10.0 would resolve to 3.10.0post2, but where you could also pin 3.10.0 and use a hash, as otherwise the download is not secure.
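To illustrate the kind of check being discussed, here is a minimal sketch of verifying a downloaded installer against a pinned SHA-256 digest. The function names (`sha256_of`, `verify_download`) are hypothetical and not part of cibuildwheel; the pinned digest is assumed to come from somewhere other than the download server.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of the downloaded bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_download(data: bytes, expected_sha256: str) -> None:
    """Raise ValueError if the data does not match the pinned digest.

    `expected_sha256` is assumed to be a hex digest recorded when the
    URL was pinned (hypothetical workflow, not an existing API).
    """
    actual = sha256_of(data)
    if actual != expected_sha256.strip().lower():
        raise ValueError(
            f"checksum mismatch: expected {expected_sha256}, got {actual}"
        )
```

With a check like this, a silently replaced installer at the same URL would fail the build instead of being installed unnoticed.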

In this case it’s okay (from cibuildwheel’s side) since we don’t have a hash yet, and if we add one, it would be pointing at post2 anyway. In the future, though, it could be more problematic. Hopefully it will be a rare occurrence!

Here is a start to get the darwin 3.8.12 filename from the PyPy versions.json I linked to before, and to get the checksum from the webpage. It uses wget and jq. Note the use of latest_pypy to pick between various 3.8.12 downloads (there is only one, but conceivably this could be extended to match 3.8.* with a regex and choose the latest):

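# Pick the darwin filename for the latest PyPy build of Python 3.8.12.
filename=$(wget -qO- https://downloads.python.org/pypy/versions.json \
   | jq -r '.[] | select(.python_version == "3.8.12" and .latest_pypy)
            | .files[] | select(.platform == "darwin") | .filename')
# Look up its checksum (first space-separated field) on the checksums page.
checksum=$(wget -qO- https://pypy.org/checksums.html \
    | grep -F "$filename" | cut -f1 -d' ')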

Thanks. There should be an official method published somewhere in the downloads area, though. Also, it’s better if it doesn’t bring in non-standard dependencies (is jq available by default on macOS or other platforms, for example?).
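As one possible way to avoid the jq dependency, the same lookup could be sketched with only the Python standard library. This assumes the versions.json layout implied by the snippet above (a list of entries with `python_version`, `latest_pypy`, and a `files` list carrying `platform` and `filename`); the function names are hypothetical. Unlike the jq filter, it returns only the first matching file.

```python
import json
from urllib.request import urlopen

VERSIONS_URL = "https://downloads.python.org/pypy/versions.json"


def load_versions(url=VERSIONS_URL):
    """Download and parse the manifest (needs network access)."""
    with urlopen(url) as response:
        return json.load(response)


def pick_filename(entries, python_version, platform):
    """Return the first filename for the entry flagged latest_pypy, or None."""
    for entry in entries:
        if entry.get("python_version") != python_version:
            continue
        if not entry.get("latest_pypy"):
            continue
        for file_info in entry.get("files", []):
            if file_info.get("platform") == platform:
                return file_info["filename"]
    return None
```

Usage would look like `pick_filename(load_versions(), "3.8.12", "darwin")`.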

Another possibility would be to simply publish “latest” URLs that would redirect to the actual download. For example a 3.10-latest download.
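To make the “latest” idea concrete, here is a rough sketch of how a series such as 3.10 could be resolved to its newest final release from a list of available versions. Everything here is hypothetical: `resolve_latest` is not an existing API, and it assumes plain dotted numeric versions (no pre-release or post suffixes).

```python
def parse_version(version):
    """Turn '3.10.1' into (3, 10, 1); assumes purely numeric components."""
    return tuple(int(part) for part in version.split("."))


def resolve_latest(available, series):
    """Return the highest version in `available` that belongs to `series`."""
    prefix = parse_version(series)
    candidates = [
        v for v in available if parse_version(v)[: len(prefix)] == prefix
    ]
    if not candidates:
        raise LookupError(f"no release found for series {series}")
    # Compare as integer tuples so that 3.10.10 sorts after 3.10.2.
    return max(candidates, key=parse_version)
```

A server-side redirect would do the equivalent of this lookup and then point at the actual download for the resolved version.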