Pre-PEP: Providing a CPython mirror

Hello,
I would like to propose providing a mirror of the CPython repository. You can find the draft here:

I personally think it’s a great idea. Having an official mirror that serves as a backup makes sense. It also would make it easier to migrate should the need arise.

I’m not sure I understand the goals of this proposal.

We already have backups of the CPython repo, importantly including the issue and pull request data, which git clone --mirror won’t include.

Adding another hosting site just for the sake of it seems confusing, and means contributors and core developers need to check multiple places. The benefit of moving issues to GitHub was that we have one centralised place for development. Fracturing that by adding other places to, e.g., submit PRs would be unhelpful.

If the proposal is simply to have a mirror of the data for those that don’t want to use GitHub, that can be done very easily and doesn’t, in my view, require a PEP.

Overall I don’t think I understand the ideas behind this PEP, and so currently would be -1, but hopefully the authors can elaborate on the context and rationale behind it.

A

As the PEP states:

The mirror is not intended to replace the host for contributions (issues, pull requests, CI), but to provide an additional read-only distribution channel for the Git data (commits, branches, tags etc.) because having multiple contribution platforms would introduce excessive complexity and asynchron

having two platforms for contribution is unrealistic and is not proposed.

That is part (but not all) of the proposal, yes.

Some contributors prefer using open-source platforms to develop open source projects. Since GitHub is not open source, a mirror would allow us to address this concern. Mirrors improve accessibility for contributors in regions where GitHub may be blocked [1] or slow.

Doesn’t the read-only nature of the mirror more or less block this motivation? It lets a contributor clone the repo without using GitHub (although cloning from a public GitHub repo doesn’t even require a GitHub account, so it’s a very marginal gain), but it won’t let them push their changes back without going through a pull request (and most likely an accompanying issue) on GitHub.

I also read this section that Brenainn quotes, which is where my comments about needing to check multiple places come from.

If the proposed mirror is to be read only (i.e. not used for core development), then what are the benefits of having it be blessed by the project? There are no restrictions to any individual or group setting up a mirror, and indeed many already do.

Secondly, if the mirror won’t contain all the data used to inform decisions (namely the history of PRs & issues), what benefits does it have over the more standard backups we currently perform?

A

It confirms its validity. I’m very careful when I clone and compile code from random repositories, since there’s no guarantee it isn’t harmful. A malicious actor could create a “mirror” and inject their own commits.

If someone manages to get malicious code onto the CPython source tree, I think we have bigger problems at that point. But anyway, how would a mirror help prevent that? An evil commit would be reflected in the mirror as well.

I think you misunderstood me, apologies if I was not clear.

I meant that the malicious commits would be in the unofficial mirror, not the actual source.

Oh, sorry, I did misunderstand.

If you’re worried about the mirror being bad, just verify it against the MD5 checksum that we provide in releases. Otherwise, if you’re unable or unwilling to use GitHub for contribution, why clone CPython at all? Our unreleased branches sometimes contain changes that break the buildbots, and users probably don’t want those commits. We have a release process for a reason!
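
For illustration, checking a downloaded release archive against a published checksum could look roughly like the sketch below. It is only a sketch: the helper names `file_digest` and `matches_published` are made up for this example, and the archive path and digest in the usage note are placeholders, not real values.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks to bound memory use."""
    h = hashlib.new(algorithm)
    with Path(path).open("rb") as f:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_published(path, published_hex, algorithm="md5"):
    """Compare a file's digest with a published hex value, ignoring case and whitespace."""
    return file_digest(path, algorithm) == published_hex.strip().lower()
```

Usage would be something like `matches_published("path/to/release-archive", "<digest copied from the release page>")`. Passing a different `algorithm` name (for example `"sha256"`) works the same way if a stronger digest is published.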

Indeed, like the larger backups issue, can this PEP be a core-workflow issue instead?

Here are instructions on how to set up such a mirror with GitLab: Repository mirroring | GitLab Docs

You may need a paid plan to have this work in “pull” mode.