Merge queue helps increase velocity in software delivery by automating pull request merges into your busiest branches.
Before merge queue, developers were often required to update their pull request branches prior to merging to ensure their changes wouldn’t break the main branch when merged. Each update resulted in a fresh round of continuous integration (CI) checks that would have to finish before the developer could attempt to merge. If another pull request got merged, every developer would have to go through the process again.
Merge queue automates this process by ensuring each pull request queued for merging is built with the pull requests ahead of it in the queue.
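The queueing behaviour described above can be sketched in a few lines. This is a hypothetical simulation of the idea, not GitHub's actual implementation: each queued PR is tested against the base branch plus every PR ahead of it, and failures are kicked out without blocking the rest of the queue.

```python
# Minimal sketch of the merge-queue idea (hypothetical names, not
# GitHub's implementation): each queued PR is tested against the base
# branch plus everything ahead of it in the queue, i.e. the exact
# state the branch would be in post-merge.

def process_queue(base, queue, passes_ci):
    """Merge PRs in order, testing each against base + everything ahead."""
    merged = []
    for pr in queue:
        candidate = base + merged + [pr]  # what main would become
        if passes_ci(candidate):
            merged.append(pr)             # lands; later PRs build on it
        # otherwise the PR is kicked back for revision; the queue moves on
    return base + merged

# Usage: "conflicts-with-b" passes CI alone but breaks combined with "b".
base = ["a"]
queue = ["b", "conflicts-with-b", "c"]

def passes_ci(tree):
    return not ("b" in tree and "conflicts-with-b" in tree)

print(process_queue(base, queue, passes_ci))  # → ['a', 'b', 'c']
```

The logically-conflicting PR is rejected pre-merge, while the unrelated PR behind it still lands.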
It looks like merge queues are enabled per branch. Would we only use it for main, or also for the release branches? I’m not sure we have a lot of contention for most release branches, but if we enable it for main we probably also want to enable it for the 3.12 branch, at least up to rc1.
Would this end up using more CI resources because PRs are tested again before being merged? If so, I’m not sure it’s worth the cost to enable it, since it’s not that often that we run into issues due to merge races.
The point of merge queue on a sporadic-activity project like CPython, as I see it, is that we actually ensure CI runs on each PR in the state it'll be in post-merge, instead of today's assumption of "no conflicts with the merge head, so we'll just apply it and call it already tested and merged, even though it was technically never tested in a whole tree matching what the result of the merge will be".
I think enabling it on main is worthwhile. It's easy enough to turn off if we find workflow issues with it, and it'd be good to have some experience with it before the PyCon sprints.
For release branches, I suggest RMs also go ahead and do it. It's a no-op there in many cases, as the number of simultaneous backports tends to be <= 1, but when it isn't, it should provide added CI assurance.
We’ve never blocked merges on CI testing being in a fully-merged-with-branch state in the past, likely specifically because that becomes the nightmare described in the feature announcement. With merge queues, we could actually choose to do that in the future without the massive disruption that concept brings in the absence of a queue.
Wouldn’t we actually save on CI if we didn’t run CI post-merge on main? Since the merge queue is meant to verify that what main is about to become is good, why test again post-merge when it’s going to be the same code the merge queue just checked?
So we would have CI on the individual PR and then once more in the merge queue, and that’s it.
It should be roughly net neutral, since CI runs once per PR either way. But the benefit would be that it runs prior to merge rather than after the fact, so any problems can be addressed and the PR resubmitted pre-merge, rather than fixed or reverted post-merge.
This could also be useful if, e.g., it were possible to trigger the buildbots at this step, so that buildbot failures would be more visible and wouldn’t require going back and fixing things later. I imagine buildbot failures are more common than the relatively rare merge races (though perhaps too noisy, unless merging were only gated on certain buildbots).
While the OpenStack community never used GitHub, we implemented the equivalent of GitHub’s “merge queues” in our CI system over a decade ago, out of necessity for dealing with the sheer volume of development activity we amassed, and we still rely on it for merging hundreds of changes (our review platform’s equivalent of pull requests) every day.
There’s quite a sense of freedom, as a core maintainer on projects, in being able to go through and approve any changes which look right to me, with the confidence that if they won’t pass CI jobs in the context of everything that’s been approved ahead of them, those specific changes will get kicked back for further revision and review. It’s not uncommon to approve changes before they’ve even been initially tested, because we know they’ll be tested in the process of merging anyway (potentially saving even more job resources in such cases). Trying to contribute to projects which don’t test that way has become painful for me, since it feels like going back to pre-2010 development practices.
Merge-time test queues offer a substantial reduction in cognitive load for reviewers, while also putting the responsibility squarely on the shoulders of the people proposing changes to solve any issues that arise when trying to get them merged alongside other, potentially incompatible changes. The alternative is finding out that some combination of changes (which should have been tested together but weren’t) ended up breaking the development branch, leaving someone who probably isn’t responsible for proposing or approving those logically-conflicting changes scrambling to sort out the resultant mess.
I don’t think so; merge queue is only available when you check the “Require merge queue” box in the branch protection settings.
Before CPython enables the merge queue requirement, it may need to update some of the bots behind its required checks, since the way merge queue works is a bit atypical: merge queue triggers CI by creating a branch.
This means that any required check that only runs on pull request events (e.g. the backport PR title check, and perhaps also the CLA bot) or only on pushes to specific branches would not get triggered by it without some alterations.
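For checks implemented as GitHub Actions workflows (an assumption; some of our checks are external bots this wouldn't apply to), the usual alteration is to opt the workflow in to the `merge_group` event that the merge queue dispatches when it builds its temporary branch. A sketch:

```yaml
# Sketch of the relevant trigger block for a workflow that currently
# only runs on pull requests. The merge queue does not fire
# pull_request or ordinary push events for its temporary branch; it
# dispatches a separate merge_group event that workflows must opt in to.
on:
  pull_request:
  merge_group:
```

A required check whose workflow never fires for the merge queue's branch would otherwise sit unreported and hold up the queue entry until it times out.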
Especially if it creates a branch on the upstream repo: besides the nuisance of potentially polluting the upstream repo with ephemeral feature branches, we would need to carefully test whether it and our Triager permission workaround (implemented in python/core-workflow#460) would interfere with each other, or whether the combination of the two would interfere with other bots/tools. We’d also need to ensure that the protections remain fully effective, and that any additional rights triagers may gain with merge queues are not a serious concern.
Okay, thanks. Assuming that’s indeed the case, that should address most of the concern over upstream feature-branch pollution and over keeping our branch protections effective as far as push access is concerned. We would still need to check the permissions from the GitHub UI side, to ensure that triagers retain the appropriate permissions with the feature enabled and that it doesn’t expose any new loopholes, as well as confirm the impact on bots/tools.
Just to note, GitHub’s Automerge feature already allows this (except that the rare-ish cases where CI passes on the branch but fails after merging get reported post-facto instead of pre-facto), and (AFAIK) so does CPython’s bespoke legacy automerge tag. And of course, it doesn’t remove the need for an appropriately diligent human review, which, at least for CPython, in all but the best case typically happens around or after the CI runs finish anyway.
That being said, if it’s practicable to trigger the full Tier 1/2 buildbot fleet once pre-merge instead of post-merge, and to abort the merge with a notification if it doesn’t pass, instead of just failing post-facto and requiring the responsible core dev to revert, fix, or otherwise put out the fire after the fact, that seems like a pretty compelling workflow and confidence advantage on its own, with PRs always being tested with the latest integrated changes prior to merge as an added bonus.