How to cleanly merge main into a PR branch?

I want to help an author of python/cpython#28245 by adding What’s New and documentation bits. To do this I want to open my PR against the author’s branch in their fork.

Since Doc/whatsnew/3.11.rst was modified in an area of my change, I need to actualize the author’s branch to main first. To achieve this, I took my cloned fork and ran:

$ git remote add pacien https://github.com/pacien/cpython.git
$ git fetch pacien
$ git checkout -b gh-28245 pacien/bpo-45141-mailcap-files-param
$ git merge origin/main
$ <created my own commit here>
$ git push origin gh-28245  # origin is git@github.com:arhadthedev/cpython.git
$ git log --oneline --graph
* 5c5d80577a (HEAD -> gh-28245, origin/gh-28245) Add user-visible notification on the change
*   8cd35c81b1 Merge remote-tracking branch 'origin/main' into gh-28245
|\
| * 23dcea5de7 (origin/main, origin/HEAD, main) bpo-40059: Fix installation of tomllib (GH-31784)
| * 02fbaf4887 bpo-46245: Add optional parameter dir_fd in shutil.rmtree() (GH-30365)
| * 5eb03b1b51 Fix 3.11 what's new formatting (GH-31763)
| * d1777515f9 bpo-45138: Expand traced SQL statements in `sqlite3` trace callback (GH-28240)
| * b33a1ae703 Docstring: replace pysqlite with sqlite3 (GH-31758)

However, a PR creation/diff page (Comparing pacien:bpo-45141-mailcap-files-param...arhadthedev:gh-28245 · pacien/cpython · GitHub)
shows that the merge went wrong and acceptance of my PR will display “arhadthedev added 1718 commits N minutes ago” in discussion of the original PR with invocation of many code owners.

What do I need to do to create a proper, transparent merge that GitHub shows as a single merge commit not reflected in “Files changed” tab?

Well, that’s exactly what your PR will do, relative to the author’s feature branch. The PR source is your branch containing both your new changes, and the merge to main bringing in all the other commits, while the target is the author’s feature branch, which doesn’t have those commits, not the main branch on the CPython upstream (which I presume you’ve just synced your origin/main branch from, though it is not shown above), which does.

Therefore, GitHub is correctly showing that your PR will add all of those commits to the author’s feature branch, the same as if you merged, say, a feature branch to a develop branch, and then merged develop to main—otherwise, the PR would show nothing.

Personally, for cases like these PRs don’t tend to be good tools for updating branches like this. In addition, I (and the various projects I work on) always rebase instead of merge to update an unmerged feature branch to the latest main branch, as the commit history is much cleaner and simpler to follow, rebase -i is much less troublesome, and it is much easier to fix Git mistakes or do cleanup later.

If the author is still active (and thus able to merge your PRs), I suggest just asking them to perform the rebase (or merge) themeslves, or giving you access to do so if they can’t. However, if you have to do this and can’t just make a new PR against upstream, you could at least make two separate PRs against the author’s branch: one with the merge and one with the new changes, so they can be reviewed independently. Also, using the commit view of the files pane allows reviewers to see what you changed in your one commit (though it is not as useful as being able to use the full files view).

I use git merge --no-ff main, which works well with GitHub PRs (contrary to rebases). In my experience, it is often easier to resolve conflicts using merge than with a rebase of a long lived branch.

If you try to open your PR against main in the CPython repo (instead of against the original author’s fork), you’ll see that it all looks ok; GitHub shows three commits:

  1. the original commit from GH-28245
  2. sync with main (your merge)
  3. your change
2 Likes

Also, since we always merge to main using squash merges, it doesn’t matter how the history looks in the PR. Whatever it takes to make the review smooth (which generally means don’t ever force push) is best, and then (the core dev should) include enough info in the commit message that the PR history isn’t needed.

3 Likes

Please read the last chapter of the devguide. The last I checked, It recomments creating an alias ‘pr’ that creates a local branch from the PR # in a way that works when you push the branch as the PR directs. It also recommends the git merge that you used. I have had no problems when I followed the directions.

2 Likes

FWIW, @erlendaasland and I had a hearty discussion about this in python/cpython#96136, which in particular focused on mitigating some of the UX and workflow issues with merging (vs. rebasing). Erlend shared a number of helpful tips and insights that might be of interest to the newer contributors here. It also came up that you can simply do git merge main without the --no-ff argument, as it --no-ff only makes a difference if the feature branch has no new commits over main, in which case it could not have been made as a PR in the first place.

I’m guessing you mean this section of the devguide Git tutorial? Just FYI, several months ago we organized it into meaningful categories rather than a the previous long, flat, somewhat arbitrarily-ordered list of documents.

To note, also relevant here, it says

Please do not try to solve this by creating a pull request from python:main to <username>:main as the authors of the patches will get notified unnecessarily.

The primary recommendation, at least currently, is to use gh pr checkout or hub pr checkout (of which I’ve used the latter for some years now when reviewing PRs, especially on cpython where I still have to check out every branch in order to view updated docs in rendered form, since it (at least up until now) has no rendered docs previews in CI. It also mentions the pr alias as a fallback for users who don’t have or prefer not to use these tools, which I likewise used (and still have configured) a very similar one years ago before I had hub (though mine featured an optional remote name param and also had a pr-clean command to get rid of old PR branches, something I still miss today.

Of course, in the general case presented there, there’s not nessearily a need to use a merge (as opposed to just a rebase) if the branch hasn’t been already submitted and reviewed as a non-draft pull request to the upstream CPython repository—of course, that is the case here, and additionally given it is another author’s branch, it is hard to argue with merge being the right tool here even if it were not repo policy.

2 Likes

Yeah, you may use all kinds of git shenanigans in your own clone before submitting the PR. I rebase often, but I do so before submitting a PR. Once a PR is up on python/cpython for review, it is a public thing and a lot of people have their eyes on it, so the PR author should keep that in mind for every action taken post-submit.

3 Likes

Let me also add the “show-merges-only” tip, for the readers convenience:

$ git switch <cool-feature-branch>
$ git log --oneline --first-parent  # commit log like GitHub
$ git log --oneline --merges  # list merge commits only

Also, if you’re resurrecting an old branch (for example someone else’s stale feature branch), you’ll be better off syncing with main with a merge instead of a rebase: for a merge operation, you’ll only have to resolve conflicts once; for a rebase (across hundreds of commits), you’ll want to learn git rerere, unless you really like resolving the same changes over and over again :wink:

2 Likes

That was also something I was going to mention, though upon further thought the number of repeated conflict resolutions depends on the number of commits in the feature branch rather than the main branch (though increased stateleness may indirect contribute to more conflicts to resolve across the feature branch commits).

2 Likes

Long-lived feature branches are a burden to deal with, especially if
they’ve been abandoned for a while.

I’ve been known to do a “test run” - make a new branch off the old one,
then in that one squash the commits so I’m dealing with just a single
one, and try it to get a picture of how hard that will be to rebase.
Dunno if that works for anyone else, but it’s occasionally helped me
(mainly to decide: this isn’t going to be worth it, maybe it’s batter to
“start clean”).

While we’re sharing git tricks:

git merge upstream/main
# resolve merge conflicts
git reset --soft upstream/main
git commit -m "All the changes are in this single commit now!"

(Typed off the top of my head, so use at your own risk, but I do it all the time without looking it up so this must be close enough :wink: )

3 Likes

You can also get the same end result with:

git reset --soft $(git merge-base main HEAD)
git commit -m "All the changes are in this single commit now!"
git rebase upstream/main
# Resolve merge conflicts

or, to use the squashed original commit messages,

git rebase -i $(git merge-base main HEAD)
# Replace all pick with squash in your editor
git rebase upstream/main
# Resolve merge conflicts
3 Likes