When modifying C code, you frequently need to resolve merge conflicts in generated code if it takes a while for a PR to be reviewed, especially the checksums. For what it’s worth, it should be possible to fix these conflicts automatically; a make rule would go a long way (currently doesn’t work when a file contains conflict markers). Do these files really need to be in version control?
Not having worked with generated code in cpython, I don’t have specific advice, but I would generally commit raw changes and generated code in separate commits. If rebasing, I would just drop the code generation commits and rerun after rebase. I believe cpython discourages rebasing, so I would probably just revert the code generation commits, merge, and rerun. If you forget to revert first, you can still just git checkout upstream/main <conflicting files>
, commit the merge, and then rerun.
Are you talking about .c.h
files generated by Argument Clinic?
Yes, does it speed up something be being there?
It speeds up build? We also can’t build the bootstrap interpreter without them being there, so it would require users to have a working Python install of sufficient version in order to build. That’s not a current requirement, and pretty sure it’s one we don’t want to introduce.
Alright, can we then add a make rule to fix the conflicts then or/and even a pre-commit hook?
Neither of those would save you from having to manually merge a PR branch due to conflicts. All they do is add friction for most contributors who are not modifying the builtin API surface.
Adding them to make regen
would probably be easier for some users than clinic.py -f ...
. It really just depends on which approach you are more familiar with, I guess.
I just got a merge conflict on https://github.com/python/cpython/pull/119849 (waiting for your review ), so I can give a demo. This is what make regen-all
currently does right after merging the branch with main:
gcc -fno-strict-overflow -Wsign-compare -Wunreachable-code -DNDEBUG -g -O3 -Wall -std=c11 -Wextra -Wno-unused-parameter -Wno-missing-field-initializers -Wstrict-prototypes -Werror=implicit-function-declaration -fvisibility=hidden -I./Include/internal -I./Include/internal/mimalloc -I. -I./Include -DPy_BUILD_CORE_BUILTIN -c ./Modules/posixmodule.c -o Modules/posixmodule.o
In file included from ./Modules/posixmodule.c:15944:
./Modules/clinic/posixmodule.c.h:13087:1: error: version control conflict marker in file
<<<<<<< HEAD
^
1 error generated.
make: *** [Modules/posixmodule.o] Error 1
Then I try again after accepting the incoming changes (which segfaulted on a different PR):
python3.13 ./Tools/clinic/clinic.py --make --exclude Lib/test/clinic.test.c --srcdir .
Error in file 'Modules/posixmodule.c' on line 13087:
Checksum mismatch! Expected '9756767bdbdabe94', computed '1da32369b7d9fd4b'. Suggested fix: remove all generated code including the end marker, or use the '-f' option.
make: *** [clinic] Error 1
Finally I give up and run clinic manually and make regen-all
one more time just to be sure:
$ Tools/clinic/clinic.py -f Modules/posixmodule.c
$ make regen-all
I would already be happy if I could just run make regen-all
once.
I guess there’s a bug where it tries to build something before regenerating it?
I guess another bug where regen-all
ought to pass -f
to clinic.
I’m not a Makefile maintainer, so hopefully someone who is sees this and decides these are worth fixing. It’s possible that the Windows build.bat --regen
also behaves the same, though (probably doesn’t have the first issue).
The issue here is that make
defaults to running regen with the built Python. It should ideally use another, known-working Python, but that would require finding it somehow (and assuming that it exists at all).
So, we can’t do much better than let you specify it manually:
PYTHON_FOR_REGEN=/usr/bin/python make regen-all
Another issue is that conflict markers sometimes leave duplicate start generated code markers, so for partially generated files, it’s good to resolve conflicts in the hand-written part before running regen-all
.
But it already requires a recent Python to be installed:
$ make regen-all
python3 ./Tools/cases_generator/opcode_id_generator.py \
-o ./Include/opcode_ids.h.new ./Python/bytecodes.c
Traceback (most recent call last):
File "/Users/wannes/Documents/GitHub/cpython/main/./Tools/cases_generator/opcode_id_generator.py", line 8, in <module>
from analyzer import (
File "/Users/wannes/Documents/GitHub/cpython/main/Tools/cases_generator/analyzer.py", line 951
match part:
^
SyntaxError: invalid syntax
make: *** [regen-opcode-ids] Error 1
And it doesn’t actually solve the issue:
$ PYTHON_FOR_REGEN=python make regen-all
gcc -fno-strict-overflow -Wsign-compare -Wunreachable-code -DNDEBUG -g -O3 -Wall -std=c11 -Wextra -Wno-unused-parameter -Wno-missing-field-initializers -Wstrict-prototypes -Werror=implicit-function-declaration -fvisibility=hidden -I./Include/internal -I./Include/internal/mimalloc -I. -I./Include -DPy_BUILD_CORE_BUILTIN -c ./Modules/posixmodule.c -o Modules/posixmodule.o
In file included from ./Modules/posixmodule.c:15944:
./Modules/clinic/posixmodule.c.h:13087:1: error: version control conflict marker in file
<<<<<<< HEAD
^
1 error generated.
make: *** [Modules/posixmodule.o] Error 1
Is the order in which the code is generated maybe incorrect?
Yes. You do need a working recent Python to regenerate the files. I don’t see a way around that.
Could you try with the full path to the system python?
Still doesn’t work.
That is a reasonable ask.