Protocol for working with embedded bytecode changes

A long time ago (in the 1.5.2 timeframe), I worked on a register-based virtual machine for Python. Since then, at least Victor Stinner has made another (nearly complete) run at it. Now that I'm retired, I find myself with more free time to mess around with Python, so I decided to try again.

The biggest change that affects my relationship to CPython master is that I have squished out all the gaps in the existing stack-based opcodes to make room for me to simply tack on my register instructions. So far so good.

In the past day or two, bpo-38091 landed on master, which involved a change to Lib/importlib/_bootstrap.py. That, in turn, resulted in changes to Python/importlib.h. When I merged master to my branch this morning, that generated a conflict.

No problem. I reran make regen-all and all was good. My question is… Going forward, will I have to go through this dance every time I merge from master to my register branch, or is git somehow smart enough to not register another conflict unless Python/importlib.h (or other embedded bytecode) changes?

It will probably depend on how you handle the merge from master (e.g. rebasing or something). But changes to importlib are rather infrequent so I wouldn’t expect it to come up very often.

Oh, what a worthy retirement project! Do you need help? If so, maybe there’s a young coder who’s interested in helping you out and whom you wouldn’t mind mentoring.

(Also, what Brett is saying is that you would only see another conflict if importlib.h is changed again in a way that conflicts with your HEAD. So there’s not too much to worry about here.)

If you sometimes merge master in, and keep it merged, running make regen-all every time is probably your best bet.

That won’t work if you don’t keep master merged in but only occasionally attempt a merge to see how it would go, or if you regularly rebase/merge onto different base branches. Git has a bit of hidden magic called rerere for workflows like that: https://www.git-scm.com/book/en/v2/Git-Tools-Rerere
(If you take the time to go through the docs, you can just keep it turned on: it’s sometimes helpful and doesn’t really do harm. It’s just sometimes surprising to see merge conflicts resolve themselves.)
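
For reference, turning rerere on comes down to one git config setting, `rerere.enabled`. Here is a minimal sketch in Python, assuming git is on your PATH; the helper names `rerere_commands` and `enable_rerere` are made up for illustration and are not part of git or CPython.

```python
import subprocess


def rerere_commands(repo_path):
    """Return the git invocations that turn rerere on for one checkout.

    Hypothetical helper: it only builds the command lines, so they can be
    inspected before anything is run.
    """
    return [
        # Record how conflicts are resolved and replay the recorded
        # resolution when the same conflict shows up again.
        ["git", "-C", repo_path, "config", "rerere.enabled", "true"],
    ]


def enable_rerere(repo_path):
    """Run the commands; raises CalledProcessError if git reports failure."""
    for cmd in rerere_commands(repo_path):
        subprocess.run(cmd, check=True)
```

Calling `enable_rerere("path/to/cpython")` sets the option for that repository only; the path is a placeholder.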

If you sometimes merge master in, and keep it merged, running make regen-all every time is probably your best bet.

Thanks, yes, I have been merging from master every day or two at least.
A couple weeks ago I learned the hard way about all the embedded
bytecode (I don’t recall that from the 1.5 days!), and then also about
“make regen-all”.

Skip

Oh, what a worthy retirement project! Do you need help? If so, maybe there’s a young coder who’s interested in helping you out and whom you wouldn’t mind mentoring.

I might be the one needing mentoring. The compiler/virtual machine
internals have changed a lot over the years. :slight_smile: But yes, I would
welcome help.

(Also, what Brett is saying is that you would only see another conflict if importlib.h is changed again in a way that conflicts with your HEAD. So there’s not too much to worry about here.)

Good to know. (Not sure why I failed to get an email regarding Brett’s
response. I thought yours was the first.)

Skip

You don’t have to merge every day or two, though :wink:

Understood. Still, if there are conflicts they are much easier to deal with when done regularly, at least for me. I have a script which resyncs a named branch (only tracking 3.8 and master at this point), then pushes to my fork on GitHub. I then merge master to my working branch and push that. Takes a couple minutes at most, and most of what I need is recorded in my bash history.
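
The routine described above could be sketched roughly like this. It is not the actual script: the remote names `upstream` and `origin`, the working-branch name `register`, and the function names are all assumptions made for illustration.

```python
import subprocess


def sync_commands(branch, work_branch, upstream="upstream", fork="origin"):
    """Build the git commands for one resync-and-merge round.

    All names here are illustrative: `upstream` is assumed to point at the
    main CPython repository and `fork` at a personal GitHub fork.
    """
    return [
        # Bring the tracked branch up to date and mirror it to the fork.
        ["git", "checkout", branch],
        ["git", "pull", "--ff-only", upstream, branch],
        ["git", "push", fork, branch],
        # Merge the refreshed branch into the working branch and push that.
        ["git", "checkout", work_branch],
        ["git", "merge", branch],
        ["git", "push", fork, work_branch],
    ]


def run_all(commands, cwd="."):
    """Run each command in order, stopping at the first failure.

    A conflicting merge makes git exit non-zero, so check=True raises
    before the final push and the conflict can be resolved by hand.
    """
    for cmd in commands:
        subprocess.run(cmd, cwd=cwd, check=True)
```

With these assumed names, `run_all(sync_commands("master", "register"))` would do one round. If the merge stops on a conflict in generated files such as Python/importlib.h, that is the point at which to rerun make regen-all before committing and pushing.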

The embedded bytecode is new as of Python 3.3, when we switched the import implementation to importlib and had to come up with a bootstrap solution. The specific make regen-all command is much newer (I think 3.7?); it was added once it made sense to have a single command to regenerate everything in the source tree that needs it.
