Build the stencils on a platform with the required LLVM version (say, on Fedora).
Save the stencils (per architecture, per build type (optimized/debug)).
Use the stencils from above on a different platform without the required LLVM version (say, on RHEL) by dropping them into the out-of-tree build directory.
I was able to do this with hacks. Notably:
The make dependency hierarchy insisted on rebuilding jit_stencils.h. I was able to hack around that by invoking make with an empty JIT_DEPS= (for make install as well).
The PGO task deletes the jit_stencils.h file mid build. I was able to hack around that by sed -i '/rm -f jit_stencils.h/d' Makefile after running ./configure.
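The save-and-reuse steps described at the top can be sketched as a small helper. This is only an illustration: the store layout (`<store>/<arch>/<build type>/jit_stencils.h`) and the function names `stencil_path`, `save_stencils` and `stage_stencils` are hypothetical, not part of CPython's build system.

```python
import shutil
from pathlib import Path

STENCIL_NAME = "jit_stencils.h"


def stencil_path(store, arch, debug):
    """Location of a saved stencil file (hypothetical layout: store/arch/build-type)."""
    build_type = "debug" if debug else "optimized"
    return Path(store) / arch / build_type / STENCIL_NAME


def save_stencils(build_dir, store, arch, debug):
    """Copy freshly built stencils from a build directory into the store."""
    dest = stencil_path(store, arch, debug)
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(Path(build_dir) / STENCIL_NAME, dest)
    return dest


def stage_stencils(store, build_dir, arch, debug):
    """Drop saved stencils into an out-of-tree build directory on another platform."""
    src = stencil_path(store, arch, debug)
    if not src.is_file():
        raise FileNotFoundError(f"no saved stencils for {arch} (debug={debug})")
    dest = Path(build_dir) / STENCIL_NAME
    shutil.copy2(src, dest)
    return dest
```

On the platform with LLVM one would call `save_stencils(...)` after the build; on the platform without it, `stage_stencils(...)` before running make. The two hacks above are still needed so that make does not rebuild or delete the staged file.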
I was wondering if eliminating the need for those hacks could be a supported use case. E.g. something like --with-jit-stencil=my-file.h which would copy it instead of trying to rebuild it?
This should be pretty easy to add to the prototype, but I’m curious about your use case. More specifically, I’m interested in why you’d want to force using the stencils.
In RHEL, we keep one Python version for a very long time.
On the other hand, LLVM keeps getting rebased.
As a consequence, we cannot keep a build time requirement on a specific LLVM version (for ~10 years).
Similarly, sometimes in Fedora, a new LLVM version is not available in the oldest supported Fedora (but we want to add a new alpha version of Python).
When that Python becomes older, the old LLVM version might no longer be available on the newest supported version of Fedora.
For now, as the JIT is experimental, we keep building it only in (new enough) Fedora with the required LLVM version. But we would like to keep building it for other platforms, and using the prebuilt stencils seemed like a good solution.
I want to force using the stencils because if I don’t, the build machinery tries to rebuild them (and fails, without the necessary LLVM version).
In fact, I think your summary was good enough. Minor changes:
We want to build the JIT without the clang/LLVM build requirement on (too old or too new) Fedora/RHEL systems. One way to do that would be to prebuild the stencils at the distributor level (say, on a specific Fedora version), then pass them around for various other Fedora/RHEL versions to use when we build Python there (also at the distributor level).
Thanks for the details. One concern I have is around long-term correctness. If you’re using pre-generated stencils with an older Python version and we backport a change that changes how some bytecode is specialized, the stencils could become subtly stale. Since the build and tests might not catch this, it could silently introduce bugs or crashes.
In my reference implementation, I have CI rerun the build and assert that the stencils haven’t changed (and of course, we are mandating a specific LLVM version at this point) to avoid this issue.
I’d be interested in how you plan on catching this kind of drift in your workflow.
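One way this kind of drift could be caught without LLVM installed is to record a digest of the stencil inputs next to the prebuilt stencils and recompute it at build time. The following is a minimal sketch under stated assumptions: the `// <digest>` first-line convention and the function names `digest_inputs` and `stencils_are_current` are illustrative, and which files count as inputs is left to the caller.

```python
import hashlib
from pathlib import Path


def digest_inputs(paths):
    """Return a SHA-256 hex digest over the names and contents of the input files."""
    hasher = hashlib.sha256()
    # Sort so the digest does not depend on the order the caller lists the files in.
    for path in sorted(Path(p) for p in paths):
        hasher.update(path.name.encode())
        hasher.update(b"\0")
        hasher.update(path.read_bytes())
        hasher.update(b"\0")
    return hasher.hexdigest()


def stencils_are_current(stencil_file, input_paths):
    """Check the digest recorded on the stencil file's first line against a fresh one.

    Assumes (hypothetically) that the first line has the form "// <hex digest>".
    """
    with open(stencil_file, encoding="utf-8") as f:
        first_line = f.readline()
    recorded = first_line.removeprefix("//").strip()
    return recorded == digest_inputs(input_paths)
```

A distributor build could fail early when `stencils_are_current(...)` returns False, which would signal that a backport touched the inputs and the stencils need to be regenerated on a platform that has the required LLVM version. This only detects changes in the files that are hashed; it does not validate the stencils themselves.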
Makes sense. Coincidentally, during the PyCon US sprint, Brandt recently added a --output-dir flag in the build script to fix a Windows CI issue. I haven’t tested this myself, but I don’t think you need to have LLVM installed to check if the stencils need to be regenerated. If you place your stencils in this directory and pass in this new flag, this might do what you’re after.
Want to give that a try and report back? If this doesn’t work for some reason, I’d be happy to explore options for either modifying the supported flags or introducing a new flag, as you mentioned above.
If you point --output-dir to the directory containing your pre-generated stencils and pass that into the script, it should let us confirm whether this works as expected.
That said, we’d still need to make this configurable via configure to support this workflow cleanly, since --output-dir is currently hardcoded.