There is a proposal to add a mechanism for Python distributors to add new site install schemes. These install schemes will be added to sysconfig and will be loaded during site module initialization.
We currently patch in some logic as well. As long as the file is executed as Python at runtime, we should be fine.
Long term, it would also make sense to be able to provide sysconfig._get_preferred_schemes via this file. Currently, the intention is to patch it in sysconfig instead – and if we need to patch that module, we would add our schemes in that patch as well.
Technically, all we need is --with-vendor-sysconfig-append=sysconfig-fedora.py and the build system would take that file and append its content to the end of sysconfig during installation. I think (but have not verified it) that if you override the values in the dict at the end of the file, it should work. The benefit of this is that we could also override functions and there is no need to carefully specify the behavior for EXTRA_SITE_INSTALL_SCHEMES.
I quite like providing first-class support to CPython redistributors for patching install schemes, which has also been required. It’s kind of unfortunate this was only proposed after _get_preferred_schemes() went in for 3.10; it would be nice if distros could use this vendor config to supply the value from day one, but there’s no turning back time.
Exactly how this should be implemented is pretty much out of my expertise, but I do feel simply appending to the end of the file could be too easy to break. Would it be more robust to generate the sysconfig module from a template populated from a config (e.g. INI) file instead? Is there anything else in the stdlib that does this?
Can you describe why that is needed? This approach solves the issue Fedora is trying to mitigate in a different way, but solves it nonetheless.
The way this is supposed to fix it is that distributors, such as Fedora, would add a new install scheme, say fedora, that will use fedora-packages instead of site-packages. Then, when building packages, Python modules would be installed there instead of the default scheme (which uses site-packages).
Python installers such as pip will use the default scheme, so they will never mess with the distro packages.
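To make the idea concrete, here is a minimal sketch of what such a vendor scheme could look like. It assumes the scheme is just posix_prefix with site-packages swapped for fedora-packages; the name fedora_scheme is made up for illustration, and nothing here registers the scheme with sysconfig.

```python
import sysconfig

# Unexpanded path templates of the stock posix_prefix scheme,
# e.g. "{base}/lib/python{py_version_short}/site-packages".
base_scheme = sysconfig.get_paths(scheme="posix_prefix", expand=False)

# Hypothetical vendor scheme: identical layout, but library paths point at
# fedora-packages. Built as a new dict so sysconfig's own table is untouched.
fedora_scheme = {
    key: template.replace("site-packages", "fedora-packages")
    for key, template in base_scheme.items()
}

for key in ("purelib", "platlib"):
    print(key, "->", fedora_scheme[key])
```

Distro packages would be installed with this scheme, while pip keeps using the default one and so keeps writing to site-packages.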
I remember @pf_moore said it would make sense for pip to gain a --install-scheme argument, which would work for Fedora’s package install use-case.
I believe this to be better than Fedora’s approach because it does not hijack another perfectly valid Python installation (prefix=/usr/local).
The issue with allowing distributors to directly inject code into sysconfig is that then they can change expected behaviors, which is not good, and brings us back to the current issue – distributors patching the Python installation in ways that cause expectations to be broken, and with wildly different approaches. If that’s the case, I’d rather just have them patch sysconfig directly, without explicit upstream support.
What I wanted to achieve with the config was to allow distributors to do what they need in a way that is not intrusive and will not break any behaviors/expectations. This is a big problem right now, and I think defining a set of rules that work for everyone, and that everyone should play by, is the best way to resolve this.
The problem we’d have with --install-scheme=fedora or something like that is that although we have a lot of control over RPM packages that are created in a standardized way (e.g. they use RPM macros we control and can adapt), many packages use custom build mechanisms, such as calling $PYTHON setup.py install --root $ROOT from a Makefile generated by CMake or autotools.
In the future, ideally for RPM packages that use our code that calls PEP 517 hooks, we would like to explicitly select the scheme when installing, but unfortunately, we are not there yet.
Please, don’t feel blocked by Fedora to ship a mechanism we could use directly in Python 3.10. We patch distutils now and we can keep patching sysconfig instead for a while. Our primary goal for 3.10 was to be able to move our patch from distutils to sysconfig.
Anyway, even with my Fedora hat off, here’s what I think needs to be possible to make this useful:
supply custom install schemes via a configure option and a config file (or a Python file)
If appending code to sysconfig is not desired and seems fragile, another way of doing it is to provide a custom module (e.g. sysconfig/vendor.py) that has a defined API (e.g. it has a public get_preferred_schemes callable and INSTALL_SCHEMAS mapping) and if it is present, sysconfig imports from it and updates the dictionary/function. Distributors would supply their own implementation of this module via a configure option.
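A rough sketch of how that merge could work, under a lot of assumptions: apply_vendor_config, the simplified INSTALL_SCHEMES table and the fake vendor module are all hypothetical stand-ins, and only the get_preferred_schemes and INSTALL_SCHEMAS names come from the suggestion above.

```python
import types

# Simplified stand-in for sysconfig's internal scheme table (hypothetical).
INSTALL_SCHEMES = {
    "posix_prefix": {
        "purelib": "{base}/lib/python{py_version_short}/site-packages",
    },
}


def default_preferred_schemes():
    """Stand-in for the stock preferred-schemes function."""
    return {"prefix": "posix_prefix", "home": "posix_home", "user": "posix_user"}


def apply_vendor_config(vendor, schemes, get_preferred):
    """Merge an optional vendor module; return the callable to use afterwards."""
    if vendor is None:
        # No vendor module shipped: keep the upstream behaviour.
        return get_preferred
    schemes.update(getattr(vendor, "INSTALL_SCHEMAS", {}))
    return getattr(vendor, "get_preferred_schemes", get_preferred)


# Fake vendor module, standing in for what a distributor would ship.
vendor = types.ModuleType("vendor")
vendor.INSTALL_SCHEMAS = {
    "fedora": {
        "purelib": "{base}/lib/python{py_version_short}/fedora-packages",
    },
}
vendor.get_preferred_schemes = lambda: {
    "prefix": "fedora", "home": "posix_home", "user": "posix_user",
}

get_preferred = apply_vendor_config(vendor, INSTALL_SCHEMES, default_preferred_schemes)
print(sorted(INSTALL_SCHEMES), get_preferred()["prefix"])
```

The point of the defined API is that sysconfig only ever reads two known attributes, so a vendor module cannot silently change unrelated behaviour the way appended code could.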
setuptools/distutils should also gain such a mechanism btw. My approach for these issues would be a simple mv "$ROOT"/usr/lib/python3.9/{site,fedora}-packages, as schemes don’t really have any impact other than the install path, so that would be a reasonable thing to do.
I agree. Maybe it would be best to have an EXTRA_INSTALL_SCHEMES dictionary, and make EXTRA_SITE_INSTALL_SCHEMES a list.
I was already planning to open a PR for get_preferred_schemes.
The way I have implemented this is indeed as a custom module, _vendor_config; it is purposely not public.
I have added get_preferred_schemes to the config; this means you should be able to put your RPM_BUILD_ROOT logic there. This should allow you to make it so that you use the vendor scheme during package building and posix_prefix the rest of the time.
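Something along these lines is what I have in mind, as a sketch only. It assumes the vendor scheme is called fedora and that the presence of RPM_BUILD_ROOT in the environment is enough to detect a package build; the environ parameter is only there to make the function easy to try out.

```python
import os


def get_preferred_schemes(environ=None):
    """Pick the vendor scheme only while an RPM package is being built."""
    environ = os.environ if environ is None else environ
    # Assumption: RPM_BUILD_ROOT being set means we are inside an RPM build.
    building_rpm = "RPM_BUILD_ROOT" in environ
    return {
        "prefix": "fedora" if building_rpm else "posix_prefix",
        "home": "posix_home",
        "user": "posix_user",
    }


print(get_preferred_schemes({"RPM_BUILD_ROOT": "/tmp/buildroot"})["prefix"])
print(get_preferred_schemes({})["prefix"])
```

With this, a plain $PYTHON setup.py install --root $ROOT run from a Makefile would pick up the vendor scheme without the build having to pass any extra option.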
I’ve only skimmed this thread, but I don’t quite see the need for this in Nixpkgs. In Nixpkgs we copy a sysconfig.py into the site-packages folder. When cross-compiling we can then add the host Python’s version to PYTHONPATH; it’s the closest we get to this. One issue I can imagine we have now is when using Nix without sandboxing. In that case it is possible pip would install in the store, which it should not. I don’t know if/how this would help.
We actually symlink it into the site-packages folder, and this is really only needed with cross-compilation. I don’t know the details of exactly when it was needed, but we have the following comment in our expression:
# The CPython interpreter contains a _sysconfigdata_<platform specific suffix>
# module that is imported by the sysconfig and distutils.sysconfig modules.
# The sysconfigdata module is generated at build time and contains settings
# required for building Python extension modules, such as include paths and
# other compiler flags. By default, the sysconfigdata module is loaded from
# the currently running interpreter (ie. the build platform interpreter), but
# when cross-compiling we want to load it from the host platform interpreter.
# This can be done using the _PYTHON_SYSCONFIGDATA_NAME environment variable.
# The _PYTHON_HOST_PLATFORM variable also needs to be set to get the correct
# platform suffix on extension modules. The correct values for these variables
# are not documented, and must be derived from the configure script (see links
# below).
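For anyone unfamiliar with the lookup that comment refers to, here is a small sketch of how the sysconfigdata module name is chosen. It mirrors my understanding of a private helper in CPython's sysconfig and may differ between versions; the function name sysconfigdata_name and the environ parameter are made up for illustration.

```python
import os
import sys


def sysconfigdata_name(environ=None):
    """Return the name of the sysconfigdata module that would be imported."""
    environ = os.environ if environ is None else environ
    # Both attributes can be missing on some platforms, hence getattr.
    abiflags = getattr(sys, "abiflags", "")
    multiarch = getattr(sys.implementation, "_multiarch", "")
    default = f"_sysconfigdata_{abiflags}_{sys.platform}_{multiarch}"
    # The environment variable, when set, overrides the computed default.
    return environ.get("_PYTHON_SYSCONFIGDATA_NAME", default)


print(sysconfigdata_name({}))
```

When cross-compiling, the override has to name the host platform's module, and that module has to be importable, which is why it ends up on PYTHONPATH.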