Linux distro patches to `sysconfig` are changing `pip install --prefix` outside virtual environments

Normally, when you run pip install --prefix /tmp/foo sampleproject, you get a file system tree that looks like:

❯ tree -L 2 -F /tmp/foo 
/tmp/foo/
├── bin/
│   └── sample*
├── lib/
│   └── python3.9/
└── my_data/
    └── data_file

(the * marks the file as executable)

However, when using the Linux-distro-provided Python, outside of a virtual environment, we get:

❯ tree -L 2 -F /tmp/foo
/tmp/foo/
└── local/
    ├── bin/
    ├── lib/
    └── my_data/

The relevant lines that cause this behaviour change in…

The motivation for these patches from the distributions is straightforward: They want pip install foo to go into /usr/local/... while keeping the Python executable at /usr/bin/python except when they’re building their distribution’s own packages, which should go into /usr/....

The problem here is that their patches currently also affect pip install --prefix (and installer’s prefix argument) since their patch is at the sysconfig level. This means that we have an inconsistent behaviour depending on whether the user is building within a virtualenv / using “vanilla” Python vs using their distro-provided Python.
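To check which layout a given interpreter produces, here is a minimal sketch. It assumes an installer resolves a prefix by overriding the base and platbase variables of the posix_prefix scheme. The helper name prefix_layout is hypothetical, and this is not pip's actual code. On an unpatched Python the purelib entry should come out as lib/pythonX.Y/site-packages; on a distro Python whose posix_prefix has been patched, an extra local/ component would be expected in front.

```python
import os
import sysconfig


def prefix_layout(prefix, scheme="posix_prefix"):
    """Return where `scheme` places files under `prefix`, relative to it."""
    # Override the two variables that the install locations are built from.
    overrides = {"base": prefix, "platbase": prefix}
    paths = sysconfig.get_paths(scheme, vars=overrides)
    # These are the keys that depend on {base} / {platbase} in the stock scheme.
    keys = ("purelib", "platlib", "scripts", "data")
    return {key: os.path.relpath(paths[key], prefix) for key in keys}


if __name__ == "__main__":
    for key, rel in prefix_layout("/tmp/foo").items():
        print(f"{key:8} -> {rel}")
```

Running it once with the distro Python and once inside a virtual environment should make any difference between the two visible.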

I initially believed that this was limited to a single major distro but now I’ve seen that multiple distros are making the same mistake, and thinking about this a bit more, it’s not exactly a straightforward thing to solve. Anyone got ideas for how we might be able to avoid this, or resolve this situation? :slight_smile:

Looks like there’s one suggestion for how to handle this with Debian, that we could use purely on pip’s end, but that suggestion does not work with Fedora’s patch, which modifies the posix_prefix directly:

@hroncok is actively working on solving this in Fedora

I proposed this solution to the problem in Fedora:

You can see the individual commits to follow my thinking :thinking:

Unfortunately, an earlier version of the patch broke virtualenv :bomb:
After I failed to explain what I needed to the virtualenv folks in pypa/virtualenv pull request #2401 ("When setting prefixes to empty strings, check the key names in addition to value"), one of the comments gave me an idea for how to solve that :bulb:

I am now fairly sure the proposed solution actually works. It’s once again more hackish and less systematic, but it’s also once again less dangerous. Practicality beats purity :tophat:

I’ve been fighting Fedora CI infrastructure problems all day which has unfortunately delayed progress :construction:

The change is also awaiting a review from my fellow Fedora Python maintainers :pray:

However, I really hope we could have this fixed in the development Fedora versions this week, with a backport to Fedora 36 by the end of the month :rainbow: :unicorn:

You got it right here, I’ve been wrapping my head around this for months now.

Unfortunately, I don’t have any insights here. But I do have some comments.

As a consumer of sysconfig (i.e., I expect to query it to get information on how to install things, and I have no experience of configuring schemes from the distro side of things) I find the existing scheme names very confusing. For example, why are there posix_ and nt_ versions of some schemes, but not all? And why is there an nt scheme, but no posix one? To be honest, I think there’s a lot lacking in how core Python documents the purpose and use of sysconfig. There’s little or no guidance for users of the library.
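For anyone who wants to see what the schemes contain on their own interpreter, here is a small sketch using only documented sysconfig calls. The helper scheme_templates is a hypothetical name, and the output will differ between Python versions and distro-patched builds.

```python
import sysconfig


def scheme_templates(name):
    """Return the unexpanded path templates for one install scheme."""
    # expand=False keeps placeholders such as {base} instead of real paths.
    return sysconfig.get_paths(name, expand=False)


if __name__ == "__main__":
    for name in sysconfig.get_scheme_names():
        print(name)
        for key, template in sorted(scheme_templates(name).items()):
            print(f"    {key:12} {template}")
```

Comparing this listing between a vanilla build and a distro build is one way to spot which schemes a distro has added or modified.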

In the absence of such guidance, I think any solution we come up with will of necessity be a bit adhoc. But I do think it’s worth trying to agree and document something. Ideally, it should be a formal document - in the absence of getting something in the core Python docs, would people be willing to write up whatever consensus we achieve as a PEP?

I don’t think we need a PEP or more documentation here. sysconfig fairly clearly documents the various schemes and expectations around these paths; see the sysconfig page ("Provide access to Python’s configuration information") in the Python 3.10.6 documentation.

The underlying problem here is that we don’t have any mechanism in sysconfig that Linux distros can use for their own tooling placing files in /usr/..., while having the end-users’ pip install runs use /usr/local/....

Do remember that this module was written to reflect the path layout that already existed. Nobody (as far as I know) fully designed what those paths should be and why. Which is largely why we are having so much trouble choosing a way to customise them.

It may be worth a document (PEP?) that defines things properly and also rationalises and preserves enough of the existing behaviour (similar to what PEP 514 did).

I think the best way is to add a custom scheme instead of modifying the existing one, and to override get_default_scheme() and/or get_preferred_scheme().

However, we have already tried to have a scheme with {base}/local/... and it had very dangerous undesired side effects when users explicitly pass --prefix to some tools (such as pip). It did not really matter if we had our own scheme or overrode posix_prefix. We chose the latter to make distutils (and setuptools) respect it.

I was bitten by this while testing out my PEP-582 code. Thankfully @pradyunsg helped me to find the cause.

Procedural note, but if it is a change proposal that requires ecosystem coordination, then a packaging PEP might be the appropriate place to propose it, and once accepted, documenting accordingly in either the sysconfig docs or a new PyPA packaging spec. If it is more documenting existing (best) practice, or detailing the “why” behind each scheme, then e.g. an Explanation-type doc or section in the sysconfig docs would probably make more sense.

This is essentially what I was getting at when I said the docs need more guidance. What should tools like pip do when asked to do a --prefix install? From the stdlib docs, it appears that installing to get_preferred_scheme("prefix") is the correct thing to do. So if that doesn’t work, then it seems like there’s an issue in sysconfig - or Fedora aren’t patching it in the intended way (by which I mean the docs don’t explain in sufficient detail how distros are intended to patch sysconfig).

I assume that the main problem we had is that we changed the scheme from {prefix}/... to {prefix}/local/... while users actually assume the /local bit is part of the prefix. What we really needed to do was to change it to {local_prefix}/... but introducing new config variables is trickier than I thought.

(I use {prefix} above as a placeholder for {base} and the other base-like variables.)
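To make the difference concrete, here is a toy sketch with plain string templates. It does not touch sysconfig, the template dictionaries are simplified illustrations rather than the real schemes, and the names UPSTREAM, WITH_LOCAL and expand are hypothetical.

```python
# Illustrative templates only; real schemes have more keys and variables.
UPSTREAM = {
    "purelib": "{base}/lib/python{py_version_short}/site-packages",
    "scripts": "{base}/bin",
    "data": "{base}",
}

# The same scheme with "/local" baked into every template.
WITH_LOCAL = {
    key: template.replace("{base}", "{base}/local", 1)
    for key, template in UPSTREAM.items()
}


def expand(scheme, base, py_version_short="3.9"):
    """Fill in the placeholders of every template in `scheme`."""
    return {
        key: template.format(base=base, py_version_short=py_version_short)
        for key, template in scheme.items()
    }


if __name__ == "__main__":
    # Default install: both approaches land in /usr/local/...
    print(expand(WITH_LOCAL, "/usr") == expand(UPSTREAM, "/usr/local"))
    # Explicit prefix: only the baked-in variant adds the extra directory.
    print(expand(UPSTREAM, "/tmp/foo")["scripts"])
    print(expand(WITH_LOCAL, "/tmp/foo")["scripts"])
```

For the default install the two variants agree. They only diverge once the caller supplies the base, which is the --prefix case.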

For what it’s worth, this is exactly the reason we had problems in spack. Everything was designed around --prefix actually installing to the provided prefix rather than some subdirectory of the prefix that we can’t change. Is there a term for what the local/ token is in this context if it is not currently part of the prefix?

I don’t think so. I think this is a bug.

Can you elaborate on what the side effects are?

Well, see the initial post of this topic.

Also:

https://bugzilla.redhat.com/show_bug.cgi?id=2026979

https://bugzilla.redhat.com/show_bug.cgi?id=2097183

To be fair, I don’t think those effects are in themselves “dangerous undesired side effects”. Rather, I think they are precisely the intended behaviour if you set the posix_prefix scheme to use paths like {base}/local/... and set the preferred scheme for prefix to posix_prefix. The problem seems to be that you (Fedora) didn’t intend to affect --prefix in this way, and yet you can’t find a way to implement your actual intention in a way that doesn’t also affect --prefix. I’m assuming that it’s uncontroversial here that when someone passes --prefix <dir> to pip, pip should take that as meaning “use the preferred scheme for the prefix key with base set to <dir>”?

Maybe the issue here is that the posix_prefix scheme is being used in multiple ways, and the meaning of {base} isn’t consistent across all of those ways? That’s what I had in mind when I said that the sysconfig docs need more guidance. Someone is clearly using posix_prefix with a value of base that doesn’t match the expected usage. The problem is that we can’t work out who :slightly_frowning_face:

If I am not missing anything, it should be solvable with the aforementioned approach:

  • Add a custom scheme, say fedora_prefix, pointing to {base}/local/...
  • Patch sysconfig.get_default_scheme() to return fedora_prefix
  • pip install should use the scheme returned by get_default_scheme() if --prefix isn’t given.
  • pip install --prefix should use the scheme returned by get_preferred_scheme('prefix')

This requires changes on both sides: the distro’s sysconfig patch and pip.
