PEP 582 - Python local packages directory

Right, that’s the short-term cost: adding one means it will take longer to roll out, but once it’s rolled out, that cost is paid and there’s no ongoing pain.

The flip side is the long-term cost for people on Windows who have multiple versions of Python installed (or who switch from one version to another), and that cost will just never go away.

I agree, so speaking with my SC hat on, my first piece of feedback would be wanting versioned directories on all platforms.

I will push another update to the PEP tonight. We will have a function in sysconfig that returns a partial scheme, which interested installers can then use. And the paths will be versioned, as you all suggested.

Why not just use a normal scheme? IMO sysconfig is already difficult enough to work with, without adding the idea of a “partial” scheme.

The “partial” scheme here would include all the necessary keys to install wheels. It would provide purelib, platlib, scripts, and data.

If we have a full scheme, things will leak to the system, which is already a mistake we made with virtual environments, and none of the missing keys (stdlib, platstdlib, include, platinclude) should be installed to anyway[1].
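
To make the split concrete, here is a minimal sketch that filters a full scheme down to the four wheel-install keys named above. It only relies on the existing sysconfig.get_paths(); the partial_scheme helper and the WHEEL_KEYS name are made up for illustration and are not part of sysconfig or the PEP.

```python
import sysconfig

# The four keys the "partial" scheme is said to provide.
WHEEL_KEYS = ("purelib", "platlib", "scripts", "data")


def partial_scheme(paths):
    """Keep only the wheel-install keys of a scheme (hypothetical helper)."""
    return {key: paths[key] for key in WHEEL_KEYS if key in paths}


full = sysconfig.get_paths()          # default scheme for this interpreter
partial = partial_scheme(full)
omitted = sorted(set(full) - set(partial))

print("kept:", sorted(partial))
print("omitted:", omitted)
```

The omitted keys are the ones that point at interpreter-wide directories, which is where the leaking concern comes from.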

IMO it is unwise to introduce new API/design that is bad by design.

You have two options:

  1. Build a full scheme yourself

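    # get_local_packages_paths() is the helper proposed in this thread, not an existing sysconfig function
    scheme = sysconfig.get_paths() | sysconfig.get_local_packages_paths()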
    

    You will have to write new code for PEP 582 anyway, so it’s not like this will be the thing preventing you from using old versions of installers (e.g. pip), and I think it’s a reasonable enough ask.

  2. Handle missing keys in installers

    This would personally be my preference, and something installers should be doing anyway, as PEP 427 does not specify a canonical list of .data keys.

    The only breakage this may cause is not installing the headers data, which I think beats installing it on the system, a completely unrelated location. Realistically, this should basically only affect package building, which is recommended to be done in isolation anyway.
    But if you think it is too much of a risk, go with 1), while acknowledging its issues. Once the headers issue has been dealt with, you can then go back to just using the “partial” scheme as a full scheme.
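
Both options can be sketched in a few lines. This is only an illustration: the partial dictionary and its directory names are invented (the real layout is up to the PEP), and destination_for is a hypothetical helper, not something pip or sysconfig provides.

```python
import sysconfig
import warnings

# Hypothetical partial scheme; the directory names are placeholders.
partial = {
    "purelib": "__pypackages__/lib",
    "platlib": "__pypackages__/lib",
    "scripts": "__pypackages__/bin",
    "data": "__pypackages__",
}

# Option 1: fill the missing keys from the default scheme.
# Dict union needs Python 3.9+; keys from `partial` win on conflict.
full = sysconfig.get_paths() | partial


def destination_for(scheme, key):
    """Option 2: return the directory for a wheel .data key, or None if the scheme omits it."""
    try:
        return scheme[key]
    except KeyError:
        warnings.warn(f"scheme has no {key!r} path; skipping those files")
        return None
```

With option 1 the merged scheme still carries the system-wide include directories, which is the leak discussed above; with option 2 a key such as headers is skipped with a warning instead.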


[1] stdlib/platstdlib should definitely not be installed to; I think it’s clear enough why. include/platinclude are trickier. I would strongly discourage people from using them, as they are system directories and hence not isolated in virtual environments (!!), but installers do use them to calculate the path for the headers key, which is needed to keep backwards compatibility with the distutils install paths. I would also strongly discourage people from using that key, and recommend that installers raise a warning. Most projects have already moved away from it, in favor of installing the headers as module data. A few still remain, and IMO we should encourage them to move away from it.

That said, sysconfig suffers from being stuck with an old design from the distutils days. I think it needs some work to better match today’s model and to be easier for installers and similar users to work with, but I am still unsure how exactly to do that in a way that both makes sense and doesn’t break things. The documentation can definitely be improved, though; I have struggled there too, and should take another look.

If we can use the existing APIs, and handle KeyError if it gets raised, I don’t see the problem. Why wouldn’t we? It’s the documented interface.

I’m -1 on creating a new interface for no better reason than to protect people who don’t support the existing API properly. (And I’m happy to consider it our bug, and fix it, if pip currently has an issue because of this).

I completely support this, and I would love to see improvements like you suggest. It’s a bit off-topic for this thread, but thanks for confirming that this is your intention :slightly_smiling_face:

Hum… I interpret the current documentation as clearly stating which keys we should expect :sweat_smile:

From sysconfig — Provide access to Python’s configuration information — Python 3.12.1 documentation

Each scheme is itself composed of a series of paths and each path has a unique identifier. Python currently uses eight paths:

stdlib: directory containing the standard Python library files that are not platform-specific.
platstdlib: directory containing the standard Python library files that are platform-specific.
platlib: directory for site-specific, platform-specific files.
purelib: directory for site-specific, non-platform-specific files.
include: directory for non-platform-specific header files for the Python C-API.
platinclude: directory for platform-specific header files for the Python C-API.
scripts: directory for script files.
data: directory for data files.

From sysconfig — Provide access to Python’s configuration information — Python 3.12.1 documentation

sysconfig.get_paths([scheme [, vars [, expand ]]])

Return a dictionary containing all installation paths corresponding to an installation scheme. See get_path() for more information.

If scheme is not provided, will use the default scheme for the current platform.

If vars is provided, it must be a dictionary of variables that will update the dictionary used to expand the paths.

If expand is set to false, the paths will not be expanded.

If scheme is not an existing scheme, get_paths() will raise a KeyError.

But I guess maybe it is not that explicit? Though, reading that, I personally wouldn’t expect any of the listed keys to be missing. So, I wouldn’t really blame any users for interpreting it the same way.
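
For reference, the documented behaviour quoted above can be checked with a short script. This is a sketch against the existing sysconfig API only; the scheme name "no_such_scheme" is deliberately bogus.

```python
import sysconfig

# Default scheme for the current platform, with variables expanded.
paths = sysconfig.get_paths()

# Same keys, but the raw templates (e.g. containing "{base}") are returned.
templates = sysconfig.get_paths(expand=False)

# An unknown scheme name raises KeyError, as documented.
try:
    sysconfig.get_paths("no_such_scheme")
    unknown_scheme_raises = False
except KeyError:
    unknown_scheme_raises = True

print(sorted(paths))
print(templates["purelib"])
print(unknown_scheme_raises)
```

Note that the documented KeyError is about an unknown scheme name; the documentation says nothing either way about a known scheme lacking one of the listed path keys.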

All in all, I understand your opinion regarding adding a new interface, but I think in this specific case it would be the most beneficial choice :face_with_diagonal_mouth:

It does not have any major drawbacks that I can see, and solves the issue in a reasonably clean way, considering the current model and interface weren’t really designed to handle such use-cases. We just shouldn’t make it common practice.

I wouldn’t want to use it in pip, especially if it’s considered “not common practice”. As I’ve said a few times, we’re trying to move pip to work consistently with (normal) sysconfig schemes. I don’t see a problem for pip if normal sysconfig schemes omit a path (I’d just say that if the user tries to install to such a scheme, any files in the wheel that would go in that path would be ignored with a warning). But if you feel that sysconfig schemes must have all of the listed paths, I’m fine with that.

I honestly don’t see why __pypackages__ can’t just have a normal scheme. There’s no reason I can understand why it has to be partial in any case. I guess that’s something @kushaldas needs to answer.

So my position is that I want the __pypackages__ scheme to be a normal scheme, which means that sysconfig doesn’t have to do anything special for it. If PEP 582 requires changes to sysconfig, my objection is with PEP 582, not with sysconfig.

I mean, we shouldn’t make adding new API a common practice. We should try to fit new use-cases into the existing API, and very carefully consider when it does and doesn’t make sense.

This is essentially a normal config scheme for pip, and you will need special handling for the local packages scheme even if you use the existing API anyway.

There are two things to consider here.

  1. It will leak

    This is something I’d really like to avoid, but I suppose that is already the case with virtual environments, so it wouldn’t really be a blocker.

  2. The current API isn’t ergonomic for this use-case

    The API isn’t designed for schemes that require extra variables: you need to copy the variables’ dictionary, update it, and pass it to sysconfig.get_path/sysconfig.get_paths.

    sysconfig.get_paths('local_packages', vars=sysconfig.get_config_vars() | {'local_packages_base': ...})
    

    Similarly to the missing scheme path issue, this will result in some API breakage, because we do not document that some schemes might require extra keys and raise an error if they are missing. I don’t think it’s as bad as the missing scheme paths, but I do think it will cause some breakage.
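
Both points can be seen with an existing scheme. The sketch below redirects the real posix_prefix scheme to a made-up base directory through vars; it is not the layout the PEP would define, and the path is a placeholder. As far as I can tell, current CPython merges the vars argument over the config variables, so passing only the overrides is enough here.

```python
import os
import sysconfig

# Placeholder project location, normalised so the prefix check works on any OS.
base = os.path.normpath("/tmp/project/__pypackages__")

# Override only the variables the wheel-install paths depend on.
paths = sysconfig.get_paths("posix_prefix", vars={"base": base, "platbase": base})

inside = sorted(key for key, path in paths.items() if path.startswith(base))
leaked = sorted(key for key, path in paths.items() if not path.startswith(base))

print("under the local base:", inside)
print("still pointing at the interpreter:", leaked)
```

The keys that end up in leaked are expanded from the interpreter's own base variables, which is the leak described in 1).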

I guess this mostly depends on how much weight you give to 1). Neither 1) nor 2) is a blocker, and I think 2)'s weight will be similar for most people.

Personally, I think the separate API is worth it, but it is not required. I’ve opened “Deprecating the `headers` wheel data key” to try to mitigate my worries with 1).

@kushaldas sorry for putting more weight on you, but I think you’ll have to make a call for what to put in the PEP. Or is it possible to give both options to the SC and have them choose?

To summarize, the two options are:

  • Normal sysconfig scheme
  • New API (sysconfig.get_local_packages_paths(base_directory))
    • The main downside is it being new API IMO
    • Easier/better to backport (no patching required)

OK. You’re the expert on sysconfig, I don’t want to dictate to you on what API you think makes sense.

But as you say, it does put the responsibility back on @kushaldas to define (in the PEP) the exact layout of the __pypackages__ directory, and how installers should behave when installing wheels into that location - if we’re not going to say “just treat it like any other scheme”, then the details need to be given explicitly.

I’ll reserve comment on whether I’m comfortable that the proposal is reasonable to implement in pip until I see what @kushaldas proposes.

For the record, it was my suggestion that we use a sysconfig scheme, in response to the fact that the original PEP was too vague to be implementable. Sorry if by suggesting that, I’ve made more work for you.

I think it’s a good suggestion; it’s just unfortunate that the existing API and model make it a bit difficult to implement. Not having it as a sysconfig scheme is still a reasonable option, and as mentioned, I hope to be able to better support this sort of use case in the future.

Regardless of whether local packages are implemented via a sysconfig scheme, sysconfig might want to provide a helper to facilitate implementing support for it. I haven’t thought much about that, and it is, of course, dependent on feedback from downstream users.

Thinking more and more about this, I don’t think there’s a clear answer. I think all options (no sysconfig scheme, a new sysconfig scheme w/ the existing API, adding a “partial” scheme w/ new API) are viable, and none really locks us into a specific design for the future. Like I said, I lean towards the new API option, but any of the other approaches would be reasonable.

I think following this path will keep the behavior the same as what we have today in sysconfig (even with the leak, as mentioned). I guess I will add this to the PEP, and open an issue against the PEP showing option 2. I would love to hear what the SC thinks about this part of the topic (which is not the main topic of 582).

Regardless of whether it’s a “proper” scheme or a “partial” one, now that it’s been established that the existing prefix scheme can’t be used, the PEP does need the following changes:

  1. Specify the exact directory layout of __pypackages__, including how the wheel locations (pure lib, platlib, scripts, data, include) map onto it. This needs to be in the spec part of the text, not in an example.
  2. Acknowledge that there is no current way for users to install into the __pypackages__ layout, and modify the section on transition to take that into account (i.e., remove any comments that refer to pip install --prefix as a transition measure).

As I see it, this will still work on POSIX systems. Only on Windows do we have to wait (or use the script I have).

Only if the sysconfig scheme is identical to the posix_prefix one (which it might not be if a redistributor has customised that scheme). But if you want to put that level of detail into the PEP, sure, go ahead. As long as you don’t give the impression that pip will support using --prefix as a way of installing into __pypackages__, that’s all I care about.

If it ends up being some kind of partial scheme that only has a site-packages directory, can’t you still install into it using pip install -t __pypackages__/lib/python3.10/site-packages/?

That of course breaks if it ends up being a full scheme with bin dir and everything.
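
As a rough sketch, the versioned target directory from the command above can be computed for whichever interpreter is running. The lib/pythonX.Y/site-packages layout is just the one used in that example, not something the PEP has fixed, and target_dir is a hypothetical helper name.

```python
import sys
from pathlib import Path


def target_dir(project_root, version=sys.version_info[:2]):
    """Build <root>/__pypackages__/lib/pythonX.Y/site-packages (assumed layout)."""
    major, minor = version
    return (
        Path(project_root)
        / "__pypackages__"
        / "lib"
        / f"python{major}.{minor}"
        / "site-packages"
    )


# Directory that could be passed to `pip install -t` for the current interpreter.
print(target_dir("."))
```

Because the version is part of the path, two interpreters sharing a project would each get their own directory.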

Maybe. I honestly don’t care, as long as there’s no suggestion in the PEP that it should “just work”. As the PEP is targeting this feature at beginners, I don’t want those beginners to have to be exposed to all of the issues with --target (no uninstall or upgrade support, does weird things with the bin directory, etc).

Similarly, I want to make sure the PEP doesn’t suggest --prefix for Linux users, only for us to then get a bunch of Windows users whose __pypackages__ is broken because “I read that I should use --prefix and it doesn’t work”.

In particular, I don’t want pip to be the place those confused users go trying to get help. Both for their sake and for ours.

I opened an issue to track the API suggestion. Meanwhile, I am also updating the PEP (waiting to merge) to ask installers to use a scheme to get the correct paths.

There’s another popular blog post complaining about the status of pip install: “One Does Not Simply 'pip install'” by Ian Wootten.

I really think this PEP will help and benefit a lot of devs (juniors or any level) and hopefully lead to a better standardised process for everyone (in future PEPs).
