PEP 648: Extensible customizations of the interpreter at startup

For shared environments (such as a “standard” Docker container in an organisation), yes. Someone depends on it, and so it has to be there, but someone else does not.

Ultimately, the ideal case should be for packages to not rely on startup customisations at all but do it at import time. Nearly every scenario I’m aware of can do this transparently, and the rest can definitely do it with an additional explicit import.

So even though I think this is a good proposal, I still would strongly discourage anyone from ever modifying it in a publicly distributed package. It’s very useful for system integrators (aka the installer, not the installee), but not something that can transparently Just Work.

It’s not a failure. A strategy that has been appropriate for the last 20 years, no longer is. It’s a breaking change that distro package maintainers will need to be aware of. I’m sure they’ll deal with it but it isn’t transparent to them.

… or the strategy is still appropriate, and lazyimports shouldn’t be unconditionally messing with core Python behaviour even for programs that don’t import or use it.

I get what you’re saying, that this feature has the potential to be abused, but that doesn’t mean that we shouldn’t allow the feature, but rather that the fault lies with the abuser (in your hypothetical case, lazyimports), not the victims.

I’ve just tested this and -S breaks venvs, as of Python 3.9; with -S, Python cannot see packages installed in the venv. We have a slightly forked venv system, but this is the same reason why -S is a non-starter for us.

I wouldn’t expect -S and working venvs to be mutually exclusive, but -S is documented as completely disabling site, and site is documented as handling the reading of pyvenv.cfg, so it would appear that this is not a bug.
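For anyone who wants to check this on their own setup, here is a rough sketch that starts a child interpreter with and without -S and reports what it sees. The helper name `probe_site` is made up for this example; `-S` and `sys.flags.no_site` are standard. Run it from inside a venv and compare the two `sys.path` listings.

```python
import json
import subprocess
import sys

# Code executed in the child interpreter: report whether site processing
# was disabled and what ended up on sys.path.
PROBE = (
    "import json, sys; "
    "print(json.dumps({'no_site': bool(sys.flags.no_site), 'path': sys.path}))"
)


def probe_site(*flags):
    """Run the current interpreter with extra flags and return its report."""
    result = subprocess.run(
        [sys.executable, *flags, "-c", PROBE],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    for flags in ((), ("-S",)):
        report = probe_site(*flags)
        print("flags:", flags or "(none)", "no_site:", report["no_site"])
        for entry in report["path"]:
            print("   ", entry)
```

In a venv, the run without -S should list the venv's site-packages, while the -S run should not, which matches the behaviour described above.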

I just wanted to say that overall I’m +1 on this. Though I’ll note this is just syntactic sugar over what we already have with pth files. virtualenv, for example, uses a _virtualenv.pth that does import _virtualenv and a corresponding _virtualenv.py, mostly to patch distutils to play nicely with packaging.
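To make the existing mechanism concrete, here is a small sketch of the pth trick in isolation. It is not virtualenv's actual code: the module name `_demo_pth_hook` and the helper are invented for the example. It relies only on `site.addsitedir`, which processes .pth files in a directory and executes lines that start with `import`.

```python
import pathlib
import site
import sys
import tempfile


def demo_pth_import(module_name="_demo_pth_hook"):
    """Write a module plus a one-line .pth file and let site process them.

    Returns True if the .pth line caused the module to be imported.
    """
    saved_path = list(sys.path)
    with tempfile.TemporaryDirectory() as tmp:
        root = pathlib.Path(tmp)
        # The module that carries the real customization code.
        (root / f"{module_name}.py").write_text("LOADED = True\n", encoding="utf-8")
        # The .pth file only contains an import line, which site executes.
        (root / f"{module_name}.pth").write_text(
            f"import {module_name}\n", encoding="utf-8"
        )
        site.addsitedir(str(root))
        loaded = module_name in sys.modules
        # Undo the side effects so the demo leaves the process unchanged.
        sys.modules.pop(module_name, None)
        sys.path[:] = saved_path
    return loaded


if __name__ == "__main__":
    print("module imported via .pth:", demo_pth_import())
```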

Thanks again for the feedback @mauve, but I think this might be more of an issue with the way the dependencies are packaged, as you mention. Having a dependency that you do not use and that is causing you harm seems quite unproductive and almost a mistake.

As mentioned before, I’d be surprised to see many libraries doing such a thing, with most of them focusing on “making the library work”, but of course there are no guarantees that a library developer will not misuse the feature. That said, I’d like to note that I think this is probably a net gain for you (I might be wrong though). You mentioned that you had to walk pth files and patch/delete some of them, due to libraries not doing the right thing for your application. With this feature, you’ll be able to find all those customizations in a single folder, knowing that all scripts in that folder are indeed customizing the interpreter, and remove any of them if necessary.

Which we want to deprecate, so this would become the replacement. View it in that light, rather than just an alternative way to put arbitrary code in pth files.

I do. My point was more along the lines that this new feature does not add or remove any security flaws compared with what we already have out there today, which is neither a bad nor a good thing. Though, yes, it will open the door to eventually deprecating pth files.

While I like getting rid of code in the .pth files, I’m not sure how the proposal will help with a rather typical setup for system wide installations of Python:

  • you want to use system provided packages wherever possible and only rely on pip installed versions where needed (simply to benefit from the OS vendor support as much as possible)
  • you may want to setup certain details in a system wide way, e.g. use auditing hooks for all Python scripts
  • you want to have users (or more generally: accounts used for applications) be able to add additional customizations for their Python environments, which get loaded in addition to the system provided ones, e.g. set sys.ps1 and sys.ps2 or install an exception or display hook.
  • finally, you want all of this to work seamlessly in virtualenvs or venvs.
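As an illustration of the kind of user level customization meant in the third bullet, here is a minimal sketch. The function name and the prompt strings are arbitrary choices for the example; in practice the body would live in a usercustomize.py or in a script in the proposed customization directory.

```python
import sys


def apply_user_customizations():
    """Set interactive prompts and wrap the exception hook.

    Returns the previous exception hook so callers can restore it.
    """
    # Prompts used by the interactive interpreter.
    sys.ps1 = "py> "
    sys.ps2 = "... "

    previous_hook = sys.excepthook

    def excepthook(exc_type, exc, tb):
        # Add a short marker line, then defer to the previous hook.
        print(f"[user hook] uncaught {exc_type.__name__}", file=sys.stderr)
        previous_hook(exc_type, exc, tb)

    sys.excepthook = excepthook
    return previous_hook


if __name__ == "__main__":
    apply_user_customizations()
    print("prompts:", repr(sys.ps1), repr(sys.ps2))
```

A system wide script could register an auditing hook with sys.addaudithook in the same style; it is left out here because audit hooks cannot be removed once added.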

I don’t see how a single customization approach can handle these cases. Perhaps I’m missing something, but the way I read the PEP, we’d need a __usercustomize__ dir to mimic the usercustomize.py behavior and allow user level customizations in the same way.

Moving the __sitecustomize__ dir to the user’s dir or even inside a virtualenv, will have it override the system wide customizations, which admins will not want. They’d also not want to give users write access to the system wide __sitecustomize__ dir.

Something else I don’t understand is how the proposal would help with packages wanting to add something to such customization dirs: how would those scripts get added to the customization dirs (should pip put them there ?) and more importantly: how would the order be determined (I can already see packages fighting for the best 000000_run_me_first.py entry :slight_smile: ) ?

Given the virtualenv setup, I imagine that this would need yet another special dir, e.g. __venvcustomize__, so as to not have packages installed in the virtualenv override user level settings.

I guess a work-around for all those special dirs would be to use a single dir name for all these customization dirs and have site.py go through all sys.path entries and add all (let’s say) __pycustomize__ dirs to (say) a sys.customizedirs list which it then processes as described in the PEP, one by one and from left to right.
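A rough sketch of that work-around, to make the idea concrete. This is not what the PEP specifies: `__pycustomize__` is only the placeholder name used above, both function names are invented, and running the scripts sorted by filename within each directory is an assumption.

```python
import pathlib
import runpy
import sys

CUSTOMIZE_DIRNAME = "__pycustomize__"  # placeholder name, not from the PEP


def find_customize_dirs(search_path, dirname=CUSTOMIZE_DIRNAME):
    """Return the customization dirs found on search_path, left to right."""
    found = []
    for entry in search_path:
        candidate = pathlib.Path(entry) / dirname
        if candidate.is_dir() and candidate not in found:
            found.append(candidate)
    return found


def run_customizations(customize_dirs):
    """Execute every *.py script in each dir and return the scripts run.

    Directories keep their search-path order; scripts inside a directory
    are run sorted by filename (an assumption made for this sketch).
    """
    executed = []
    for directory in customize_dirs:
        for script in sorted(directory.glob("*.py")):
            runpy.run_path(str(script))
            executed.append(script)
    return executed


if __name__ == "__main__":
    dirs = find_customize_dirs(sys.path)
    print("customization dirs found:", [str(d) for d in dirs])
```

With this layout, system, user, and venv level directories would all be processed, and their relative order would follow sys.path rather than any one directory overriding the others.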

I’m still worried about the startup time implications, though, since importing lots of small customization scripts for every single Python start will cost a lot of time. This is actually the major drawback of the current .pth file approach as well. The PEP approach would only save the startup time spent compiling the code in those .pth files.

Thanks @malemburg .

On the different site paths, the proposal evaluates all of them. All site paths will be scanned for __sitecustomize__ the same way they are already scanned for pth files.

how would the order be determined (I can already see packages fighting for the best 000000_run_me_first.py entry :slight_smile: ) ?

Sadly, that might indeed happen, though it does not seem to have happened with pth files. The alternative, running things in random order, seems worse than that, though.

I’m still worried about the startup time implications

It should only impact systems that install those packages, at which point it seems valid to spend that startup time, as the users of those systems “requested” those packages.

(should pip put them there ?)

Maybe. I need to work with build backend maintainers on whether this is something we want to expose easily. It is already possible to do it; the question is whether to make it easier there as well.

you may want to setup certain details in a system wide way, e.g. use auditing hooks for all Python scripts

That should be fine: those system wide tools can just drop scripts in the new folder, similar to what they were doing with sitecustomize.py, but now with the added benefit of being able to split things across different scripts.

you want to have users (or more general: accounts used for applications) be able to add additional customizations for their Python environments

Fine as well, they can drop them in the same folder within usersite.

finally, you want all of this to work seamlessly in virtualenvs or venvs.

Should work as well: venvs have their own site directory, so they can just drop files there too.

Will a virtual environment “see” the customization files installed in the system and/or user directories? For sitecustomize.py and .pth files they don’t, so I assume you’ll say that the same applies here. But it would be good to be explicit, as @malemburg’s comments seem to suggest that he expects site-level customisations to apply in virtual environments as well (“so as to not have packages installed in the virtualenv override user level settings”).

I’m fairly certain user level sitecustomize.py is not active within virtual environments.

Oh, I totally misunderstood that, thanks for clarifying. Yeah, I would not expect that to happen, similar to how neither pth files nor sitecustomize are available today. The interpreter will only see the scripts in __sitecustomize__ that are within a site path.

Just to clarify and confirm @bernatgabor:

Python installed in a default venv will still look for system wide installed sitecustomize modules (in the main Python lib dir, not in the system site-packages), first in the system dir, then in the venv site-packages.

It only looks for usercustomize modules in case user local site-packages are enabled for venvs (which can be done via the system wide sitecustomize, but only has the effect of enabling usercustomize import – not the user local site-packages as one would expect). This is then also searched in the system dir and then the venv site-packages.

Now, in order to get access to the OS packages as mentioned, you’d create the venvs with --system-site-packages. In that case, you can also place the sitecustomize module into the system site-packages, since that’ll be on the sys.path as well.

This is only for venvs. The situation is different for regular calls to the system Python. I haven’t checked how virtualenv (the original tool) behaves. It all gets even weirder once you start to define PYTHONPATH as env var :slight_smile:

The logic around all this is highly complex and not necessarily intuitive. It’s codified in site.py.
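Since the behaviour depends on how the interpreter was started, it can help to inspect a given environment directly. The sketch below only reports what the running interpreter ended up with; the function name `describe_site_setup` is invented, while the `sys` and `site` attributes it reads are standard.

```python
import pprint
import site
import sys


def describe_site_setup():
    """Collect the site-related state of the running interpreter."""
    return {
        # In a venv, sys.prefix points at the venv and base_prefix at the
        # installation it was created from.
        "in_venv": sys.prefix != sys.base_prefix,
        "prefix": sys.prefix,
        "base_prefix": sys.base_prefix,
        "site_packages": site.getsitepackages(),
        # True, False, or None depending on how site was configured.
        "user_site_enabled": site.ENABLE_USER_SITE,
        "user_site": site.getusersitepackages(),
        # Whether the customization modules were actually imported.
        "sitecustomize_loaded": "sitecustomize" in sys.modules,
        "usercustomize_loaded": "usercustomize" in sys.modules,
    }


if __name__ == "__main__":
    pprint.pprint(describe_site_setup())
```

Running it with the system Python, in a default venv, and in a venv created with --system-site-packages shows the differences described above.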

@mariocj89:

What I meant is that you need to be able to have multiple __pycustomize__ dirs, which then all get scanned in order to maintain the logic we have for the different invocation scenarios. One dir is not enough to emulate both sitecustomize and usercustomize, since you may well have the case that packages are pulled in from multiple directories on the sys.path at various levels of the directory hierarchy (system level, user level, application level, etc.).

Of course, you can argue that usercustomize is no longer needed with venv et al. and I wouldn’t disagree :slight_smile: site.py has grown much too complex over the years and what may have been useful 15 years ago, may well no longer be needed.

Absolutely agree, the current proposal is already tackling that, it might not be clear enough though.

When site.py is adding each of the site paths, one of the sites that it is adding is usersite. At that point, it will also scan for __sitecustomize__.

There’s the Entry points specification in the Python Packaging User Guide. Are you thinking of something more than this?

Yeah, put entrypoints · PyPI in the stdlib :slight_smile:

What does entrypoints offer over importlib.metadata?

Why, when we have importlib.metadata there instead? :wink:

Better advertising, apparently :clown_face:

Given that, why aren’t we just picking an entry point name to execute at startup? That should meet all of the existing needs for venv/nosite/etc., right?

(For those who also didn’t realise, https://docs.python.org/3/library/importlib.metadata.html#entry-points)
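A sketch of what that could look like, purely as an assumption about the shape of the idea: the group name `python.startup` and both function names are hypothetical, and nothing here is part of the PEP. Only `importlib.metadata.entry_points` and `EntryPoint.load` are real APIs.

```python
from importlib.metadata import entry_points

STARTUP_GROUP = "python.startup"  # hypothetical group name


def find_startup_hooks(group=STARTUP_GROUP):
    """Return the entry points registered under group, sorted by name."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ returns an object with a select() method.
        selected = eps.select(group=group)
    else:
        # Older versions return a plain dict keyed by group name.
        selected = eps.get(group, [])
    return sorted(selected, key=lambda ep: ep.name)


def run_startup_hooks(group=STARTUP_GROUP):
    """Load and call each hook, returning the names that were run."""
    ran = []
    for ep in find_startup_hooks(group):
        hook = ep.load()  # imports the module and resolves the object
        hook()
        ran.append(ep.name)
    return ran


if __name__ == "__main__":
    print("startup hooks run:", run_startup_hooks())
```

Sorting by name is an arbitrary choice to get a deterministic order. Note that discovering entry points means scanning the metadata of installed distributions, so this would have its own startup cost.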
