PEP 648: Extensible customizations of the interpreter at startup

mariocj89 · January 10, 2021, 2:14pm

I’m not sure about that. How will entry points fulfill interpreter customizations like fault handlers, installing custom signal handlers, exception handlers, logging customizations, or any special setup needed to load a particular library? Those are unrelated to any entry point that is executed unless I misunderstood how it would work.

For things like virtualenv, it might work, as it might be possible to replace the python executable that it exposes to users via a custom entry point. But then, how will entry points compose? Other entry points will need to know that they need to call another entry point to maintain such behaviour.

steve.dower · January 11, 2021, 9:21am

Entry points are just registered module:function names, associated with an intent. One such intent is “console script”, which tools can use to generate executables to launch that entry point directly. Another is “setuptools command”, which is used by setuptools to identify extended commands provided by libraries (such as bdist_wheel).

So I’m not suggesting reusing the console script intent, but adding a new category that CPython will read at startup, calling each registered function. It should be as simple as specifying the name of that category and implementing it, as the semantics are all already defined (for better or worse).
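
To illustrate the "module:function" form described above, here is a minimal sketch of how such a reference can be resolved to a callable. The helper name `resolve_reference` is hypothetical; it only approximates what `importlib.metadata.EntryPoint.load()` does and is not the stdlib implementation.

```python
from importlib import import_module


def resolve_reference(reference):
    """Resolve a 'package.module:attr.subattr' string to a Python object."""
    module_name, _, attr_path = reference.partition(":")
    obj = import_module(module_name)
    # Walk the dotted attribute path, if any, starting from the module.
    for attr in filter(None, attr_path.split(".")):
        obj = getattr(obj, attr)
    return obj


# Example with stdlib names, so the sketch runs without extra packages.
dumps = resolve_reference("json:dumps")
print(dumps({"startup": True}))
```

A new startup category would reuse exactly this resolution step; only the group name and the moment of invocation would be new.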

brettcannon · January 13, 2021, 7:39pm

To give a concrete example, let’s say I have a package spam that has some startup stuff it needs. Let’s also say that Python supported a startup entry point in the python group. Using PEP 621 I could define the following for my spam package:

[project.entry-points.python]
startup = "spam:startup"

Then Python would just check all the installed packages for a python group with a startup key and then execute that callable.

mariocj89 · January 14, 2021, 8:54am

I see. Thanks!

Yeah, I think that makes sense and that is probably how we can implement the part of PEP 648 where I mentioned: “We will also work with build backends on facilitating the installation of these files.”

But before we can implement that in a library, we need a way for the interpreter to execute those, don’t we?

If we have PEP 648, we can then just implement that by injecting a script into the __sitecustomize__ folder of the site where the package is being installed.
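
As a rough sketch of the directory-based mechanism being discussed, the following runs every script found in a `__sitecustomize__` folder under a given site directory. The function name `run_startup_scripts`, the alphabetical ordering, and the choice to keep going after a failing script are all assumptions made for illustration, not behaviour taken from the PEP text.

```python
import runpy
from pathlib import Path


def run_startup_scripts(site_dir, folder="__sitecustomize__"):
    """Run each *.py file in <site_dir>/<folder>, sorted by file name.

    Returns the names of the scripts that ran without raising.
    """
    script_dir = Path(site_dir) / folder
    if not script_dir.is_dir():
        return []
    executed = []
    for script in sorted(script_dir.glob("*.py")):
        try:
            runpy.run_path(str(script))
        except Exception as exc:
            # Assumption: one broken script should not block the others.
            print(f"startup script {script.name} failed: {exc!r}")
            continue
        executed.append(script.name)
    return executed
```

Calling `run_startup_scripts(path)` for each entry returned by `site.getsitepackages()` from a usercustomize.py file would be one way to prototype the idea.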

steve.dower · January 14, 2021, 11:06am

More likely we’d just add it directly to site.py, if it was going to be standard. But yeah, it would be easy to test with a customization.

I still like having the site customize folder, but only for the admin to manage, and not for packages to put stuff into. It’s really hard convincing my colleagues that they should be customising their deployed environments rather than relying on pip in production (we have some… unusual scenarios), and I’d rather give them one more reason to own that step themselves than one more way for third parties to introduce breaks.

jaraco · January 18, 2021, 7:49pm

This approach might also have the benefits of allowing for extension of the behavior, such as addressing the sorting concern, for example by honoring an optional ‘priority’ or ‘order’ attribute on the resolved module or callable.
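
A minimal sketch of the ordering idea, assuming each resolved callable may carry an optional `priority` attribute and that lower values run first. The attribute name, the default of 0, and the example hook names are all hypothetical.

```python
def order_startup_hooks(hooks, default_priority=0):
    """Sort callables by an optional 'priority' attribute (lower runs first).

    sorted() is stable, so hooks with equal priority keep their input order.
    """
    return sorted(hooks, key=lambda fn: getattr(fn, "priority", default_priority))


def configure_logging():
    return "logging"


def install_fault_handler():
    return "faulthandler"


def load_plugin():
    return "plugin"


install_fault_handler.priority = -10  # wants to run early
load_plugin.priority = 5              # wants to run late

ordered = order_startup_hooks([load_plugin, configure_logging, install_fault_handler])
print([fn.__name__ for fn in ordered])
# -> ['install_fault_handler', 'configure_logging', 'load_plugin']
```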

mariocj89 · January 20, 2021, 4:05pm

More likely we’d just add it directly to site.py

Not sure I understand what you mean by that. What would site.py execute? The packages will need to create “something” that is executed, won’t they?

pf_moore · January 20, 2021, 4:40pm

I assume the idea would be that something like the following would be added to site.py:

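from importlib.metadata import entry_points

eps = entry_points()
# Python 3.10+ exposes select(); 3.8/3.9 return a dict keyed by group name.
if hasattr(eps, "select"):
    core_eps = eps.select(group="python")
else:
    core_eps = eps.get("python", ())  # avoids KeyError when no package uses the group
for ep in core_eps:
    if ep.name == "startup":
        fn = ep.load()  # import the module and look up the callable
        fn()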

It should be possible to prototype this by adding the same code to a usercustomize.py file.

Individual packages just declare an entry point:

from setuptools import setup

setup(
    ...
    entry_points={
        "python": [
            "startup = mypkg.main:startup",
        ]
    },
)

Basically, as I understand it, what @steve.dower is saying is that this proposal doesn’t need any changes to core Python, just to the stdlib site module. The same is of course true of the original PEP 648 proposal…

mariocj89 · January 21, 2021, 7:38am

Right, understood. Thanks a lot :).

I think I prefer the __sitecustomize__ approach, as there is “just one way” to customize things at startup. Additionally, I’d expect the __sitecustomize__ option to be more performant, as it is a single directory to scan if implemented as such.
Another option would be to implement the startup entry_points by just dropping them in __sitecustomize__. I really think that brings the best of both worlds (though maybe we should not call them entry_points).

Of course, that is just an opinion.

davidism · January 21, 2021, 2:46pm

If we’re going to pick one, I prefer the entry point idea. It seems like it’s a lot easier to introspect, both at the package level (can look at setup.cfg before you even install it) and at runtime (use importlib.metadata.entry_points() as shown above).

pf_moore · January 21, 2021, 3:12pm

Agreed. As an established mechanism, entry points are supported by existing tools. Whereas, if we use __sitecustomize__, we’d need to consider adding a means of querying what startup scripts exist, what package owns them, etc, etc. That’s all covered by the existing entry point machinery.
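
As a sketch of the introspection described above, the following lists which installed distribution declares a given entry point. The group "python" and the name "startup" are the hypothetical values used earlier in this thread, and `find_entry_point_owners` is an illustrative helper, not an existing API.

```python
from importlib.metadata import distributions


def find_entry_point_owners(group="python", name="startup"):
    """Return (distribution name, version, target) for each matching entry point."""
    owners = []
    for dist in distributions():
        for ep in dist.entry_points:
            if ep.group == group and ep.name == name:
                owners.append((dist.metadata["Name"], dist.version, ep.value))
    return owners


for dist_name, version, target in find_entry_point_owners():
    print(f"{dist_name} {version} -> {target}")
```

On an environment where no package declares such an entry point, the loop prints nothing.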

mariocj89 · January 21, 2021, 3:28pm

To be honest, I never thought about the need to track a script back to the package that installed it, beyond it being listed in RECORD/installed-files for uninstall.

Why do you think a user might need that? What would a user do interactively with these files?

bernatgabor · January 21, 2021, 3:54pm

Entry points would provide better origin tracking by using existing interfaces, and would also fail more gracefully when two packages try to generate/use the same name, no?

pf_moore · January 21, 2021, 4:01pm

I don’t honestly know what sort of things packages will end up doing in startup scripts, but I’m sure someone will want to know where an unexpected startup action came from, or what things are happening at startup.

With the PEP, it’s not even that easy to know where to look for __sitecustomize__ directories - “a folder named __sitecustomize__ located in a site path” could be a number of locations. And having found the file, checking RECORD files for who owns it is pretty non-obvious unless you’re a packaging specialist. Conversely, for entry points, all of the discoverability is already there, and supported by existing machinery.

I’m honestly not sure why you see any advantage to implementing a new mechanism, when entry points have been around for ages and handle the job well. It would be new to have them used by core Python, and if they were still a setuptools feature, that might well be inappropriate, but now that importlib.metadata supports them, they are a standard mechanism and I think we should use them.

bernatgabor · January 21, 2021, 4:15pm

I’ll note here, though, that using entry points mandates scanning the entire site-packages folder(s) for .dist-info directories and then parsing the metadata within them. That can be a significant startup overhead when using a network or slow disk with lots of packages installed, and it must be performed at every startup. In other words, you have to parse the metadata for all installed packages (in my experience 120+ in a system site-packages), even if all you want is to print the number 1.
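
One way to get a feel for this cost on a given environment is to time the metadata scan directly. This is only a rough sketch: `time_entry_point_scan` is a hypothetical helper, and a single run mostly reflects filesystem cache state, so results will vary widely between machines and between cold and warm starts.

```python
import time
from importlib.metadata import distributions


def time_entry_point_scan():
    """Return (number of distributions, seconds spent reading their entry points)."""
    start = time.perf_counter()
    count = 0
    for dist in distributions():
        _ = dist.entry_points  # forces entry_points.txt to be read and parsed
        count += 1
    return count, time.perf_counter() - start


count, seconds = time_entry_point_scan()
print(f"scanned {count} distributions in {seconds * 1000:.1f} ms")
```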

steve.dower · January 21, 2021, 6:43pm

This is a very good point, and one in favour of having just a directory of scripts as the core mechanism (even if packaging tools use entry-point-like definitions to generate it).

Me neither. The only things I can think of are providing an encoding without having to modify the app itself, or overriding the interactive prompt (which is currently possible by providing a readline module). I have used it once to inject some audit hooks, but that was a narrow enough case that “add an import statement to sitecustomize.py” would have sufficed.
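
For readers unfamiliar with audit hooks, here is a minimal sketch of the kind of code a sitecustomize.py could import to install one. The hook below only records "open" events in an in-memory list; the names `opened_paths` and `record_open` are hypothetical, and a real deployment would more likely forward events to a log.

```python
import os
import sys

opened_paths = []  # hypothetical in-memory log, for illustration only


def record_open(event, args):
    """Audit hook: remember the first argument of every 'open' event."""
    if event == "open":
        opened_paths.append(args[0])


# Audit hooks cannot be removed once added; this one stays for the process lifetime.
sys.addaudithook(record_open)

with open(os.devnull) as handle:
    pass

print(os.devnull in opened_paths)
```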

Everything else I can come up with is something that can be done on an import that was going to happen anyway. Perhaps with a bit more work on the part of the module being imported, but better than forcing everyone to do it whether they need it or not.

davidism · January 21, 2021, 6:48pm

Speaking of installing, how is a package meant to get something in to the special directory? The directory is external to the package and isn’t necessarily in the same path as where the package is being installed.

Packages that use (abuse?) .pth files seem to rely on executing setup.py install. Here are some examples:

davidism · January 21, 2021, 6:52pm

I’ll also say that while I’d vote for entry points if I had to, I’d rather not do anything. I’d instead encourage users to opt in to behavior at import. This can be done with a small wrapper, as demonstrated in future-fstrings.

steve.dower · January 21, 2021, 6:55pm

In general, .pth files can go at the top level of a package.

To do it properly, we’d specify something that can go inside .data in a wheel, and then it’s up to build backends how they let you put it there.
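
For context on the .pth mechanism mentioned above: the site module executes any line in a .pth file that starts with "import". The sketch below demonstrates this with a throwaway directory; the file name `demo_startup.pth` and the marker attribute `sys._demo_pth_ran` are hypothetical and exist only for this demonstration.

```python
import site
import sys
import tempfile
from pathlib import Path


def demo_pth_execution():
    """Write a .pth file into a temp dir and let site.addsitedir() process it."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "demo_startup.pth").write_text(
            "import sys; sys._demo_pth_ran = True\n", encoding="utf-8"
        )
        # addsitedir() appends tmp to sys.path and executes 'import' lines
        # found in the .pth files it contains.
        site.addsitedir(tmp)
        if tmp in sys.path:
            sys.path.remove(tmp)  # tidy up the entry for the temp directory
    return getattr(sys, "_demo_pth_ran", False)


print(demo_pth_execution())
```

At real interpreter startup the same processing happens for every site directory, which is why packages can use a .pth file to run code without being imported.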

steve.dower · January 21, 2021, 6:59pm

The contortions in setup.py you’ve identified are mostly because setuptools doesn’t let you customise your package layout as much as you need here. (Which is the reason I gave up and wrote my own backend so that I could.)