I would like to use entry-points to load alternative implementations (i.e. to dispatch if available metadata is right, but have all information always available for introspection).
Additionally, I would like to make sure that nobody does a costly import by accident, but with a typical entrypoint.load() in a Python file that may be part of a heavy-weight package, it is easy to get slow imports.
Because of that, I want the entry-point to only point to a bunch of metadata.
During a discussion, the idea came up to make the entry-point not load a Python reference (to something like a dictionary) but instead load all relevant data from a .toml file which the entry-point points at.
Now I tried this and it works fine! Although the code to actually load the .toml needs to be a bit manual:
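import importlib.util
import tomllib

def load_entrypoint(ep):
    # ep.value is expected to look like "module:filename.toml"
    mod, _, filename = ep.value.partition(":")
    # `mod` seems to need to be top-level to truly avoid imports
    spec = importlib.util.find_spec(mod)
    reader = spec.loader.get_resource_reader(spec.name)
    # open_resource returns a binary file object, which tomllib.load requires
    with reader.open_resource(filename) as f:
        return tomllib.load(f)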
(This seems to work, I suspect zipped packages are OK, dunno if there is a niche I may be missing.)
However, importlib_metadata (new versions) seems to complain already when discovering entry-points (not wrongly: it notices that ep.value isn’t valid if it contains a /. Of course, if there is no / in the path it won’t notice this, but…)
validate-pyproject also complains, correctly.
Question: Discovering an arbitrary file rather than a Python object reference via an entry-point seems like a highly unusual idea.
Is there an alternative pattern for something similar? Does this seem OK apart from never really having been intended, or is it a bad idea that is best avoided?
I am not sure I fully understand what you are trying to achieve, so my suggestion here might very well be missing the target…
Shouldn’t you be using importlib.resources to load the file? One of the main goals of this library is to take care of zipped packages and all that stuff. Somehow the usage documentation is in the back port’s documentation (importlib_resources), but it should apply to importlib.resources as well.
Maybe something like this (completely untested, consider this is pseudo-code):
import tomllib
from importlib.metadata import entry_points
from importlib.resources import files

for ep in entry_points(group="my_app.plugins"):
    mod, _, filename = ep.value.partition(":")
    # Note: files(mod) imports `mod` to locate its resources
    data = tomllib.loads(files(mod).joinpath(filename).read_text(encoding="utf-8"))
Yes, but I assumed that reader.open_resource does the right thing. That said, I haven’t actually tested with a zipped package yet, so I may have to adapt.
Right. The point of this is maybe me being a bit pedantic (to be clear, I decided to just go ahead either way; napari, for example, already does this).
My observation was that the following:
[project.entry-points.'myapp.plugins']
a = 'module:submodule/reference.toml'
will cause current importlib_metadata to break when loading any entry-point. I.e. you have to write it as module.submodule:reference.toml to ensure that it looks like a valid Python reference (something you could get with from module.submodule import reference).
Of course that isn’t a problem in practice, it is easy to make sure things look like a typical entry-point (i.e. Python object reference). But it surprised me that things cared about what the value is there (beyond EntryPoint.load() failing, of course, but I’ll never call that).
(Anyway, I opened an importlib_metadata issue, because I think it would be better to only fail on EntryPoint.load() and not fail if there is a single bad entry-point anywhere.)
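To illustrate the difference between the two spellings above, here is a rough sketch of a check for whether a value "looks like" an object reference. The regular expression is an approximation written for this example (module path, optional :attribute path, optional [extras]); it is an assumption, not necessarily the exact pattern importlib_metadata or validate-pyproject applies.

```python
import re

# Hypothetical approximation of the "module.path:object.attr [extras]" syntax.
OBJECT_REFERENCE = re.compile(
    r"(?P<module>[\w.]+)\s*"
    r"(:\s*(?P<attr>[\w.]+)\s*)?"
    r"((?P<extras>\[.*\])\s*)?$"
)


def looks_like_object_reference(value):
    """Return True if `value` has the shape of a Python object reference."""
    return OBJECT_REFERENCE.match(value) is not None


for candidate in (
    "module:submodule/reference.toml",   # contains "/", so it does not match
    "module.submodule:reference.toml",   # reads like an attribute path
):
    print(candidate, looks_like_object_reference(candidate))
```

Under this approximation, reference.toml passes only because it happens to read like the attribute path reference.toml; nothing checks that such an object exists until EntryPoint.load() is called.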
For some reason, the other day I missed the information stating that the value of an entry point must be an “object reference”, in other words something that can be imported (and called?). So that is what I had based my posts on. I thought value could be any string. Today, I can see that I was wrong, it is indeed expected that value be a string that resolves to an “object reference”. My bad.
Thanks @sinoroc. That is roughly what I am using now. I should have maybe explained why I didn’t initially (or don’t quite):
The reason is that if you load the spec for module.submodule, module will get imported, and basically the whole point of using a toml was to avoid such a (potentially expensive) import :).
So I write module.submodule:reference.toml but then actually load submodule/reference.toml from module. I suppose there might be weird things where that isn’t ideal, but for now I assume it’s fine.
It just rubbed me slightly wrong that importlib_metadata was violently opposed to a value that isn’t a valid Python reference, and that made me wonder if there may be a bigger reason not to use this pattern.