The word “owns” is used throughout, but I’d much prefer “provides”. Unlike the PyPI namespace proposal, there’s no exclusivity here - just because I specify a name in my metadata doesn’t mean nobody else can also specify it. It does mean they probably shouldn’t be installed simultaneously, which I see you’ve captured in the text, but that can still be there without implying exclusive “ownership” of the name.
I think the specification is too strict. “After installing the distribution, the specified names will be importable” is basically sufficient, and every detail you go into about how it might be imported is just another check someone is going to have to write and/or complain about. They’ll probably write it anyway, but I don’t think we gain anything from spelling it out here.
The specification currently prevents setuptools from specifying distutils with this metadata, because they use a different mechanism. Now, personally I don’t like their mechanism, so the last thing I want to see is a PEP saying that it’s a valid approach. But I don’t think it should be excluded, and the easiest way to not exclude them is to simply remove most of the specification from the PEP.
Probably most of what’s currently in specification could be “Guidance” or “Recommendation” or even just examples?
Similarly, “how does this interact with “Dynamic”” is a good discussion point, but I don’t think it warrants more specification than just “Dynamic operates as usual”, followed up by “but if you use it here’s some examples of the mess you could create; we don’t think consumers should have to deal with this and so suggest that they should probably just ignore you”. Framing it as a discussion point lets us explicitly say that we don’t think it’s valuable or helpful for this to be dynamic, which we can’t really say in a spec, but it ultimately means we don’t have to increase the burden of implementing or consuming the metadata by giving it its own set of custom rules.
Build back-ends MAY support dynamically calculating the value on the user’s behalf if desired, if the user declares the key to be dynamic.
Does this “dynamic” literally mean “dynamic”, or some other declaration? Because this is the exact kind of over-specification that’s going to get into complications (if the backend calculates it, is it now static? Is the backend allowed to change the value of dynamic when it becomes Core Metadata Dynamic (according to at least one spec: no)? So even if it could be static from sdist->wheel, Core Metadata will always say it’s not?).
This complication isn’t due to this PEP, it’s because the existing ones overspecified behaviours, but the first part (“Build back-ends MAY support …”) is where this one could get into trouble in the future. Best to drop this entire line and let it be covered by “dynamic behaves normally”, so that if we ever figure out how to do inferred metadata without it having to also be dynamic, then it’ll become possible to do this. But as it stands, I don’t think this suggestion is ever going to be simultaneously “legal” and useful.
Looks good to me. I can understand @steve.dower’s concern over the term “owns”, but I think that it captures the idea of not including (shared) higher level implicit namespace packages well. I’d be OK with simply clarifying that “owns” does not prevent other projects from using the same import name.
I think the details in the PEP are reasonable, because they directly address a point made in the pre-PEP discussion, that this is project level metadata rather than distribution file level. The wording’s a bit clumsy - maybe it would be sufficient to say “The Import-Name core metadata cannot be marked as Dynamic, because it is required to be consistent regardless of which distribution file is used to install the project”.
As far as pyproject.toml is concerned, this again is addressing a specific point in the pre-PEP discussions (why not just let the backend calculate the value) so I’m happy with leaving that in, although maybe I’d word that a little more clearly:
Build backends MAY support dynamically calculating the value of Import-Name on behalf of the developer. If the developer wants that to happen, then the import-names key must be marked as dynamic in pyproject.toml.
To be clear, I’m happy with the intention behind them, but I don’t think the specification is going to work (and in 12-18 months time we’ll be having more debates about them).
Expressing the intention and giving some guidance to consumers is far more manageable than creating a specification that exempts/changes the rules around existing specifications.
My proposed alternative would be “While Import-Name core metadata can be marked as Dynamic, having different values for individual artifacts that are part of the same release defeats the purpose and is strongly discouraged. Consumers are justified in ignoring dynamic Import-Names and proceeding as if it were unspecified.”
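A consumer following that guidance might look something like the sketch below. It assumes the proposed `Import-Name` field appears as a repeatable header in the core metadata and that `Dynamic` lists field names as it does today; the metadata fragment is abbreviated and the function name is hypothetical.

```python
from email import message_from_string

# Abbreviated, hypothetical sdist metadata that marks Import-Name as dynamic.
METADATA = """\
Name: spam
Version: 1.0
Dynamic: Import-Name
Import-Name: spam
"""


def trusted_import_names(metadata_text):
    """Return the Import-Name values, or None if they should be treated as unspecified."""
    msg = message_from_string(metadata_text)
    dynamic = {value.strip().lower() for value in msg.get_all("Dynamic", [])}
    if "import-name" in dynamic:
        # Values may differ between artifacts, so proceed as if nothing was given.
        return None
    # get_all() returns None when the header is absent.
    return msg.get_all("Import-Name")


print(trusted_import_names(METADATA))
```

Returning the same value for "dynamic" and "absent" keeps the consumer from needing any rules specific to this field.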
That seems to tie into your desire to change the wording to “provides” as that suggests you want the spec to be “list what you can import post-installation” and leave it to guidance to be more specific. But I do think it means namespace packages could then just list their top package name and still meet this definition since they do provide that top name as well (e.g. a.b.c is a namespace package, but my reading of what you’re proposing would say a is acceptable to be listed).
I’m personally flexible in either direction: either be very open and provide guidance, or keep it as-is to try and get people to provide slightly more useful information in the face of namespace packages via the spec itself.
How so? I thought they installed a .pth file? How is that not covered by the PEP?
Well, yeah, it would. But if it wouldn’t be of any benefit to the package to do that, then they’re the only ones who lose.
The only real problem here would be a build backend deciding to force only providing the top level name, and not letting the package maintainers override it. I’m pretty sure that would be fixed without needing to appeal to a specification, as it seems pretty user-hostile, but I do think it only takes a single example to clarify the intent. And that’s much safer than trying to find the combination of wording to specify it.
The names provided MUST be one of the following:
* Highest-level, regular packages
* Top-level modules
* The submodules and regular subpackages within implicit namespace packages
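To make those three bullet points concrete, here is one way a tool might derive the names from a list of installed `.py` paths. This is only a sketch under simplifying assumptions: a directory counts as a regular package if it contains `__init__.py` and as an implicit namespace package otherwise, and extension modules and `.pth` tricks are ignored. The function name and the file layout are hypothetical.

```python
from pathlib import PurePosixPath


def import_names(files):
    """Compute the listed names from relative paths of installed .py files."""
    paths = [PurePosixPath(f) for f in files if f.endswith(".py")]
    # Directories that contain an __init__.py are regular packages.
    regular = {p.parent for p in paths if p.name == "__init__.py"}
    names = set()
    for p in paths:
        parts = p.with_suffix("").parts
        if parts[-1] == "__init__":
            parts = parts[:-1]
        # Walk down from the top, skipping implicit namespace directories,
        # and stop at the first regular package or at the module itself.
        for depth in range(1, len(parts) + 1):
            prefix = PurePosixPath(*parts[:depth])
            is_leaf_module = depth == len(parts) and p.name != "__init__.py"
            if prefix in regular or is_leaf_module:
                names.add(".".join(parts[:depth]))
                break
    return sorted(names)


FILES = [
    "spam/__init__.py",       # highest-level regular package
    "spam/eggs.py",
    "single.py",              # top-level module
    "ns/plugin/__init__.py",  # regular subpackage inside namespace "ns"
    "ns/plugin/impl.py",
    "ns/tool.py",             # submodule inside namespace "ns"
]
print(import_names(FILES))
# → ['ns.plugin', 'ns.tool', 'single', 'spam']
```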
Their .pth file installs an import hook that essentially substitutes import distutils with import setuptools._distutils as distutils, which isn’t covered by any of these three points. It’s not a regular package, it’s not a top-level module, and it’s not an implicit namespace package.
Obviously, I think we’d all fully support them specifying distutils as one of the importable names provided by that package. But someone is sure to complain that they’re breaching the spec (possibly even on their own team, as setuptools seems to try pretty hard to push people in line with packaging specs, for better or worse…), and it’s an argument that can just be avoided by removing text from the PEP.
Ah, you’re interpreting that list differently than I am. For me, that list is the result of the import, while you’re viewing it as how the import works. So it’s covered under the first bullet point to me. But I guess that’s part of your point about wanting a looser spec.
@pf_moore as the one other participant here, do you have a preference over the looser, guidance-relying approach @steve.dower is promoting or a more restrictive one like the PEP has outlined? I honestly can go either way. One benefit to Steve’s is that’s closer to what can be verified as it’s just “is this a valid import name” and you don’t have to check for “is this a namespace package” or anything.
I don’t think there’s any way to trust this more than being project provided guidance. I actually like this language from the current text:
Each entry of Import-Name represents an importable name that the project provides. The names provided MUST be importable via some artifact the project provides for that version
but not the later limitations. If import {name} is expected to work and is provided by the project, listing it should be allowed. .pth file based name choices are included, and it doesn’t require enumerating all current means of modifying import behavior or updating for any future changes to import behavior.
I’m not a fan of vague specs, as they have traditionally ended up being a problem for us. But I agree that we need to be careful not to prohibit valid uses, because that’s also a problem…
As a starting point, “Import-Name must be a valid Python import name” is both accurate and uncontroversial. But without some indication of what the metadata means, it’s also useless. So IMO we need to go further (and unlike @steve.dower, I’m not keen on just leaving it to the judgement of project authors or build backends). We have the statement that tools should consider the metadata “accurate, but not necessarily exhaustive”, but we still need to pin down what “accurate” means.
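That baseline check is cheap to write. A minimal sketch, assuming a "valid import name" means a dotted sequence of identifiers that aren’t hard keywords (the function name is hypothetical):

```python
import keyword


def is_valid_import_name(name):
    """True if every dotted component could appear in an import statement."""
    parts = name.split(".")
    return all(part.isidentifier() and not keyword.iskeyword(part) for part in parts)


for candidate in ["myproj", "a.b.c", "my-proj", "a..b", "class"]:
    print(candidate, is_valid_import_name(candidate))
```

It says nothing about whether the name is actually importable after installation, which is the part still needing a definition.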
At the simplest level, “if you install the project, names in Import-Name must be importable” covers the majority of cases. And so I think we should say that first, and then cover the exceptions / qualifiers.
I don’t think we need to explicitly distinguish package or module. That’s an implementation detail of the project (in just the same way as any other implementation method, such as setuptools’ redirection mechanism for distutils, is). But I do see an advantage to specifically stating that in the case of a project that exposes a name in a namespace package, the metadata should specify the name provided by the project, not the (top level) name of the containing namespace. I don’t think we should leave that to the project author to decide, as it’s important that we have consistency.
The other place where I think we need to be explicit ties into the whole paragraph talking about “importable via some artifact”, which I called out in my previous post. In almost all cases, I think “names specified in Import-Name must be importable when the project is installed” is far clearer and still perfectly accurate.
The problem case is where different wheels provide different import names. For example, if a project always provides myproj, but the Windows wheel also provides myproj_win. In that case, should myproj_win be in Import-Name? I’d argue that including it is not “accurate” (a Linux user would certainly think that), whereas not including it is OK under the “not necessarily exhaustive” qualifier. So my initial answer is that we should say don’t include names that are only importable from certain wheels. But my mind could be changed if there are real-world examples where this would be problematic (we’d need to adjust what we meant by “accurate” in that case, though…).
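The two candidate answers amount to taking either the intersection or the union of what each wheel exposes. A small sketch of the difference, using the myproj example with made-up wheel filenames and hypothetical function names:

```python
# Hypothetical per-wheel import names for one release of myproj.
WHEEL_NAMES = {
    "myproj-1.0-cp313-cp313-win_amd64.whl": {"myproj", "myproj_win"},
    "myproj-1.0-cp313-cp313-manylinux_2_28_x86_64.whl": {"myproj"},
}


def always_provided(per_wheel):
    """Names importable no matter which wheel gets installed."""
    if not per_wheel:
        return []
    return sorted(set.intersection(*per_wheel.values()))


def provided_somewhere(per_wheel):
    """Names importable from at least one wheel."""
    if not per_wheel:
        return []
    return sorted(set.union(*per_wheel.values()))


print(always_provided(WHEEL_NAMES))     # → ['myproj']
print(provided_somewhere(WHEEL_NAMES))  # → ['myproj', 'myproj_win']
```

The first list is "accurate" for every user; the second is the one that lets a tool map `myproj_win` back to the project.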
Either way, I think this is something where we should be prescriptive. The situation is rare, but both producers and consumers need to know the intended behaviour if the metadata is to be useful. It’s still about what is exposed rather than how it’s exposed, but IMO “provide guidance and let users decide” isn’t sufficient.
Maybe it makes sense to have another new key that explicitly lists the namespace packages the project uses/cares about? Instead of having to infer that information from the list of names?
FTR, “the package that specifies a certain name in this metadata should be the one that provides the actual import” is a fine requirement from my perspective.
Though I can easily construct totally valid cases where spam-core might provide the spam module, but you’d prefer people install the spam package.[1] I don’t know of any specific examples of this that currently exist, but it’s a design that’s come up before.
Well, if the Linux user is looking at code that does import myproj_win and is trying to figure out where it came from, they might still find the metadata very useful. Do we agree on whether this metadata is for people who already know the import name? Or is it for people who only know the package name and want to find out the names they can import?
My entire argument is “leave it up to the project to decide, and if they choose something dumb then their package will look dumb”. You say we should be prescriptive, but I don’t see how that achieves anything other than excluding cases.
[1] Which is likely a metadata-only package with a dependency on spam-core.
So playing a bit of Devil’s Advocate here, what does it mean if a namespace package listed the top-level name? It’s still accurate that it would be importable, it just so happens that there are other projects out there that can legitimately satisfy that import equally as well. You would have to dig deeper and look for the longest name based on namespace depth to find the best match for the import.
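That "dig deeper" lookup is a longest-prefix match. A sketch, assuming the consumer has already built an index from each listed name to the projects that list it (the index contents and the function name are invented for illustration):

```python
# Hypothetical index: listed import name -> projects that list it.
INDEX = {
    "a": {"a-core", "a-extras"},
    "a.b.c": {"a-b-c"},
}


def best_matches(import_name, index):
    """Return (matched_name, projects) for the deepest listed prefix of import_name."""
    parts = import_name.split(".")
    for depth in range(len(parts), 0, -1):
        candidate = ".".join(parts[:depth])
        if candidate in index:
            return candidate, sorted(index[candidate])
    return None, []


print(best_matches("a.b.c.d", INDEX))  # → ('a.b.c', ['a-b-c'])
print(best_matches("a.x", INDEX))      # → ('a', ['a-core', 'a-extras'])
```

When projects list only the top-level namespace name, every lookup under it degrades to the ambiguous second case.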
I’m fine in general with that clarification, but it does tie into …
So if you reword your suggestion to “names specified in Import-Name must be importable when the project is installed on some platform for the same version of the project” then that lets you list things that might not be installed for you but would be installed for me.
So it’s problematic for my motivating use-case: mapping import names that are not satisfied by what’s already installed. If myproj_win can’t be found in the environment nor via Import-Name, how does one figure out what project(s) provide that module?
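The first half of that use case, finding imports the current environment can’t satisfy, can be sketched with the standard library alone. This assumes only absolute imports matter and only checks top-level names; the function name is hypothetical.

```python
import ast
import importlib.util


def unresolved_imports(source):
    """Return top-level names imported by `source` that can't be found in this environment."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names.add(node.module.split(".")[0])
    # find_spec() returns None when no finder can locate the module.
    return sorted(name for name in names if importlib.util.find_spec(name) is None)


print(unresolved_imports("import json\nimport myproj_win\n"))
```

Whatever comes back from this is what would then have to be looked up against Import-Name metadata, and that lookup fails for `myproj_win` if the project only lists names common to every wheel.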
Otherwise it varies per file which isn’t easily done if this is exposed in [project]. And it’s a minor annoyance in that you would have to check every file to get the complete list.
Maybe? I think it depends on where we land on other things and how important we decide this distinction is in the end.
I had you saying this exact thing in my head while I was writing my response to Paul, so at least I’m channeling your view accurately that way.
Surely it makes more sense for Import-Name to specify the union of names in the case of artifact specific modules. In the go-to use case for this PEP of someone typing import myproj_win into their IDE, you’d still want the IDE to recognise the dependency on myproj, even if the import is inaccessible on a platform that the script probably can’t run on anyway.
That’s a good point, and the answer is, I don’t know. It depends on the use cases, and I feel that we’re still rather unclear on what those are (or maybe I’m just not remembering them well enough) other than your specific one.
In that use case, presumably you’re downloading Import-Name metadata for all of PyPI and any other indexes the user might have specified as ones they “normally” use? (As an aside, would you take the metadata for every version of every project, or just the latest version and assume older versions are ignorable?)
I guess the answer is you’d do the same as you would if the user had mistyped the import name (myporj_win), or if the user had asked about an import name from a project where your download had failed (or was out of date). Or if we take Steve’s “guidance” view, the same as you’d do if the project hadn’t considered this use case when they created the metadata. Are we not just back to the point that consumers can’t assume the data is complete?
I absolutely wasn’t suggesting this should be per file. But I think we do have to answer the question “given that realistic cases exist where a project could install different import names on different platforms, what should the project-level import-name metadata say?” If your use case relies on the answer to that being “it must include all import names that are available on any platform the project can build wheels for” then are there any other use cases that would find this a problem?
I’ll also note that regardless of which answer we choose, this will be a problem for build backends which want to generate this data. The build backend almost certainly won’t know, when building a Linux wheel, whether a Windows wheel will expose the same import names…
I’m coming to the conclusion that doing anything else could be a problem. But what worries me is that I can’t convince myself that doing what Steve suggests won’t also be a problem. See my point above - in Steve’s model, projects can just as easily break your use case, and while you could say that such projects would “look dumb”, I don’t think that’s fair. Only publishing the names that are always available feels like a perfectly rational choice to me - just not one that works for you.
I would think the ideal scenario is all versions in case what a project exposed changed across versions, especially if something got removed as that provides a hint as to a version constraint.
Yes, but I had interpreted “don’t include names that are only importable from certain wheels” as never including it purposefully by the spec, while the other is more by choice or oversight.
Correct, and so I wouldn’t say it’s “wrong” as much as “unfortunate for some people, but still a legitimate approach to this metadata”.
Would splitting it into two lists of importable names: “always provides” and “may provide” help without turning this into “this also needs platform/abi tag branching”?
I personally see value in knowing these, but am not sure it’s actually necessary that the per-distribution names are in one place (use RECORD if you already have a specific dist IMO).
Depends on how often you think this will come up. Adding two keys to [project] definitely makes this more complicated. And pragmatically I think this is more of an edge case than a regular thing.
Except this whole reason import-names in [project] is now in the PEP is because people didn’t want to rely on inferring what was available purely based on the file structure.
If you want to keep it one list for simplicity, I can agree with doing so, but I think this rules out other existing language in the PEP due to different platforms possibly having different importable names.
Namely, this language is incompatible with such projects:
Project owners SHOULD NOT filter out names that they consider private. This is because even “private” names can be imported by anyone and can “take up space” in the namespace of the environment.
And it also isn’t compatible with the motivation of checking if names might “take up space” (or otherwise conflict) when there are platform specific importable names.
I could see reasonably making it “names the project provides and supports”, but I don’t see a way to have “expected to be accurate including private details that may differ by platform when provided” and only providing it at the project level.