Is this even valid? Can you actually have a subpackage be a namespace package if the parent isn’t? I always thought this doesn’t work, and my quick testing supports this.
Same for the example by @sirosen. IMO Import-Name for a folder should imply that this package has an __init__.py file in that folder - which means it’s not a namespace package and will conflict with others. So that example should be
I think the example I gave is consistent with that reading. Or at least, it was meant to be – maybe I’ve missed something.
Brett was asking if we should require that these be enumerated explicitly or if we can allow some of the namespace structure to be implied. The example was intentionally a bit ambiguous – is spam.bacon a namespace? – to show why I think it’s best to be explicit.
Since a package is created/written once but read many times, I’m suggesting a bias towards whatever is best for the readers.
You could probably do it with import hooks. It wouldn’t technically be a namespace package in that situation, but it could behave the same.
IMO, Import-Name (and Import-Namespace if we decide it’s needed) should be defined in abstract terms, not in terms of implementation details like “namespace package” or “__init__.py”, as otherwise we’ll get into a real mess when import hooks get involved…
As far as I can see, both Import-Name and Import-Namespace should mean “this name will be importable if you install this package”. The only difference would be that it’s (probably) wrong to install two packages that both include the same name, unless both of them specify the name using Import-Namespace.
Is that a significant enough distinction to warrant two metadata fields? I’m honestly not sure. I’d be inclined to just specify the actual modules that are provided, using Import-Name, and leave it at that. So personally, I wouldn’t bother, but maybe the plugin scenario is important enough to justify it?
it makes it clear which names should be listed (at least with the rule that it should be exhaustive). Otherwise the confusion is always going to be “should Namespace packages be listed? What about this weird edge case?” By describing it with expected behavior for conflicts, I think it should be pretty clear.
pre-install conflict checkers have slightly better chances.
In the contributing project, just the submodule they contribute (so there is no change there):
Import-Name: spam.bacon.eggs
Contributing projects do not own either spam or spam.bacon, so they don’t have anything to say about them.
In the defining project, both the non-namespace base folder and the carve out for the namespace package would be mentioned:
Import-Name: spam
Import-Namespace: spam.bacon
The defining project may also declare its own contributions to the namespace (if it has any).
A package publishing spam.eggs or spam.bacon (whether as namespaces or as self-contained import names) would still be flagged as a conflict, but spam.bacon.eggs would be fine (since spam.bacon has been declared as a namespace).
It’s possible that a namespace may only have contributing projects, with no defining project declaring ownership of the namespace. That’s fine, since some namespaces are designed to work that way.
Listing something as both a name and a namespace would be permitted, indicating a namespace created by extending its own __path__ attribute in __init__.py instead of by omitting __init__.py entirely. However, it wouldn’t be required due to the way legacy namespace packages work (multiple projects all publishing identical __init__.py files, without that duplication indicating a semantic conflict). Instead it would be for cases where the package isn’t just a namespace package.
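For readers who haven't seen the legacy style in action, here is a minimal sketch of two "projects" that each ship an identical __init__.py extending __path__ via pkgutil.extend_path. The package name demo_legacy_ns, the helper make_portion, and the temporary directories standing in for separate install locations are all made up for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# The identical __init__.py that every contributing project would ship.
INIT_SOURCE = (
    "from pkgutil import extend_path\n"
    "__path__ = extend_path(__path__, __name__)\n"
)


def make_portion(root: Path, package: str, module: str, value: str) -> None:
    """Write <root>/<package>/__init__.py plus one submodule defining NAME."""
    pkg_dir = root / package
    pkg_dir.mkdir(parents=True)
    (pkg_dir / "__init__.py").write_text(INIT_SOURCE)
    (pkg_dir / f"{module}.py").write_text(f"NAME = {value!r}\n")


def demo() -> tuple:
    package = "demo_legacy_ns"  # hypothetical top-level name
    with tempfile.TemporaryDirectory() as first, \
            tempfile.TemporaryDirectory() as second:
        make_portion(Path(first), package, "bacon", "from first project")
        make_portion(Path(second), package, "eggs", "from second project")
        sys.path[:0] = [first, second]
        importlib.invalidate_caches()
        try:
            # Only the first __init__.py runs, but it pulls in both portions.
            bacon = importlib.import_module(f"{package}.bacon")
            eggs = importlib.import_module(f"{package}.eggs")
            return bacon.NAME, eggs.NAME
        finally:
            # Undo the sys.path and sys.modules changes made by the demo.
            sys.path.remove(first)
            sys.path.remove(second)
            for name in [n for n in sys.modules
                         if n == package or n.startswith(package + ".")]:
                del sys.modules[name]


if __name__ == "__main__":
    print(demo())
```

Both submodules import even though two directories each contain demo_legacy_ns/__init__.py, which is why a duplicate __init__.py in RECORD isn't necessarily a semantic conflict.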
I would assume a tool like uv would say there’s a conflict at installation since you would be trampling another project’s __init__.py file to make that work. So you could list it that way and I wouldn’t expect the PEP to stop you, but my expectation is tools consuming this information might try to stop you or get confused.
That’s a good point that a build back-end may be able to help validate things if all levels of a dotted name are somehow accounted for.
I could also see a build back-end very easily offering to fill in Import-Namespace based on Import-Name for someone who doesn’t want to be so thorough as to list out the namespaces manually for that extra metadata check.
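As a rough sketch of what that back-end convenience might look like, the function below derives candidate Import-Namespace entries from the Import-Name list alone. It assumes that any parent of a listed import name which isn't itself listed is a namespace; a real back-end would presumably also inspect the source tree. The function name implied_namespaces is hypothetical.

```python
def implied_namespaces(import_names):
    """Return parent names that are not themselves declared import names.

    Assumption: an undeclared parent of a declared import name is a
    namespace that the project contributes to.
    """
    declared = set(import_names)
    implied = set()
    for name in declared:
        parts = name.split(".")
        # Every strict parent of the dotted name: "a", "a.b", ...
        for depth in range(1, len(parts)):
            parent = ".".join(parts[:depth])
            if parent not in declared:
                implied.add(parent)
    return sorted(implied)


if __name__ == "__main__":
    print(implied_namespaces(["spam.bacon.eggs"]))
    print(implied_namespaces(["spam", "spam.bacon.eggs"]))
```

With this rule every level of a dotted name ends up accounted for in one field or the other, which is the property a validating back-end would want to check.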
I don’t think I like this concept of ownership. There’s no real way to define ownership with a namespace to this level of formality. What happens if two projects both claim ownership? Is PyPI supposed to do something? This proposal doesn’t get into any index-level details and “ownership” in this regard seems to imply some special meaning to an index.
I think I like @sirosen’s interpretation of why the distinction would be useful.
I don’t think we should limit based on how you technically create the namespace. It really should come down to this: if you install other packages with matching package names, they too need to be a namespace somehow and must not block the others’ existence (i.e. prevent their submodules and subpackages from being installed).
If you’d prefer to avoid “ownership” due to the potential implications for index servers disallowing conflicts (which I’m not suggesting, since allowing conflicts at the index server level is a key benefit of decoupling distribution names from import names), then we could use wording along the lines of “expected conflicts” (this is slightly different from my original suggestion, since the presumption of authority associated with the Import-Namespace declaration is much weaker - it’s merely a statement that the package doesn’t expect to be the sole contributor at installation time, rather than it claiming to be the defining authority for that namespace):
Import-Name: when an import name is listed under Import-Name a package is indicating that it expects to be the sole contributor to that import location when installed. Another package declaring Import-Name or Import-Namespace entries that start with the declared name indicates a likely installation time conflict (unless the packages are being installed into different sys.path locations, in which case one may shadow the other rather than conflicting outright). Import-Namespace declarations in the same package may exclude nested import locations from this expectation.
Import-Namespace: when an import name is listed under Import-Namespace a package is indicating that it does NOT expect to be the sole contributor to that import location when installed. This metadata only needs to be provided explicitly when an Import-Name declaration in the package would otherwise indicate conflicts with other packages contributing to the namespace. Import-Name declarations in the same package may exclude nested import locations from this expectation.
For example, a package may declare pkg.contrib as an import namespace while declaring pkg and pkg.contrib.firstparty as import names, or even declare pkg as both an import name and an import namespace if pkg.__init__ extends pkg.__path__ in addition to defining other pkg level APIs.
It appears we genuinely disagree on that point, as what constitutes an installation time conflict varies based on the status and behaviour of __init__.py:
missing entirely: no package claims Import-Name ownership for the namespace itself, nothing installs an __init__.py, so no problem
__init__.py solely exists to extend __path__ in accordance with the rules for namespace packages: no package claims Import-Name ownership for the namespace itself, so even if checking RECORD suggests there is a conflict on __init__.py, they’re all semantically equivalent, so there’s no conflict in practice
__init__.py does more than just extend __path__: the package defining __init__.py declares the import name with both Import-Name and Import-Namespace, since its __init__.py isn’t semantically equivalent to omitting the file entirely
We could ignore that complexity when the use case was just indicating which imports we expected to work after installation, but it matters if we’re trying to help provide pre-installation notifications of potential conflicts.
That’s the technical reason why it won’t work, which I don’t disagree with. What I’m saying is we don’t need to spell that out in the PEP. If installing your project will potentially mess up anyone else’s project that was already installed, then you can’t say you’re a namespace package; that includes writing an __init__.py file, since you can’t control what other projects have in their __init__.py. But if the mechanism for namespace packages ever changes, I don’t want us to have restricted ourselves needlessly by over-describing what is or is not a namespace package, i.e. by including that part of the language spec in the PEP itself.
Hmm, I’m not conveying my intent very well. Trying from a different direction.
Example cases:
Only Import-Name: pkg: any other package referencing pkg or pkg.* in any way is a declared installation conflict
Only Import-Namespace: pkg: any other package referencing Import-Name: pkg is a conflict. Import-Name: pkg.*, Import-Namespace: pkg, and Import-Namespace: pkg.* are all fine.
Import-Name: pkg and Import-Namespace: pkg: another package referencing Import-Name: pkg or Import-Namespace: pkg are both conflicts. Import-Name: pkg.* and Import-Namespace: pkg.* are both fine.
Import-Name: pkg and Import-Namespace: pkg.ns: Import-Name: pkg.ns.*, Import-Namespace: pkg.ns, and Import-Namespace: pkg.ns.* are all fine. Import-Name: pkg and Import-Namespace: pkg, or references to submodules other than the declared namespace are all conflicts.
Import-Name: pkg and Import-Namespace: pkg.ns and Import-Name: pkg.ns.firstparty: Import-Name: pkg.ns.*, Import-Namespace: pkg.ns, and Import-Namespace: pkg.ns.* are fine except where pkg.ns.firstparty is concerned. Import-Name: pkg, Import-Namespace: pkg, or references to any submodules other than the declared namespace are all conflicts, as are any references to the declared first party namespace submodule.
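To make the example cases concrete, here is one possible reading of them as a pre-install conflict check. This is only a sketch: Declared, entry_conflicts and find_conflicts are hypothetical names, not anything from the PEP or an existing tool. One detail is my interpretation rather than something the cases spell out: another package's Import-Name on exactly the declared namespace (e.g. Import-Name: pkg.ns in the fourth case) is treated as a conflict, by analogy with the second case.

```python
from typing import NamedTuple


class Declared(NamedTuple):
    """Import metadata for one project (hypothetical container)."""
    names: frozenset = frozenset()       # Import-Name entries
    namespaces: frozenset = frozenset()  # Import-Namespace entries


def entry_conflicts(kind: str, dotted: str, other: Declared) -> bool:
    """True if a (kind, dotted) entry clashes with `other`'s declarations.

    `kind` is "name" or "namespace".
    """
    parts = dotted.split(".")
    # Walk from the entry itself up through its parents, nearest first.
    for depth in range(len(parts), 0, -1):
        prefix = ".".join(parts[:depth])
        if prefix not in other.names and prefix not in other.namespaces:
            continue
        if prefix == dotted:
            # Same location: only two pure namespace claims can coexist.
            return kind == "name" or dotted in other.names
        # Nearest declared parent: it must be a namespace to admit others.
        return prefix not in other.namespaces
    return False  # `other` says nothing about this location


def find_conflicts(a: Declared, b: Declared) -> set:
    """Check both directions and return the dotted names that clash."""
    clashes = set()
    for mine, theirs in ((a, b), (b, a)):
        for dotted in mine.names:
            if entry_conflicts("name", dotted, theirs):
                clashes.add(dotted)
        for dotted in mine.namespaces:
            if entry_conflicts("namespace", dotted, theirs):
                clashes.add(dotted)
    return clashes


if __name__ == "__main__":
    defining = Declared(
        names=frozenset({"pkg", "pkg.ns.firstparty"}),
        namespaces=frozenset({"pkg.ns"}),
    )
    plugin = Declared(names=frozenset({"pkg.ns.plugin"}))
    rival = Declared(names=frozenset({"pkg.other"}))
    print(find_conflicts(defining, plugin))  # no clash expected
    print(find_conflicts(defining, rival))   # pkg.other clashes with pkg
```

The check only looks at the nearest declared ancestor of each entry, which is what lets a nested Import-Namespace carve a hole out of an Import-Name, and a nested Import-Name carve one back out of the namespace.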
The only relevance of the specific implementation details is to illustrate why the early conflict detection use case needs this level of semantic expressiveness in the metadata when the “what is available for import?” use cases don’t: without it, the metadata won’t be accurate enough to build robust conflict reporting mechanisms.
Edit: the implementation details also reveal that the “just check RECORD” approach may generate false alarms for legacy namespace packages, adding weight to the proposal in general.
Reviewing the details, while I appreciate the following attempted simplification, I’m not sure we can actually get away with it:
Tools SHOULD raise an error when an entry in Import-Name is higher than Import-Namespace in the same project, e.g. project.import-names = ["spam"] and project.import-namespaces = ["spam.bacon"]. This is because if a project exclusively owns a higher import name then that would mean it is impossible for another project to install with the same import name found in Import-Name in order to contribute to the namespace listed in Import-Namespace.
The problem is that this restriction isn’t part of the rules for namespace packages in general; it’s only part of the rules for implicit namespace packages. A submodule doing its own __path__ expansion is free to add portions from folders that aren’t included in the parent package’s __path__ (for example, pkgutil.extend_path in a subpackage checks every sys.path entry for matching portions, it doesn’t restrict itself to those portions that correspond to __path__ entries in the parent package).
While we could declare that approach unsupported in the metadata format, I personally consider it a valid way to define a plugin or contribution namespace.
Handling that bit of complexity was the reason I suggested keeping the ability for packages to say “I have no opinion on the state of the parent modules” when contributing a nested module name.
The updated field definitions otherwise looked good to me, at least conceptually.
I believe there are other parts of the text affected by the “hybrid namespace package” problem, though. This was just the paragraph that would most clearly have to be updated further to handle that case.
I assumed that, I just didn’t want to do another pass just to discover more feedback needed to be taken into consideration.
I’ll do a pass to loosen the PEP to say you need to account for all intermediate names, but there are no hierarchy requirements saying you have to go from namespace names to import names.
Since it’s European vacation time, I’m going to leave this discussion open until the end of August (instead of next week), at which point I’m submitting the PEP for proposal regardless of where the discussion sits (PEP 751 taught me that strict timelines are worth it for my mental health).
I think your last updates fixed the remaining issue I was concerned about. (Since we’re not actively encouraging pkgutil & pkg_resources style namespace packages, just tolerating their existence, there’s no need to go into detail on how they work in practice)
I think the PEP is a nice step forward (I wish it were possible to fully automate the mapping of projects to import names though).
I have a question about the adoption of the PEP by existing projects. The PEP says that the pyproject.toml specification gains project.import-names and project.import-namespaces (by the way, the PEP uses project-names and project-namespaces at one point) and that tools can support calculating these dynamically. I am guessing they are optional new keys and the recommendation would be for tools to produce empty entries in the metadata when the keys are left out? Regarding empty entries, the PEP says:
Projects MAY leave Import-Name and Import-Namespace empty. In that instance, tools SHOULD assume that the normalized project name when converted to an import name would be an entry in Import-Name
When build tools start supporting the new metadata format but projects have not updated pyproject.toml to include the new keys, wheels with empty entries will start being produced for which the import names might not match the project name (unless tools require the new keys but that would be more disruptive). Should tools really assume the project name and import name match in this case?
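To illustrate the fallback in question, here is a small sketch of how a tool might compute the assumed import name. The normalization step follows the standard project name normalization rule (runs of "-", "_" and "." collapse to a single "-", lowercased). The conversion to an import name by swapping "-" for "_" is my assumption about what "converted to an import name" means, and assumed_import_name is a hypothetical helper.

```python
import re


def normalize_project_name(name: str) -> str:
    """Standard project name normalization used by packaging tools."""
    return re.sub(r"[-_.]+", "-", name).lower()


def assumed_import_name(project_name: str) -> str:
    """Guess the import name when no import metadata is set.

    Assumption: the conversion is just "-" replaced by "_".
    """
    return normalize_project_name(project_name).replace("-", "_")


if __name__ == "__main__":
    for project in ("requests", "scikit-learn", "Pillow"):
        print(project, "->", assumed_import_name(project))
```

For scikit-learn this yields scikit_learn while the real import name is sklearn, and for Pillow it yields pillow while the real import name is PIL, which are exactly the kinds of mismatch being asked about.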
I should change that wording to say “not set”; I don’t think empty keys make sense and I don’t think we do that for anything else.
There’s a core metadata version bump, so that will only happen if someone did a release with the new metadata version and still didn’t set the new fields. So yes, tools should still assume that as it only happens at a transition point where people had an opportunity to set the metadata and chose not to.