I would like to extend the direct_url.json spec, clarifying that the url field must contain a URL: pypa/packaging.python.org#1506. The motivation is that we have encountered url entries that are not URLs (astral-sh/uv#1744).
Relatedly, would it make sense to write a JSON schema for direct_url.json? A machine-readable spec would allow tooling to generate types and would simplify verification.
@pradyunsg was BDFL-delegate on that PEP, but I can confirm that this seems like a reasonable change and I’d be happy for it to be done as a text-only change to the spec. I don’t think we need a PEP for something like this.
The linked issue does demonstrate that this is happening “in the wild”, though, so I think the spec probably needs to take that into account, and say something like “producers of this file must use a valid URL, but consumers must be prepared to encounter arbitrary text in the field”. We should also confirm that poetry are happy with the spec change, and are willing to update their code.
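To make the "strict producers, lenient consumers" idea concrete, here is a rough sketch of how a consumer might read the field without assuming it is well-formed. The function name `read_url_leniently` and the heuristic for "looks like a URL" are my own assumptions for illustration, not anything the spec defines.

```python
import json
from urllib.parse import urlsplit


def read_url_leniently(direct_url_text: str) -> tuple[str, bool]:
    """Return (raw url field, whether it looks like a URL).

    Hypothetical helper: the check below is a heuristic, not the
    spec's definition of a valid URL.
    """
    data = json.loads(direct_url_text)
    raw = data.get("url") if isinstance(data, dict) else None
    if not isinstance(raw, str):
        # Missing or non-string field: report it rather than crash.
        return ("", False)
    try:
        parts = urlsplit(raw)
    except ValueError:
        # urlsplit rejects some malformed inputs (e.g. bad IPv6 brackets).
        return (raw, False)
    # Require a scheme longer than one character so that a Windows
    # drive letter such as "C:" is not mistaken for a URL scheme.
    has_scheme = len(parts.scheme) > 1
    looks_like_url = has_scheme and bool(parts.netloc or parts.path)
    return (raw, looks_like_url)
```

A consumer could then fall back to treating the raw text as opaque (or as a local path) when the second element is false, instead of failing outright.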
My hope was that this would get fixed in poetry (python-poetry/poetry#8999) so that in the future consumers can assume that the field indeed contains a URL.
That’s reasonable, all I was trying to say was that “be lenient in what you consume” applies here. People can always have non-compliant data (if only by manually editing it), but in this case the change of spec could be the reason, so let’s acknowledge that. I don’t want to make it a big deal, though, so sorry if it seems like I am.
Could we include that schema in the Direct URL Data Structure page of the Python Packaging User Guide? At least for me it’s easier to clear up ambiguities with a schema, and it allows everyone who has to interact with these files to generate typing for their deserialization library of choice.
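For discussion, a schema along these lines is roughly what I have in mind. This is only an illustrative sketch based on my reading of the current spec text, not an agreed or official schema; the name `DIRECT_URL_SCHEMA` and the choice of JSON Schema draft are assumptions.

```python
import json

# Illustrative sketch only: field names follow the published spec,
# but the constraints chosen here are assumptions open to review.
DIRECT_URL_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2019-09/schema",
    "title": "direct_url.json (illustrative sketch)",
    "type": "object",
    "required": ["url"],
    "properties": {
        # "format": "uri" expresses the proposed clarification.
        "url": {"type": "string", "format": "uri"},
        "subdirectory": {"type": "string"},
        "vcs_info": {
            "type": "object",
            "required": ["vcs", "commit_id"],
            "properties": {
                "vcs": {"type": "string"},
                "requested_revision": {"type": "string"},
                "commit_id": {"type": "string"},
            },
        },
        "archive_info": {
            "type": "object",
            "properties": {
                "hash": {"type": "string"},
                "hashes": {
                    "type": "object",
                    "additionalProperties": {"type": "string"},
                },
            },
        },
        "dir_info": {
            "type": "object",
            "properties": {"editable": {"type": "boolean"}},
        },
    },
    # Exactly one of the three info keys must be present.
    "oneOf": [
        {"required": ["vcs_info"]},
        {"required": ["archive_info"]},
        {"required": ["dir_info"]},
    ],
}

# Serialise to the JSON text that would be published alongside the spec.
SCHEMA_JSON = json.dumps(DIRECT_URL_SCHEMA, indent=2)
```

Any JSON Schema validator could check files against the serialised form, and type generators could consume the same document. Note that many validators treat "format" as an annotation unless format checking is enabled explicitly.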