Place for a direct_url.json parser

As part of PEP 660 and PEP 610, a new file called direct_url.json is created in the distribution's .dist-info directory. There are use cases (see importlib-metadata#404) where users want to get information out of this file. I created a draft PR, where @jaraco asked whether importlib-metadata is the right place for this. Another possible place is packaging, or maybe a completely new library?

So I'm opening this topic to answer:

  1. Does a parser make sense, and would it help users?
  2. Where is the right place for it?

What more is needed here beyond parsing the JSON contents of the file and taking data out of it?

Nothing else, I guess. Of course it’s possible to code it manually, but since it is a PEP-defined format, an “official parser” would be nice. In addition, the location of the file can depend on the system, which is another argument for having a library handle it.

If this gets implemented, please also note PEP 710, which is in a draft state as of today.

A nice API (a simple function) that immediately returns this information for a package would be welcome. It would remove the need to repeat the *.dist-info path construction and the related checks, and it would transparently wrap what the PEPs define for people who are not familiar with them.
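To make the idea of "a simple function" concrete, here is a minimal sketch built only on the standard library. The name read_direct_url is hypothetical and not part of importlib.metadata or packaging; it just wraps the lookup and the missing-file check.

```python
import json
from importlib.metadata import PackageNotFoundError, distribution


def read_direct_url(dist_name):
    """Return the parsed direct_url.json of an installed distribution.

    Hypothetical helper (not an existing API). Returns None if the
    distribution is not installed or has no direct_url.json.
    """
    try:
        dist = distribution(dist_name)
    except PackageNotFoundError:
        return None
    # read_text() returns None when the file does not exist.
    content = dist.read_text("direct_url.json")
    if content is None:
        return None
    return json.loads(content)
```

Something like read_direct_url("mitosis") would then return a plain dict, or None for distributions that were not installed from a direct URL.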


Indeed, most of what this “parser” does is make attributes out of fields and classes out of objects, providing a higher-level, more natural representation of the data. It’s similar to what packaging does, but with even less sophistication (behavior).
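As a rough illustration of "attributes out of fields", a sketch along these lines would cover the top-level keys described in PEP 610. The class name DirectUrl and its methods are assumptions for illustration only, not the API of the draft PR or of packaging, and no validation is attempted.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DirectUrl:
    """Hypothetical, unvalidated view of a direct_url.json document."""

    url: str
    vcs_info: Optional[dict] = None
    archive_info: Optional[dict] = None
    dir_info: Optional[dict] = None
    subdirectory: Optional[str] = None

    @classmethod
    def from_json(cls, text):
        # "url" is the only key assumed to be always present here.
        data = json.loads(text)
        return cls(
            url=data["url"],
            vcs_info=data.get("vcs_info"),
            archive_info=data.get("archive_info"),
            dir_info=data.get("dir_info"),
            subdirectory=data.get("subdirectory"),
        )

    @property
    def is_editable(self):
        # "editable" lives under dir_info and is treated as false when absent.
        return bool((self.dir_info or {}).get("editable", False))
```

A real implementation would presumably also check that exactly one of vcs_info, archive_info, and dir_info is present.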

My main reluctance to include it in importlib metadata is because we chose to keep packaging out of the stdlib. That is, users of importlib metadata don’t get any packaging features unless they install it and ferry the metadata to it.

It would be inconsistent to then say that PEP 610 metadata gets a different treatment and gets first-class treatment from the stdlib.

Since there doesn’t seem to be any objection, I’ll plan to incorporate something into importlib_metadata (and thus Python 3.13), so please speak up now if you’re concerned.


Let’s put this in packaging – having the JSON serialisation, validation, as well as parsing it into a Python data structure there seems reasonable since this is a standards-backed data structure.

I’ve filed pypa/packaging issue #701, “Add data structure, parsing and serialisation for `direct_url.json`”, for this.


Hey all, this would be great to have. For myself, I’m looking to pin dependencies of my experiments beyond importlib.metadata.version(pkg_name). I’d like to be able to identify the packages installed editably, and check using GitPython if git.Repo(pkg_toplevel).is_dirty().

Currently I’m assuming all relevant packages have a site-packages/pkgname...dist-info directory, in which I can then look for direct_url.json. But I don’t really know all the ways packages can be installed. Still, this is slightly better than the SO question about this, whose self-answered hack is to look for egg-link.
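For the editable-install use case described above, one possible sketch takes an already-parsed direct_url.json dict and returns the local source directory when the install is editable. The function name editable_source_dir is hypothetical; it assumes the dir_info/editable layout from PEP 610 and a file:// URL.

```python
from urllib.parse import urlparse
from urllib.request import url2pathname


def editable_source_dir(direct_url_data):
    """Return the local source path for an editable install, else None.

    Hypothetical helper; expects the dict obtained by parsing direct_url.json.
    """
    dir_info = direct_url_data.get("dir_info")
    # Only local-directory installs flagged as editable qualify.
    if not dir_info or not dir_info.get("editable", False):
        return None
    parsed = urlparse(direct_url_data["url"])
    if parsed.scheme != "file":
        return None
    # Convert the URL path component into a platform-specific filesystem path.
    return url2pathname(parsed.path)
```

The returned path could then be handed to a check such as git.Repo(...).is_dirty(), as mentioned above.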

You can get all installed packages using importlib.metadata and read their direct_url.json file from that. The following (untested) should work.

import json
from importlib.metadata import distributions

# Walk every distribution discoverable on sys.path.
for dist in distributions():
    # read_text() returns None if the distribution has no such file.
    content = dist.read_text("direct_url.json")
    if content is not None:
        direct_url_data = json.loads(content)
        print(f"{dist.name} - url is {direct_url_data['url']}")

I found that and tried it first, but content is None for my editable install. For my package mitosis, I can verify that

<path_to_env>/lib/python3.10/site-packages/mitosis-<version>.dist-info

exists and has direct_url.json with the required data.

EDIT: dist.read_text is reading files from the egg-info in my editable location, rather than in dist-info in site-packages. The former doesn’t have direct_url.json. Package was installed with pip install -e ., pip version 23.3.2, build-backend setuptools >=69.0

You should report that issue as a bug with the setuptools editable install support. I don’t know what the response would be - personally, I consider editable installs to be a “reasonable endeavours” situation, and getting this to work may well go beyond the point I’d consider reasonable. But I know others have higher expectations, so it’s worth a try.


Belay my last. I didn’t follow the code as written, instead using importlib.metadata.distribution("mypkg"), knowing that “mypkg” was editably installed. I was operating from the directory that was editably installed, so distributions() actually found two distributions with the same name: one in the current directory, and one in site-packages that points to the files in the current directory. distribution("mypkg") was returning the first one.
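One way to sidestep this pitfall, sketched under the assumption that checking every match is acceptable, is to pass name= to distributions() and keep only the matches that actually carry a direct_url.json. The function name find_direct_urls is hypothetical.

```python
import json
from importlib.metadata import distributions


def find_direct_urls(dist_name):
    """Collect direct_url.json data from every distribution with this name.

    Hypothetical helper: unlike distribution(), which returns only the first
    match on sys.path, this inspects all matches.
    """
    results = []
    for dist in distributions(name=dist_name):
        content = dist.read_text("direct_url.json")
        # Skip matches (e.g. an egg-info in the working directory)
        # that do not ship the file.
        if content is not None:
            results.append(json.loads(content))
    return results
```

With the setup described above, this should skip the egg-info found in the current directory and return the data from the dist-info in site-packages.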