What information do you gather for interpreters and environments?

I’m thinking about a possible plug-in system for the Python Launcher for Unix, and I realized it generalizes to editors/IDEs in terms of the information one wants to know about interpreters and environments on the user’s machine.

If you were given the following bit of JSON describing an interpreter or environment on a machine, would it provide everything you would want to know?

[ // An array of results.
  {
    // In case multiple finders find the same thing.
    // Could be a path to an environment, executable, etc.
    "key": "/usr/local/bin/python3.10", 
    "python_version": {
      // `sys.version_info`
      "major": 3,
      "minor": 10,
      "micro": 1, // Optional
      "releaselevel": "final", // Optional
      "serial": 0 // Optional
    },
    "implementation": {  // Optional
        // `sys.implementation`
        "name": "cpython",
        "version": "3.10.1"
    },
    "distributor": {  // Optional
        // PEP 514 (https://www.python.org/dev/peps/pep-0514/)
        "name": "Python Software Foundation",
        "url": "https://www.python.org"
    },
    "executable": {
      // An array specifying what is required to execute the interpreter.
      // Expectation is to append args to code to the end of the array before
      // execution.
      // For something like conda environments:
      // `["/path/to/conda", "run", "--path", "<environment>", "--no-capture-output"]`.
      "run": ["/usr/local/bin/python3.10"], 
      // E.g. 32-bit or 64-bit?
      "bitness": 64, // Optional
      // CPU architecture.
      // Can be as generic as "64bit".
      "arch": "x64" // Optional
    },
    "environment": {
      // What type of environment, e.g. "venv", "conda", etc.
      "type": "global",
      // The name of the environment, e.g. the prompt name for a virtual
      // environment.
      "name": ""  // Optional
    },
    // (Launcher-specific)
    // Is this environment/interpreter specific/exact to the situation?
    // E.g. the virtual environment created by Poetry for the folder,
    // activated virtual environment, etc.
    "halt_search": false
  }
]

I realize that if this plug-in idea works for the Python Launcher then editors/IDEs could take advantage of the same mechanism for interpreter and environment discovery. To be clear, this entire idea is speculative, but if it did pan out I wanted to make sure I wasn’t proposing something from the start that had an obvious flaw from an editor/IDE perspective.


I can’t speak for the rest of the team, but from my experience this would be very useful to us with Spyder. We’ve spent considerable time (particularly over the past year or so) writing, testing and refining the environment discovery code and introspection routines to cover the most common cases (conda, system Python and venvwrapper/mkvirtualenv). Having a standardized, generalized system to robustly and cheaply expose that information would make environment detection more comprehensive, capable and performant, especially as we want to expose new UI/UX and features around this in future versions.

I’m assuming Optional items are always provided, but only if they are applicable to the given Python interpreter/environment (e.g. the environment has a name), since several of them, like micro version, name and type, are the key ones we need (in addition to the most crucial item, which is Python executable path).

@ccordoba12 any particular thoughts on this?

I guess the question here is if the notion of bitness is meaningful for the envisioned use cases independent of machine architecture (which is not included)? Would the latter make more sense instead?

I’m still pretty confused about what “specific” is intended to mean here. Could you clarify?

6 posts were split to a new topic: Identifying the editor running a program

For a more general use case, in addition to exporting sys.version_info, it’d probably be good to include a copy of the sort of info in sys.implementation (the implementation name and version). Then you could cleanly distinguish between PyPy and CPython, etc.

You’ll want at least what is provided for in PEP 514, if only for consistency for x-plat distros. Contact/support details for environments are valuable, I believe.

Though we probably want to better define ExecutablePath/executable to be an actual command line that can include arguments and other setup commands, rather than just a file path.

@brettcannon To avoid getting too off track from your original proposal, do you think maybe it would be a good idea to separate the comments on the interpreter determining the execution environment it is running in, versus the execution environment determining the interpreter/environment, into a separate thread? Or do you think it makes sense to keep them together?

I’m loath to go too far into talking about any plans I have for the Launcher as it’s a little off-topic for what I’m asking about here, but my initial answer is no, it would not always be provided (for performance reasons). But with everything that is required, you can query the environment to fill in the (critical) gaps. So taking your example, once you know how to execute code you can run a script to get sys.version_info to fill in any details.
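As a rough illustration of that "run a script to fill in the gaps" step, here is a minimal sketch. It assumes the `executable.run` array from the example is enough to launch the interpreter; `query_interpreter` and the embedded script are hypothetical names made up for this sketch, not anything the Launcher provides.

```python
import json
import subprocess
import sys

# Script executed inside the target interpreter; it prints the details as JSON.
_QUERY = """
import json, sys
info = sys.version_info
impl = sys.implementation
print(json.dumps({
    "python_version": {
        "major": info.major,
        "minor": info.minor,
        "micro": info.micro,
        "releaselevel": info.releaselevel,
        "serial": info.serial,
    },
    "implementation": {
        "name": impl.name,
        "version": ".".join(map(str, impl.version[:3])),
    },
}))
"""


def query_interpreter(run):
    """Launch the interpreter via its `executable.run` array and parse the output."""
    completed = subprocess.run(
        [*run, "-c", _QUERY], capture_output=True, text=True, check=True
    )
    return json.loads(completed.stdout)


# Demonstration only: query the interpreter running this sketch.
details = query_interpreter([sys.executable])
```

The cost is one process launch per interpreter, which is presumably why the optional fields are not always filled in up front.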

Bitness is here because you can have a 32-bit build and a 64-bit build of the same Python version on the same machine (and I have seen that in the real world), so you may want to tell them apart. I guess a user could have an Arm and x64 build on the same machine on macOS or something, but I have never been asked by a user for that level of detail in VS Code. Plus I don’t even know how to get that easily in a cross-platform way in CPython. But if people think there could be a use for it then I guess it wouldn’t hurt to allow the info to be provided.
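For reference, a rough standard-library sketch of collecting both values from inside a running interpreter. The pointer size gives the bitness of the build itself; `platform.machine()` returns whatever the OS reports, which may not match the build's target in every setup, so treat the second value as best-effort. The function name is made up for illustration.

```python
import platform
import struct


def bitness_and_arch():
    """Return (pointer width in bits, machine string) for the running interpreter."""
    bits = struct.calcsize("P") * 8  # size of a C pointer in this build
    return bits, platform.machine()


bits, arch = bitness_and_arch()
```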

It’s off-topic for this conversation (it’s Launcher-specific), but basically it’s for when something says, “I know the user’s intent, so stop looking for more environments/interpreters,” like when a virtual environment is already activated or Poetry has already created an environment for the directory (i.e. py with no version arguments).

:+1:

I could see this info being useful if this started to be shipped in some JSON file with interpreters, so I’ll try adding it.

The proposal already does this: .executable.run (using jq notation) is an array so that you can specify any required arguments to the interpreter and then append arguments to be passed to the running code. See the comment about conda run as a motivating example (e.g. ["/path/to/conda", "run", "--path", "<environment>", "--live-stream"])
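A consumer of the data might use the array roughly like this. This is a minimal sketch: `build_command` and the sample record are hypothetical, and the record uses the current interpreter so the example can actually run.

```python
import subprocess
import sys


def build_command(record, args):
    """Append the caller's arguments to the record's `executable.run` array."""
    return [*record["executable"]["run"], *args]


# Hypothetical record; a conda finder would put its `conda run ...` prefix here.
record = {"executable": {"run": [sys.executable]}}
command = build_command(record, ["-c", "print('hello')"])
result = subprocess.run(command, capture_output=True, text=True, check=True)
```

Because the prefix is opaque to the caller, the same code works whether the array is a bare path or a longer wrapper command.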

Yes. :grin:


I have updated the example based on people’s feedback. One open question I have is whether people think bitness is worth keeping separate from CPU architecture?


Hmm, yeah micro version is trivial to get from the implementation, and I see that you now marked type as no longer optional. Environment name is still a bit of a problem, since we’d have to include hacky, fragile and limited heuristics based on the path (same as we do now) to handle this, since AFAIK there isn’t a way to reliably query this from within Python. But that’s not so bad, really, to have a best-effort value for.

Okay, gotcha—just to clarify, having an environment activated or somewhere on the path of the calling editor/IDE (Spyder is itself written in Python, and can run from a variety of environments, but we still need the info on all the user’s envs) isn’t going to interfere with the information above being provided about the others, right?

Thanks! At least for us, no, though I suppose theoretically it could be useful to tools interested in low-level details of things like word sizes that don’t want to parse a potentially arbitrary architecture string to infer this. But they might also infer incorrect things here, given different OSes/platforms do different things (e.g. Windows vs. *nix with integer lengths)

Wouldn’t have thought so. It will only tempt people to encode choices based on something that’s probably wrong. I can’t think of anything that would only depend on machine word size but not actual platform on my own machine.

We do need a better way to expose the expected/emulated platform from inside CPython, because right now it doesn’t. Just providing what packaging.tags would generate by default might be a good option?

You can get it from pyvenv.cfg, but I also don’t view that information as critical to using an environment as much as a nice-to-have (we have found giving users the stem of the directory that a virtual environment is contained in works well enough when a prompt was not specified).
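The approach described here could be sketched as follows. It assumes `pyvenv.cfg` uses simple `key = value` lines and that a `prompt` value may be written with surrounding quotes; `environment_name` is a hypothetical helper, not VS Code's actual code.

```python
from pathlib import Path


def environment_name(env_dir):
    """Best-effort name: the `prompt` key in pyvenv.cfg, else the directory name."""
    env_dir = Path(env_dir)
    cfg = env_dir / "pyvenv.cfg"
    if cfg.is_file():
        for line in cfg.read_text(encoding="utf-8").splitlines():
            key, sep, value = line.partition("=")
            if sep and key.strip() == "prompt":
                # Assumption: the value may be wrapped in quotes.
                name = value.strip().strip("'\"")
                if name:
                    return name
    # No prompt recorded (or no pyvenv.cfg): fall back to the directory name.
    return env_dir.name
```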

Not the way I have things in my head. There’s a --list option to the Python Launcher for this sort of thing. Once again, you’re getting ahead of me. :smile: Right now I’m just focusing on the data aspect and not on the plug-in/finder/searcher/discovery aspect.

OK, it sounds like no one can think of a reason to break it out if we include CPU architecture, so I’ll just make it a single thing.

I think what you’re after I have discussed in What information is needed to choose the right dependency (file) for a platform? . I didn’t think about including the relevant platform info here to create a lock file from a set of requirements, but if we view this as a data format for exchanging interpreter/environment info then I can see it making sense to include it here.


As far as I’m aware, Conda and most/all non-virtualenv/venv environments don’t have pyvenv.cfg, but yeah, like you saw, it’s best-effort for us and we handle it pretty similarly to you.


I actually keep flip-flopping on this. I’m now leaning toward making the entire "environment" key optional, but then making both "type" and "name" required if specified.
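Expressed as a check, that rule (the whole object is optional, but both keys are required once it is present) might look like the sketch below. `validate_environment` is a hypothetical helper, and whether an empty string counts as a valid name is left open here.

```python
def validate_environment(record):
    """Return a list of problems with the optional "environment" object."""
    env = record.get("environment")
    if env is None:
        return []  # the whole key may be omitted
    return [
        f'"environment" is missing required key "{key}"'
        for key in ("type", "name")
        if key not in env
    ]
```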

I think requiring a name for any environment makes it so you won’t feel obligated to come up with a name on your own.

Correct, but I’m only aware of conda as a non-virtual environment. Plus conda environments often have a name, so they are taken care of. And at worst, the stem of the path to the environment could be used if someone didn’t have an explicit "name" value to use.

After thinking about this overnight I’m not sure if we should pull in the PyPA-specific environment stuff into this as the current data is nicely interpreter- and environment-agnostic. Otherwise we could have an "environment" key that has an object with some well-known, reserved keys (i.e. "markers" and "tags"), but is otherwise open to other keys and arbitrary values for e.g. conda to put what they would want to know. But I don’t know if that’s worth trying to shove into this specifically.

I’m also realizing that there’s an interesting possibility of having this data either exposed in the stdlib or having it written out to a known location in a file. It would probably have to be written post-install to get the paths to be accurate (at least for those scenarios where that makes sense to have a hard-coded path), but otherwise the stuff that isn’t Launcher-specific is rather static. But if the data needs to be dynamic from an interpreter perspective due to paths and wanting to avoid installation headaches, I could imagine venv writing out an environment.json file next to pyvenv.cfg containing all of these details. That would have the nice side-effect of never having to check if it’s Scripts/ or bin/ ever again. :wink:

That seems to make sense, assuming it’s present on a best-effort basis when it is applicable.

Oh, I wasn’t sure if Poetry had its own system (but I guess that’s just on top of venv?) and I assume Spack and maybe others do too—conda and venv/virtualenv are the only ones I have personal experience with, though, and I’m assuming isolation mechanisms like Docker, Snaps/Flatpaks, Nix, etc. are outside the scope of what you intend “environment” to capture.

Yeah, Conda envs always have a unique name (for a given *conda install), and you can pretty reliably find it with something like \1 in /envs/([^/]+)/ else base (assuming you know it’s a conda env to begin with); I assumed there were others outside of that but I guess not really.
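Spelled out in Python, that heuristic might look like this. It is a minimal sketch that assumes the path is already known to belong to a conda install; `conda_env_name` is a made-up name, and normalizing backslashes is an added assumption to cover Windows paths.

```python
import re

_ENV_NAME = re.compile(r"/envs/([^/]+)/")


def conda_env_name(executable_path):
    """Guess a conda environment name from the path to its interpreter."""
    normalized = executable_path.replace("\\", "/")
    match = _ENV_NAME.search(normalized)
    # Anything outside an `envs/<name>/` directory is treated as the base env.
    return match.group(1) if match else "base"
```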

Correct. All the Python-specific environments that I’m aware of are venv-based or conda.

Correct. Those aren’t Python concepts, so I don’t view them as important to capture for exposure to a user or tooling when it comes to selecting a Python environment in an editor/IDE.


Since this has calmed down, I have tweeted about this to get general feedback from the community.

Hi Brett,

In packaging automation at $EMPLOYER, we gather the following info from interpreters and environments:

  • Whether or not a pip installation is available
  • pip.__version__ or None
  • Whether or not a setuptools installation is available
  • setuptools.__version__ or None
  • sys.executable
  • Whatever keys and values packaging.markers.Marker.evaluate() needs to evaluate environment markers for the environment. We currently get these by running a script in the environment itself, and collect them much as described in the table in PEP 508.
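A standard-library-only sketch of collecting those marker values, following the table in PEP 508, is shown below. This is not the poster's actual script; the function names are illustrative, and it would be run inside the environment being inspected.

```python
import os
import platform
import sys


def _format_full_version(info):
    """Render a version struct as e.g. '3.10.1' or '3.11.0b2'."""
    version = f"{info.major}.{info.minor}.{info.micro}"
    if info.releaselevel != "final":
        version += info.releaselevel[0] + str(info.serial)
    return version


def marker_environment():
    """Collect the environment marker values listed in PEP 508."""
    return {
        "implementation_name": sys.implementation.name,
        "implementation_version": _format_full_version(sys.implementation.version),
        "os_name": os.name,
        "platform_machine": platform.machine(),
        "platform_python_implementation": platform.python_implementation(),
        "platform_release": platform.release(),
        "platform_system": platform.system(),
        "platform_version": platform.version(),
        "python_full_version": platform.python_version(),
        "python_version": ".".join(platform.python_version_tuple()[:2]),
        "sys_platform": sys.platform,
    }


markers = marker_environment()
```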

Hope this helps!

Pip and other packaging tools are a little problematic as that’s very tool-specific and wouldn’t map to e.g. conda.

I think that’s implicitly covered by the “executable” section of the data.

That’s covered by a separate discussion in What information is needed to choose the right dependency (file) for a platform? . Much like the pip and setuptools details, it’s very tool-specific.

It does, thanks!

Thanks to everyone who provided feedback! The reception here and on Twitter has been positive enough to execute on Support a git-style plug-in system · Discussion #168 · brettcannon/python-launcher · GitHub for both the Python Launcher for Unix and VS Code. No timelines/ETAs obviously. :grin:
