Opinions about alternative implementations returning `platform.python_implementation() == "CPython"`?

Hey all

Not sure this is the right category, but I just wanted to note a discussion happening on the issue tracker of the recently open-sourced Pyston, regarding what value platform.python_implementation() should return.

While some have argued that this should be "Pyston" (along the lines of e.g. "PyPy"), the Pyston developers seem to prefer to keep using "CPython". I thought some visibility here would be beneficial if people happen to have opinions about this one way or another.

From the documentation:

Possible return values are: ‘CPython’, ‘IronPython’, ‘Jython’, ‘PyPy’.

Arguably, they have to report as CPython, as they are clearly not one of the other possibilities.

Looking at it from a practical point of view, I’d say that if they say they are CPython, they should be prepared to treat reports of behaviour differences as bugs in their implementation (or if they want to say they aren’t, they should open a bug report against the stdlib module asking for “Pyston” to be added to the list of valid values).

I don’t have any knowledge of how often people check platform.python_implementation when choosing different code paths, so I have no feel for how important this is in practice.

I suspect the possible return values are documenting the behaviour of the implementation, rather than the specification.

The platform module is not supposed to be used for runtime decision making (that’s what sys.platform is for), but it is meant to be used for logging. If Pyston is running in production, I would definitely want to see that in my logs, so it should return Pyston.
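To illustrate the split described above, here is a minimal sketch that uses the platform module only to build a log line and reads the sys module for values a program might branch on. The logger name and the helper names are made up for this example, and using sys.implementation.name alongside sys.platform is my own assumption about what "runtime" values are useful, not something stated in the thread.

```python
import logging
import platform
import sys

logger = logging.getLogger("startup")  # hypothetical logger name


def describe_runtime() -> str:
    """Build a human-readable, one-line description of the interpreter."""
    return (
        f"{platform.python_implementation()} "
        f"{platform.python_version()} on {platform.platform()}"
    )


def log_runtime() -> None:
    # Human-facing: written to the log and never parsed by the program.
    logger.info("Running on %s", describe_runtime())
    # Machine-facing: values from sys are what code would branch on.
    logger.debug(
        "sys.platform=%s sys.implementation.name=%s",
        sys.platform,
        sys.implementation.name,
    )
```

With this arrangement, a fork that reports its own name through the platform module would show up in the logs without affecting any runtime branch.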

I didn’t know that - is it documented anywhere? If that is the case, then (1) I agree, I’d expect to see “Pyston” in logs, (2) I’d strongly suggest we make the intended usage much clearer in the docs, and (3) we should change the documentation of platform.python_implementation to be clear that other values are allowable.

Personally, none of this affects me, so I’ll leave it to someone else (possibly the OP or the Pyston devs) to sort out any documentation PRs.

You’re right, it’s not clearly documented in the platform module docs, though it’s somewhat implied by this line under platform.platform:

The output is intended to be human readable rather than machine parseable. It may look different on different platforms and this is intended.

In the past, we’ve definitely made decisions about returning the Windows version differently between the sys module and the platform module based on this idea (in short, Windows will lie about its version in compatibility modes, and the platform module should “see through” it, while anything testing for expected API behaviour should not).

[Copied here from the referenced GitHub issue on the Pyston tracker]

Let me add another angle to this discussion :slight_smile:

If you want to add a new return value to the set of defined return values of platform.python_implementation(), you should first open an issue on the Python bug tracker asking for this. At this point, the only allowed return values are documented as ‘CPython’, ‘IronPython’, ‘Jython’, ‘PyPy’ (see platform — Access to underlying platform’s identifying data — Python 3.10.0 documentation).

On the topic of where to draw the line, please consider that the current set of return values (except for ‘CPython’) are true reimplementations of the Python spec, not just forks of the reference implementation. If an application needs to have more fine-grained control, the forks should provide other means of identifying the fork’s name, e.g. via special attributes in the sys module or other support modules.
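As a rough sketch of the "special attributes in the sys module" idea, a library could probe for a marker attribute that a fork sets on sys. The attribute name pyston_version_info and the FORK_MARKERS table below are assumptions for illustration only; the fork's own documentation is the authority on the actual marker.

```python
import sys

# Assumption: attribute names a fork might set on the sys module.
# "pyston_version_info" is used for illustration; check the fork's docs.
FORK_MARKERS = {"pyston_version_info": "Pyston"}


def detect_fork(module=sys, markers=FORK_MARKERS):
    """Return the fork's name if a marker attribute is present, else None."""
    for attr, name in markers.items():
        if hasattr(module, attr):
            return name
    return None
```

The module parameter exists only so the check can be exercised against a stand-in object; in normal use it would be called with no arguments.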

The platform module is generally meant to give users a better idea of what platform Python is currently running on. Most of the API output is intended for user consumption – that’s how I started the module at the time: I needed a way to include enough information about the target platform in a download name for mxCGIPython (which later became PyRun), so that users could figure out the right file to download.

That’s also the reason why the platform module returns marketing version names instead of system versions in many cases.

Applications should stick to system versions for runtime checks.

My personal opinion is that platform.python_implementation() should not return anything other than “CPython” for Pyston, since it’s not a reimplementation of the language in the same way “PyPy”, “IronPython” or “Jython” are.

Perhaps we could add a new API platform.python_variant() for alternative forks of CPython (or the other implementations).
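To make the proposal concrete, here is one possible shape for such a helper, written as a standalone function because platform.python_variant() does not exist. It assumes a fork would advertise itself through a private _variant attribute on sys.implementation; that attribute name is invented for this sketch. The only established part is that sys.implementation may carry implementation-specific attributes as long as their names start with an underscore.

```python
import platform
import sys


def python_variant(implementation=sys.implementation):
    """Hypothetical helper: the fork's name, or None if unmodified.

    Not part of the platform module.
    """
    # Assumption: a fork sets a private "_variant" attribute. This name
    # is made up for the sketch and is not an existing convention.
    return getattr(implementation, "_variant", None)


def describe_implementation():
    """Combine the base implementation name with the variant, if any."""
    base = platform.python_implementation()
    variant = python_variant()
    return f"{base} ({variant})" if variant else base
```

Under this design the base name stays "CPython" for a fork, and the fork's name travels in a separate field, which is the separation the proposal is after.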

I’d say that documentation is more descriptive (of the available major alternative implementations) rather than normative…

It’s IMO quite a blurry line. As soon as behavioural differences can be observed, they will silently become part of the contract per Hyrum’s law. Both as a user and as a library developer, I’d want to be able to distinguish vanilla CPython from a sufficiently modified fork.

Absolutely. The question here isn’t whether you should be able to do that (of course you should) but rather whether the platform module is how you should do that.

The Pyston documentation clearly says how to check for Pyston at runtime. So “being able to distinguish” isn’t the point here.

I know that the literal reading of the docs says that the only permitted values are CPython, IronPython, Jython and PyPy. But I find it impossible to believe that is the intended meaning.

Here are some more implementations of Python which, as far as I understand it, are under active development and are complete re-implementations, not forks of one of the above:

  • RustPython (Python in Rust)
  • GraalPython (Python 3 in Java, on the Graal VM)
  • MicroPython
  • Brython (Python in JavaScript for the browser)

and I have probably missed some.

(I haven’t linked to them all because I think that Discourse does not like email replies with more than one or two URLs. But the search engine of your choice will find them.)

MicroPython 1.9.4 does not implement the platform module, so the question of what it should advertise itself as is moot. I don’t know what the others do.

I think that forcing, say, RustPython to claim to be CPython would make a mockery of the whole thing :slight_smile:

To my mind, the only real question here is whether the Python docs are intended to be an official register of valid implementation names, or whether implementations can just pick any name they choose.

Having a definitive list of possible values in the documentation is intended. We can extend the list when new implementations arise, of course. Your list sounds like a good start.

That seems to go against the “for human consumption” purposes. We don’t need to provide a definitive list if the intention is for people to read and interpret a string.

If it is meant to be definitive, why not an enum? (That’s a facetious question, because the answer is, we don’t need a definitive list, just a human-readable result :wink: )
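For what it's worth, here is a sketch of what the enum route would look like, to show the trade-off rather than to recommend it. The Implementation class and current_implementation() are hypothetical names; only the four string values come from the docs quoted earlier in the thread.

```python
import enum
import platform


class Implementation(enum.Enum):
    # The four values listed in the platform docs quoted in this thread.
    CPYTHON = "CPython"
    IRONPYTHON = "IronPython"
    JYTHON = "Jython"
    PYPY = "PyPy"


def current_implementation(name=None):
    """Map a name to the enum; raises ValueError for unlisted names."""
    if name is None:
        name = platform.python_implementation()
    return Implementation(name)
```

The closed set is exactly what makes it brittle: any implementation that reports a name outside the list makes the lookup raise ValueError, whereas a plain string can simply be read by a person.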

Even humans have some expectations when they call a function :slight_smile:

Seriously, this is more about managing expectations and documenting what we understand as Python implementations and uses of the Python stdlib than anything else. The ones which are currently listed are actually supported by the stdlib.

We could manage it with a note saying that “other ports, forks, and implementations of Python may return values not listed here.”

True. We don’t have any control over those other implementations anyway. However, if they want to use the Python stdlib, it would be good for the authors of those implementations to register their intent by submitting a PR to extend the list.

But that still doesn’t help Pyston. It’s simply a fork of CPython, not a new implementation.

I don’t find it impossible. Having an explicit list allows people to use the value to choose what to do, and to say “else not supported” with confidence. It’s a perfectly reasonable intention, especially if the docs were written ages ago when the given four really were the only Python implementations around.
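A minimal sketch of that usage pattern, assuming a library that picks a code path per implementation. The strategy labels and the pick_strategy() name are invented for illustration; only the four implementation names come from the docs.

```python
import platform


def pick_strategy(name=None):
    """Choose a code path from the implementation name.

    The returned labels are hypothetical placeholders.
    """
    if name is None:
        name = platform.python_implementation()
    if name == "CPython":
        return "c-extension"
    if name in ("PyPy", "IronPython", "Jython"):
        return "pure-python"
    # With a closed list, anything else can be rejected with confidence.
    raise NotImplementedError(f"{name} is not supported")
```

Note that a fork reporting "CPython" takes the first branch without any warning, which is where the earlier point about behavioural differences comes back in.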

However, I’m quite happy to accept that’s not the intended use, and that the docs are wrong to imply that only those values are allowed, but in that case, why shouldn’t we fix the docs? Equally, if the intention is to only allow those explicit values, why not say so, and provide some guidance on what other forks/re-implementations should do?