Clarify object identity semantics for repeated evaluations — literals vs. function calls

The Language Reference explicitly calls out that repeated evaluation of literals
may or may not produce the same object
(§6.2.3.1 “Literals and object identity”):

Multiple evaluations of literals with the same value (either the same
occurrence in the program text or a different occurrence) may obtain the
same object or a different object with the same value.

I’m wondering whether we should make a similarly-explicit statement for repeated
evaluation of calls (and, maybe more broadly, expressions other than literals).

I wasn’t sure where on discuss.python.org this fit (Documentation vs Core Development),
but I’m posting in Core Development because it’s about language-reference wording
and what freedoms different Python implementations should have.

Right now §6.3.4 “Calls” only says that a call returns some value:

A call always returns some value, possibly None, unless it raises an
exception.

This comes up in practice because object identity can be observably relevant for
membership tests, especially for NaN-like values that are not equal to themselves.
For example:

nan1 = float('nan')
s = {nan1}

nan2 = float('nan')

print(nan1 is nan2)   # CPython: False
print(nan1 in s)      # CPython: True  (same object, so `is` succeeds)
print(nan2 in s)      # CPython: False (different object, and NaN != NaN)

CPython’s PyFloat_FromDouble (floatobject.c#L125-L137) returns a new float
object for each call, so nan1 is nan2 is False.

If an implementation ever cached/canonicalized float('nan'), then nan1 is nan2
could become True, and nan2 in s would become True even though nan2 == nan1
is False. The reference’s membership-test description uses an is check as part
of an equivalence
(§6.10.2 “Membership test operations”).
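
To make the scenario above concrete, here is a minimal sketch. `canonical_nan` is a hypothetical stand-in for an implementation that caches the result of float('nan'); it is not a real API. Whether plain float('nan') returns a fresh object each time is an implementation detail, so the sketch only records that result instead of assuming it.

```python
# Hypothetical stand-in for an implementation that canonicalizes NaN.
_CACHED_NAN = float('nan')


def canonical_nan():
    """Always return the same NaN object (assumed caching behaviour)."""
    return _CACHED_NAN


# With caching, both names refer to one object.
c1 = canonical_nan()
c2 = canonical_nan()
cached_is = c1 is c2          # same object by construction
cached_eq = c1 == c2          # NaN does not compare equal, even to itself
cached_member = c2 in {c1}    # found through the identity check

# Without caching, the outcome depends on whether the objects are distinct.
f1 = float('nan')
f2 = float('nan')
fresh_is = f1 is f2           # an implementation detail
fresh_member = f2 in {f1}     # follows fresh_is, because == is always False

print(cached_is, cached_eq, cached_member)
print(fresh_is, fresh_member)
```

The point of the sketch is that membership tracks identity exactly whenever == can never succeed.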

For sets/dicts this is extra subtle because membership also depends on hashing:
for NaNs (and other non-reflexive or “inconsistent” values), the common mental
model “membership is just ==” breaks down. (This seems to be one reason the
current wording is controversial; see prior art below.)

What I’d like feedback on:

  • Would it be helpful to add a short note near §6.3.4 “Calls” saying that the
    language does not specify object identity relationships between results of
    separate call evaluations (unless an API documents stronger guarantees)?

If yes, would wording like this be acceptable?

The language does not specify whether separate evaluations of a call (or other
expression) that produce immutable objects will return the same object or
distinct objects with equal value. Programs must not rely on object identity
except where it is explicitly documented (for example, for singletons).

Why I’m not proposing “calls always produce distinct objects”:

  • Many callables legitimately return existing objects (e.g., int(3), bool(x),
    sys.intern(s), identity-preserving conversions, or factory/singleton patterns).
  • For performance, implementations often cache or reuse objects.
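
A few of these cases can be checked directly. In the sketch below, only the bool and sys.intern results are documented guarantees; whether int(n) hands back n itself is CPython behaviour for exact int instances, so it is printed and not relied upon.

```python
import sys

n = 10**20                    # far outside any small-integer cache
print(int(n) is n)            # CPython returns n itself for an exact int

# bool() always returns one of the two singletons True and False.
truthy = bool([0])
falsy = bool([])

# sys.intern() returns the canonical copy of a string, so equal strings
# map to the same object even when they were built separately.
a = sys.intern(''.join(['identity', '-', 'demo']))
b = sys.intern(''.join(['identity', '-', 'demo']))

print(truthy is True, falsy is False, a is b)
```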

Alternatives considered:

  1. Status quo (don’t mention call identity) — but then the “literals have
    unspecified identity” section can look like a special case rather than an
    instance of a broader “don’t rely on identity” rule.
  2. Adjust the membership-test wording — if the intent is to describe the
    result rather than a specific evaluation strategy, consider rephrasing the
    any(x is e or x == e ...) equivalence to avoid implying a universal algorithm
    across container types (and to better acknowledge the role of hashing for
    sets/dicts).

Meta / disclosure:

This post is part of a research project about how technical discussions are
received and iterated on. If you’re willing, I’d appreciate you filling out a
short questionnaire about this post (e.g., title, body, and format, but not limited to those). The questionnaire
hasn’t been created yet; it will be created when this discussion concludes:

(Tally is a survey tool similar to Typeform.)

Prior art / related discussions:

Just FYI, the fact that this is “part of a research project” severely reduces my interest in participating. IDK what kind of information you are going to draw from that.


IMO, while pointing out that float("nan") may or may not be the same object is not a bad idea, the general language reference for “call” is not the correct place to point this out, to the point where I am confused why you would think it’s a proper place to do this.

The reason that literals are singled out is because they often do have the same identity as long as it’s the same codeblock.

The fact that two calls may result in different objects is very easy to test. Yes, there are a few counterexamples, but that is a specific property of those data types. So this information belongs in the docs for that particular constructor.

In general, CPython’s C APIs are not defined by the Python language specification (only inferred by being the reference implementation).

People receive technical discussions poorly when you have an ulterior motive.

You’re right that §6.3.4 “Calls” is the wrong spot. I think I was too focused on the specific example (float('nan')) rather than the broader pattern. Let me reframe:

The core problem isn’t about calls specifically — it’s that §6.10.2 defines membership tests using an any(x is e or x == e ...) equivalence that makes containment observably depend on implementation-specific identity. For NaN (and other non-reflexive values), this means nan in {nan} can give different answers depending on whether the implementation happens to return the same object or a different one. That’s a spec issue regardless of how calls behave.

bpo-45832 already flags that this equivalence is incorrect for sets/dicts (hashing changes the evaluation order and semantics). So maybe the fix should be there instead.

Actually, §6.2.3.1 says the opposite — it warns that literals may or may not be the same object. The section exists precisely because programmers see {} is {} evaluate to False and might start relying on that CPython behavior. The reason it’s worth calling out is that the default assumption (each literal evaluation creates a new object) isn’t guaranteed.

Calls have the same property — float('nan') is float('nan') being False in CPython is an implementation detail. The spec says nothing about whether two calls can return the same object. I agree this is easy to test, but so is {} is {}.

FWIW, the reason I went looking at the spec in the first place: I’m writing code that targets multiple implementations (CPython, PyPy, and others). I wanted to know what I could rely on across implementations, so I checked the language reference, and it wasn’t clear. While I’m using the spec as a reference and researching how it’s implemented in different places, I do want to make sure my mental model is clear.

The section you’re linking to is the language reference. That is a specification mainly aimed at implementers of language tooling or alternate implementations, where clear, concise semantics are the most important goal.

This documentation is more or less aimed at language lawyers. With that level of understanding in mind, it’s clear that we need to specify what literal expressions do or do not guarantee, as that is inherently handled by the language, while a function simply returns whatever object it returns (as it’s written relying on the lower-level guarantees), and no more general assumptions can be made.

So for the literal/call discussion I would say the documentation is precisely as clear as it should and needs to be.

Regarding the membership test operations, I’d say the same. It’s technical and subtle, and it will surely lead to misunderstandings unless read with very precise eyes, which few programmers do. But again, this documentation is explicitly for those people.

There may very well be room for hints pointing out the subtleties of this in some other sections of the documentation, but the reference manual is probably better left as is.

I do echo the concerns about this being a research project as well.

I included the research disclosure because I thought transparency was the right thing to do (some discussion platforms require it). Clearly that was a mistake, so I’m going to remove it from the body.

To be concrete about what the project actually is: it’s about me — whether I’m including enough context in my posts, and if not, how I can improve.

{} isn’t a literal, it’s a collection definition. If the docs are calling it a literal, then that’s incorrect and should be fixed. [] is [] is also False, and the only reason () is () is True (with a warning) is because of an internal optimisation and not because of the language specification.

Equally, float('nan') also isn’t a literal. Whether a function call like that returns the same object or different object for two separate calls is totally independent of the language. Whether calling float returns the same or different objects for separate calls is defined by the float type.

nan is a terrible example here because it has terrible behaviour, but every sensibly defined object will have __eq__() => True when called against itself, which means the x is e part of the check is just an optimisation (and probably is only mentioned because someone came in with the same intent as yourself and insisted on overspecifying details here).

I’m pretty sure the entire computer science world would gladly go back and change the spec that defined NaN as not equal to itself, but we can’t. But we’re not changing otherwise sensible specifications just to deal with it.

No, keep it. At very least, be honest.

I decline to participate in a discussion that has a primary purpose of learning about how discussions work. You can look through the history of this forum for some actual productive discussions and learn from those.

I’m in exactly that position: I want to write code targeting multiple implementations (CPython, PyPy, and others), and I help maintain Nuitka, a Python-to-C compiler. So I am the audience for this spec.

On the membership test wording though, I push back a little. You say it’s technical and subtle but correct: I don’t think it is correct. The any(x is e or x == e for e in y) equivalence has two demonstrable issues.

First, when a == b but hash(a) != hash(b), the equivalence is false for sets and dicts. This was already filed as bpo-45832 in 2021. It was closed as a third-party issue blaming pandas, but even Raymond Hettinger’s suggested fix, splitting the wording for sequence vs hash-based containers, acknowledged the spec could be more precise.
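
The first issue can be reproduced with a deliberately inconsistent type. `Loose` below is a made-up class whose instances all compare equal but hash by their key, which violates the rule that equal objects must have equal hashes. The set result is what a hash-based lookup produces in CPython.

```python
class Loose:
    """Hypothetical type: every instance is equal, hashes differ by key."""

    def __init__(self, key):
        self.key = key

    def __eq__(self, other):
        return isinstance(other, Loose)

    def __hash__(self):
        return hash(self.key)


a, b = Loose(1), Loose(2)
container = {a}

# The documented equivalence, applied literally by iterating the set.
by_equivalence = any(b is e or b == e for e in container)

# The real membership test locates candidates by hash first, so the
# entry for `a` is never compared against `b`.
by_lookup = b in container

# A list has no hashing step, so it agrees with the equivalence.
by_list = b in [a]

print(by_equivalence, by_lookup, by_list)
```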

Second, NaN is not equal to itself. The is short-circuit in the equivalence hides an implementation dependency: the spec says nothing about whether two calls to float('nan') return the same object. I ran into this exact ambiguity as a Nuitka maintainer: Nuitka’s constant blob caches float constants, so float('nan') always returns the same object across calls. That makes nan2 in {nan1} return True (matched via is before ==), while CPython returns False. Both behaviors are spec-compliant, but as a compiler writer I had to reverse-engineer CPython internals to decide which to match (Nuitka #3889).

You’re right that {} isn’t a literal. That was a mistake in my original post, sorry about that.

I know nan is unusual, but it’s not a theoretical edge case; it’s what triggered this whole discussion. Since Nuitka is an optimizing compiler that constant-folds aggressively, I came upon this issue by accident. I tried to refer to the spec and to the CPython source code in order to understand my approach, but I couldn’t get a straight answer, so I opened an issue on Nuitka’s side to understand its motivation and then started this discussion.

Actually, the primary purpose of the discussion is the language reference question.

I decline to participate in a discussion that has a primary purpose of learning about how discussions work. You can look through the history of this forum for some actual productive discussions and learn from those.

I’m trying to understand how I can improve any potential future discussions with this community. I’m already getting some insight

nan is a terrible example here because it has terrible behaviour, but every sensibly defined object will have __eq__() => True when called against itself, which means the x is e part of the check is just an optimisation (and probably is only mentioned because someone came in with the same intent as yourself and insisted on overspecifying details here).

If I had included context about what I’m doing and how I came upon this, it would have made it easier for others to understand my motivation behind this post.

I decline to participate

sure!

edit: specifically my own discussions, my social skills aren’t the best in these spaces.

The issue in question: “Why are NaN constants created by Nuitka always "is" identical” (Issue #3889, Nuitka/Nuitka on GitHub).

Do you have any other examples?

If it’s just NaN, then update the NaN spec to describe why it deviates. Don’t update the general case for the sole exception.

If there are more examples that can’t be explained away as bugs (in the value/type implementation, not the spec), there might be a clarification to the rule that’s worth making.

I had mentioned bpo-45832 before, but I did some more digging and found a few:

  1. #71792 (2016, still open) — “Inconsistent calls to __eq__ from built-in __contains__.” Different container types call __eq__ differently during membership tests.

  2. #48546 (2008, release-blocker) — “Python assumes identity implies equivalence; contradicts NaN.” CPython itself had a bug where set/dict lookup assumed x is y implies x == y, which is false for NaN. The very assumption the current any() wording relies on has been a CPython bug before.

  3. mypyc #971 (2023) — mypyc had the same optimization issue as Nuitka: it optimized x in (1, 2) to x == 1 or x == 2, forgot the is check, and had to add it back when users hit cases where __eq__ was state-dependent.
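
The mypyc case can be illustrated without mypyc itself. The two helper functions below are hypothetical names written for this sketch: one expands a membership test into bare == comparisons, the other keeps the identity check, and a NaN stored in the tuple is enough to tell them apart.

```python
def contains_eq_only(x, items):
    """Naive expansion: x == items[0] or x == items[1] or ..."""
    return any(x == e for e in items)


def contains_with_identity(x, items):
    """Expansion that keeps the identity check from the reference."""
    return any(x is e or x == e for e in items)


nan = float('nan')
candidates = (nan, 1.0)

builtin_result = nan in candidates
eq_only_result = contains_eq_only(nan, candidates)
identity_result = contains_with_identity(nan, candidates)

print(builtin_result, eq_only_result, identity_result)
```

Only the expansion that keeps `x is e` agrees with the built-in operator here, which is why dropping the identity check is an observable change and not a pure optimization.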

Anyway, based on this, I’m just going to open a PR and we can continue the discussion there.

One of those is NaN-related (and everyone seems to agree that NaN isn’t special enough to define the general rules), and the other two are unresolved and both rely on unusually behaving equality operators.

I recommend studying Raymond’s posts on the first two links, and try really hard to understand where he’s coming from, not just focusing on the practical impact on these specific cases. Figure out why he’s arguing for the position he’s taking, and when you can articulate that (and even figure out how to argue in favour of it), you might understand it well enough to propose changes.

A PR is premature. I’m not going to waste my time looking for it, but if someone else sees it consider me opposed to any change based on this discussion.

I did read his comments and went through the whole discussion.

As Raymond says: “The ‘identity implies equality’ rule isn’t just an optimization, it is a deep assumption that pervades the language implementation. Lots of logic relies on it to maintain invariants.”

And further down in that same thread, smarnach said in 2011:

The behaviour discussed in this thread does not seem to be reflected in Python’s documentation. The documentation of __eq__() doesn’t mention that objects should compare equal to themselves.

It’s probably not worthwhile to “fix” all the places in the documentation that implicitly assume that objects compare equal to themselves, but it probably is a good idea to mention that __eq__() implementations should fulfil this assumption to avoid strange behaviour when used in combination with standard containers.

Smarnach’s point is exactly what I’m hitting — the docs assume well-behaved types but never say so explicitly. Clearer docs will help people avoid the same issue I just faced, especially since there are multiple Python-to-C projects, Python-to-Rust, and other variants. I think it’s likely new people will need to search for and reference the spec and implementations, and come to find similar issues as I have.
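
The assumption smarnach describes is easy to observe with a type that breaks it. `NeverEqual` is a made-up class for this sketch; the container results reflect the built-in containers treating an object as equal to itself regardless of what __eq__ says.

```python
class NeverEqual:
    """Hypothetical type whose __eq__ is not reflexive."""

    def __eq__(self, other):
        return False

    # Defining __eq__ disables hashing, so restore the default hash.
    __hash__ = object.__hash__


x = NeverEqual()

direct = (x == x)              # the type itself says "not equal"
in_list = x in [x]             # the container checks identity first
lists_equal = ([x] == [x])     # element comparison also starts with identity
in_dict = x in {x: 'value'}    # same for hash-based lookup

print(direct, in_list, lists_equal, in_dict)
```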

for now, I’ll be proposing a short note at the top of the Expressions chapter that says something like “Code samples labeled ‘equivalent to’ are conceptual, not exact specifications. They assume well-behaved types.” That way the next person who runs into this doesn’t need to start a whole discussion thread to figure out how to read the spec.

Personally, I don’t think a PR or issue is premature. I’d rather be proactive and contribute directly than wait a long time for any meaningful change to be made, and I’d rather fix the root cause.

I would say that, with high probability, it’s a bug in an implementation. (See also the more recent thread https://discuss.python.org/t/question-about-float-nan/106378/.)

Python doesn’t specify how floats behave. But in practice, floats are usually IEEE 754 doubles. That means we have a lot (9007199254740990) of different NaNs. It would be wrong (per the standard) to canonicalize them to some single random value (say, as in the NAN C macro).

The same goes for float('1.23'). Of course, you are free to cache that value. The same is true for NaNs, assuming you properly take their signs and payloads into account while caching.

But caching NaNs is much more dangerous, as you lose the ability to keep multiple NaNs in sets or as dict keys. (The float constructor doesn’t specify how you could create different NaNs; cf. the Decimal constructor.)
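
The count and the payload point can be checked with the standard library. The sketch assumes float is an IEEE 754 binary64, which the language does not guarantee, and the two bit patterns are arbitrary quiet NaNs picked for illustration. `float_from_bits` is a helper name invented here.

```python
import math
import struct

# binary64 NaNs: exponent bits all ones, any non-zero 52-bit fraction,
# either sign bit.
nan_count = 2 * (2**52 - 1)


def float_from_bits(bits):
    """Reinterpret a 64-bit integer as a double (assumes binary64)."""
    return struct.unpack('<d', struct.pack('<Q', bits))[0]


nan_a = float_from_bits(0x7FF8000000000001)   # quiet NaN, payload 1
nan_b = float_from_bits(0xFFF8000000000002)   # sign bit set, payload 2

# Distinct NaN objects never compare equal, so both fit in one set.
nan_set = {nan_a, nan_b}

print(nan_count, math.isnan(nan_a), math.isnan(nan_b), len(nan_set))
```

A cache that collapsed both values into one object would shrink that set to a single element, which is the loss described above.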

I traced the history of this part of the documentation, and it looks like it was added by Raymond in “Issue 4090 and 4087: Further documentation of comparisons.” (python/cpython@a2a08fb on GitHub). To me this looks like a leak of CPython implementation details into the specification. Maybe removing the x is e parts is an option?

@MegaIng is right about the intent. It’s talking about literals, but you’re talking mostly about expressions that aren’t literals. This is the kind of thing it’s aiming at:

>>> a = 999
>>> b = 999
>>> a is b
False

>>> if 1:
...     a = 999
...     b = 999
...     a is b
...
True

The language doesn’t define either result. The results differ in CPython because literals denoting equal immutable objects are generally recognized as such if, and only if, they appear in the same “code block”. This is compile-time behavior specific to CPython. Each code block has its own table of constants, and that’s what’s searched over (at compile time) when looking to see whether an object built for a constant can be reused.
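
The same effect can be reproduced outside the REPL by controlling what counts as a code block. This sketch compiles source strings with the built-in compile(); the results are CPython behaviour, not something the language defines, and `run` is a helper name invented here.

```python
def run(source):
    """Compile `source` as one code block, execute it, return its names."""
    namespace = {}
    exec(compile(source, '<demo>', 'exec'), namespace)
    return namespace


# Both literals live in one code block and share its table of constants.
one_block = run('a = 999\nb = 999\n')
same_block = one_block['a'] is one_block['b']

# Each literal is compiled as its own code block, so no table is shared.
first = run('a = 999\n')
second = run('b = 999\n')
separate_blocks = first['a'] is second['b']

print(same_block, separate_blocks)   # neither result is specified
```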

Yes! In this case, the way Nuitka implemented it is a bug, because Nuitka aims to have full compatibility with CPython specifically.

But caching NaNs is much more dangerous, as you lose the ability to keep multiple NaNs in sets or as dict keys. (The float constructor doesn’t specify how you could create different NaNs; cf. the Decimal constructor.)

Yeah, to be honest, I don’t really deal with NaNs. I only came upon the issue because I was doing fuzz testing.