The Language Reference explicitly calls out that repeated evaluation of literals
may or may not produce the same object
(§6.2.3.1 “Literals and object identity”):
Multiple evaluations of literals with the same value (either the same
occurrence in the program text or a different occurrence) may obtain the
same object or a different object with the same value.
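As a small illustration of what that paragraph permits, the sketch below evaluates the same float literal several times. The function name same_occurrence is made up for this example. Only the equality result is guaranteed; the identity results depend on the implementation and version, so no expected output is shown for them.

```python
def same_occurrence():
    # One literal in the program text, evaluated again on every call.
    return 3.14159

x = same_occurrence()
y = same_occurrence()
z = 3.14159  # a different occurrence of a literal with the same value

# Guaranteed: all three have the same value.
print(x == y == z)  # True

# Not guaranteed: whether any of these are the same object.
print(x is y, x is z)
```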
I’m wondering whether we should make a similarly explicit statement for repeated
evaluation of calls (and, maybe more broadly, expressions other than literals).
I wasn’t sure where on discuss.python.org this fit (Documentation vs Core Development),
but I’m posting in Core Development because it’s about language-reference wording
and what freedoms different Python implementations should have.
Right now §6.3.4 “Calls” only says that a call returns some value:
A call always returns some value, possibly None, unless it raises an
exception.
This comes up in practice because object identity can be observably relevant for
membership tests, especially for NaN-like values that are not equal to themselves.
For example:
nan1 = float('nan')
s = {nan1}
nan2 = float('nan')
print(nan1 is nan2) # CPython: False
print(nan1 in s) # CPython: True (same object, so `is` succeeds)
print(nan2 in s) # CPython: False (different object, and nan2 != nan1)
CPython’s PyFloat_FromDouble (floatobject.c#L125-L137) returns a new float
object for each call, so nan1 is nan2 is False.
If an implementation ever cached/canonicalized float('nan'), then nan1 is nan2
could become True, and nan2 in s would become True even though nan2 == nan1
is False. The reference describes membership for built-in containers as
equivalent to any(x is e or x == e for e in y), so an identity check is part
of the documented result
(§6.10.2 “Membership test operations”).
For sets/dicts this is extra subtle because membership also depends on hashing:
for NaNs (and other non-reflexive or “inconsistent” values), the common mental
model “membership is just ==” breaks down. (This seems to be one reason the
current wording is controversial; see prior art below.)
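To make the identity shortcut concrete, here is a minimal sketch that compares the built-in in operator with the equivalence from §6.10.2, written out as a helper. The helper name reference_contains is hypothetical. The sketch assumes nothing about whether the two float('nan') calls return distinct objects, so results that depend on that are printed without an expected value.

```python
nan1 = float("nan")
nan2 = float("nan")  # may or may not be the same object as nan1

def reference_contains(x, container):
    """The documented equivalence, spelled out (helper name is illustrative)."""
    return any(x is e or x == e for e in container)

items = [nan1]

# NaN is not equal to itself, so only the identity check can succeed.
print(nan1 == nan1)                     # False
print(nan1 in items)                    # True
print(reference_contains(nan1, items))  # True
print(nan1 in {nan1})                   # True

# These depend on whether nan2 happens to be the same object as nan1.
print(nan1 is nan2)
print(nan2 in items, reference_contains(nan2, items))
```

For a list, the result for nan2 tracks nan1 is nan2 exactly, which is why canonicalizing NaN would be observable.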
What I’d like feedback on:
- Would it be helpful to add a short note near §6.3.4 “Calls” saying that the
language does not specify object identity relationships between results of
separate call evaluations (unless an API documents stronger guarantees)?
If yes, would wording like this be acceptable?
The language does not specify whether separate evaluations of a call (or other
expression) that produce immutable objects will return the same object or
distinct objects with equal value. Programs must not rely on object identity
except where it is explicitly documented (for example, for singletons).
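As a sketch of what “not relying on identity” could look like in user code, the helpers below test for NaN by value using math.isnan and reserve is for None, which is a documented singleton. The function names is_missing and count_nans are invented for this illustration.

```python
import math

def is_missing(value):
    """Return True for None or a float NaN, without relying on NaN identity."""
    if value is None:  # None is a documented singleton, so `is` is safe here
        return True
    return isinstance(value, float) and math.isnan(value)

def count_nans(values):
    """Count float NaNs by value, not by comparing against one NaN object."""
    return sum(1 for v in values if isinstance(v, float) and math.isnan(v))

print(is_missing(None))                                  # True
print(is_missing(float("nan")))                          # True
print(is_missing(0.0))                                   # False
print(count_nans([float("nan"), 1.0, float("nan")]))     # 2
```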
Why I’m not proposing “calls always produce distinct objects”:
- Many callables legitimately return existing objects (e.g., int(3), bool(x),
sys.intern(s), identity-preserving conversions, or factory/singleton patterns).
- For performance, implementations often cache or reuse objects.
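A few of these cases, sketched in code. The first two groups rely on documented behavior (bool has exactly two instances, and sys.intern returns the single interned copy of a string). The last group is deliberately printed without expected output because the result is an implementation detail.

```python
import sys

# Documented: bool() always returns one of the two bool singletons.
print(bool(1) is True)    # True
print(bool([]) is False)  # True

# Documented: equal strings intern to the same object.
a = sys.intern("".join(["py", "thon"]))
b = sys.intern("".join(["pyt", "hon"]))
print(a == b, a is b)     # True True

# Not specified: identity-preserving conversions and cached values.
t = (1, 2, 3)
print(tuple(t) is t)
print(int("3") is int("3"))
```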
Alternatives considered:
- Status quo (don’t mention call identity): but then the “literals have
unspecified identity” section can look like a special case rather than an
instance of a broader “don’t rely on identity” rule.
- Adjust the membership-test wording: if the intent is to describe the
result rather than a specific evaluation strategy, consider rephrasing the
any(x is e or x == e ...) equivalence to avoid implying a universal algorithm
across container types (and to better acknowledge the role of hashing for
sets/dicts).
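To show why the any(...) equivalence does not hold uniformly across container types, here is a minimal sketch using a deliberately inconsistent class (the name Loose is hypothetical). Its instances all compare equal but hash differently, which violates the usual __eq__/__hash__ contract. The set result shown is what CPython produces.

```python
class Loose:
    """Illustrative type: every instance is equal, but hashes differ."""

    def __init__(self, tag):
        self.tag = tag

    def __eq__(self, other):
        return isinstance(other, Loose)

    def __hash__(self):
        return self.tag

a = Loose(1)
b = Loose(2)

print(a == b)                                 # True
print(b in [a])                               # True: list falls back to ==
print(b in {a})                               # False: hashes differ, == never runs
print(any(b is e or b == e for e in {a}))     # True: the documented equivalence
```

The last two lines disagree, so the equivalence describes lists and tuples accurately but only describes sets and dicts for well-behaved hashable values.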
Meta / disclosure:
This post is part of a research project about how technical discussions are
received and iterated on. If you’re willing, I’d appreciate you filling out a
short questionnaire about this post (e.g., title, body, and format, but not limited to those). The questionnaire
hasn’t been created yet; it will be created when this discussion concludes:
(Tally is a survey tool similar to Typeform.)
Prior art / related discussions:
- bpo-45832: “Misleading membership expression documentation” (the claim that
x in y is “equivalent to” any(x is e or x == e ...) is demonstrably false for
some types due to hashing/equality interactions).
https://bugs.python.org/issue45832
- discuss.python.org: “Making NaN a singleton” (motivations for/against NaN
canonicalization and why relying on identity for NaN is problematic).
Making NaN a singleton
- discuss.python.org: “Does set membership use a different test than list membership”
(ongoing discussion about the membership-test wording and hashing).
Does set membership use a different test than list membership
- PEP 754: IEEE 754 floating point special values (historical context for NaN
handling in Python).
PEP 754 – IEEE 754 Floating Point Special Values | peps.python.org