Revisiting PEP 505

MegaIng · January 19, 2025, 8:18am

Someone needs to make a new PEP for just ?? and ??=
Someone needs to make a new thread, summarizing previous discussions
This new thread needs to
- be decently moderated with liberal use of the off-topic flag for a chance of the thread not going off the rails. Specifically discussion of ?. and ?[] needs to be mostly prevented.
- have a deadline. 3-6 months is probably fine.
At that point, if it’s clearly a negative discussion result, the PEP should be withdrawn and IMO the topic is done for the next few Python versions.
Otherwise, if it’s unclear or positive, submit to the SC and they have to make the final decision - that’s their primary job.

This would break the cycle.

The biggest issue is that no one has jumped up and decided to do it as the PEP author - myself included.

I am not going to comment on your argument itself; it’s off topic for this thread and is going to invite arguments. (I reported the two previous posts as off topic, and I am going to report future posts in this topic as off topic if they start talking about the proposal’s content.)

hlovatt · January 19, 2025, 11:33am

Part of the purpose of these discussions is to hear views and ‘gauge the temperature of the room’. If people keep saying they don’t like this for reasons X, Y, and Z, then that is as valid as people saying they like it for A, B, and C.

The other thread seemed to have the same few people repeating the same points many times, on both sides. My intention was to add a new voice.

This thread seems to be based on the assumption that the proposal should go forward. That seems an invalid premise since there is plenty of opposition. So why should it go forward?

I think someone has suggested closing this thread, +1 from me on closing.

PS I have read the whole of both threads.

MegaIng · January 19, 2025, 12:12pm

No, it is not. This thread is based on the question “How can we ever come to a conclusion in either direction?” Arguments for or against the proposal are off-topic in this thread.

mikeshardmind · January 20, 2025, 1:50am

I think we’re at a point where we somewhat need a definitive answer for the community that has some amount of finality to put it to rest. The discussion isn’t going anywhere, the same points are being retread, most of the time by people who haven’t addressed that their point already was addressed before in the thread, and some of those even admitting to having not read the prior discussion.

It’s at the point where one can legitimately argue that any outcome to this might have just tired out the proponents/detractors too much to argue their point, and that the more the discussion has dragged on, the less it has accurately reflected community opinion as a result.

hlovatt · January 20, 2025, 5:20am

@Rosuav, on the other thread (PEP 505 is stuck in a circle), said that the declarative style of parsing was too onerous because he was only interested in a subset of the data. But sophisticated declarative frameworks like Pydantic allow for this. However, maybe Pydantic is too complicated and something simpler, for this use case, could be added to Python instead of ?. etc. E.g.:

# Simple framework as an alternative to `o?.f` and suchlike.
from dataclasses import dataclass
from typing import Any, get_args, Type, TypeVar

T = TypeVar('T')

@dataclass
class SafeData:  # Could be part of Python, maybe a class or decorator in dataclasses.
    def __post_init__(self):
        for n, t in self.__annotations__.items():
            a = getattr(self, n)
            t0 = get_args(t)[0]  # Minimal expression, real version would have to be better!
            if a is None:
                if issubclass(t0, SafeData):
                    setattr(self, n, t0())
            elif not isinstance(a, t0):
                setattr(self, n, t0.ignoring_extras(a))

    @classmethod
    def ignoring_extras(cls: Type[T], d: dict[str, Any] | None) -> T:
        if d is None:
            return cls()
        return cls(**{k: d[k] for k in cls.__annotations__.keys() if k in d})

# Example of using the framework by way of some tricky tests!

@dataclass
class F(SafeData):
    f: int | None = None

@dataclass
class Safe(SafeData):  # Makes safe an external source, like JSON, that might have items missing.
    o: F | None = None

tests = [  # Examples of data from an external source that might have bits missing or extra bits.
    {"o": {"f": 0}},
    {"o": {}},
    {"o": None},
    {},
    None,
    {"not_o": {"f": 0}},
    {"o": {"not_f": 0}},
]

for test in tests:
    safe = Safe.ignoring_extras(test)
    print(safe.o.f)

Which produces:

0
None
None
None
None
None
None

The above is easy to add to Python and achieves the goal of making partial traversal of data easy, with no need for new syntax!

hprodh · January 20, 2025, 9:13am

I see the subject is diverging and close to being closed…

My last post had a testing purpose, but it lacked context and a conclusion, so I am completing it with this post. (I am not considering JSON usage; my use cases for None checks are mainly within class constructors that take different optional arguments.)

Question

What I was trying to assess is: “Would it be possible to process every kind of complex expression that performs None checks in different ways as logical operators, with just one container for which these logical operations work consistently?”

The answer is yes, but once you instantiate these container objects to perform operations on, you also need the reverse of the “containing” operation (some ‘get’ function).

Example

Let’s say you want to perform this:

def select_safe(a, b1, b2, c):
    if a is not None:
        return a
    elif b1 is not None and b2 is not None:
        return (b1, b2)
    elif c is not None:
        if c[d] is not None:
            if c[d][e] is not None:
                return c[d][e]
    return None

selected = select_safe(a, b1, b2, c)

Minimal pseudo-code

You might want to write the operations resulting in selected object in pseudo-code as

# pseudo-code ('|' stands for 'or', '&' for 'and', '@' for 'at key/index')
selected = ?{a | b1 & b2 | c @ d @ e}

(Notice that I did not use conventional parentheses (), but braces {}, because this is not a function call: there are multiple separators between arguments (|, &, @), whereas a call takes a single separator (,), and each argument requires pre-processing before the operators execute.)

Minimal pseudo-code implementation possibilities in current python

Taking that pseudo-code as a basis, it is possible to write its computation in several ways by defining suitable functions in current Python (I skip writing the functions):
(nesting functions):

selected = safe_or(a, safe_or(safe_and(b1, b2), safe_at(safe_at(c, d), e)))

(passing references to operators as arguments):

selected = select_safe_parsing(a, '|', b1, '&', b2, '|', c, '@', d, '@', e)

(with chained constructors):

selected = select_safe_construct(a).or_(b1).and_(b2).or_(c).at(d).at(e)  # `or`/`and` are keywords, hence the trailing underscores

(instantiating the containers):

selected = (Ω(a) | Ω(b1) & Ω(b2) | Ω(c) @ Ω(d) @ Ω(e)).get()

(Notice two things in this case: there is no way to avoid some kind of get here, and the Ω calls could be removed, except the first one, by carefully leveraging dynamic typing.)
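As a rough illustration of the container variant, here is a minimal sketch of such a wrapper. The class name `Omega` (standing in for Ω), its attribute `value`, and the choice to return a tuple from `&` are assumptions made for the example, not something specified in this post. Unlike `select_safe`, every operand is evaluated eagerly, and a missing key still raises, as `c[d]` does in the original function.

```python
class Omega:
    """Hypothetical None-aware container; None is the only 'false' value."""

    def __init__(self, value):
        # Unwrap nested containers so Omega(Omega(x)) behaves like Omega(x).
        self.value = value.value if isinstance(value, Omega) else value

    def __or__(self, other):
        # 'or': keep the left operand unless it holds None.
        other = Omega(other)
        return self if self.value is not None else other

    def __and__(self, other):
        # 'and': both sides must be non-None; the pair is returned as a tuple.
        other = Omega(other)
        if self.value is None or other.value is None:
            return Omega(None)
        return Omega((self.value, other.value))

    def __matmul__(self, key):
        # 'at': index into the held value, propagating None.
        key = Omega(key).value
        if self.value is None:
            return Omega(None)
        return Omega(self.value[key])

    def get(self):
        # The reverse of the "containing" operation.
        return self.value


c = {"d": {"e": 5}}
selected = (Omega(None) | Omega(None) & Omega(2) | Omega(c) @ Omega("d") @ Omega("e")).get()
print(selected)
```

Python's operator precedence (`@` binds tighter than `&`, which binds tighter than `|`) happens to give the grouping used in the pseudo-code.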

Necessity of xor and xor_index retrieval

Finally, I think none of this would really be useful in real-case scenarios without a mutex (mutually exclusive) check. You get a mutex by introducing the xor operator, but the value you really want is the index of the part that is not None within the succession of expressions between the xor operators. You might write this in pseudo-code as

# pseudo-code, mutex index ('^' stands for 'xor')
selected_xor_index = ?xor_index{a ^ b1 & b2 ^ c @ d @ e}

(selected_xor_index will be 0 if a is defined, or 1 if b1 and b2 are defined, or 2 if c.d.e is defined)
Assuming this were valid Python code, you would get the following way of processing based on optional parameters:

match ?xor_index{a ^ b1 & b2 ^ c @ d @ e}:
    case 0 : value = process_a(a)
    case 1 : value = process_b(b1, b2)
    case 2 : value = process_c(c.d.e)

Conclusion

It is possible to manage a wide generalization of combinations of None-checks by implementing four operators (‘or’, ‘and’, ‘at’, ‘xor’) in a consistent (and possibly one-lined) way. Some kind of ‘metafunction’ (function that selects sub-function based on multiple possible separators) would directly provide this possibility. You also need a way to return the xor index to fully benefit from the None-checks combination.

pf_moore · January 20, 2025, 11:27am

I got very lost with the various bits of notation you introduced, but for what it’s worth, the select_safe function you started with was far more readable than any of the subsequent examples (even if I do my best to take into account the “new syntax will take time to learn” problem…)

hprodh · January 20, 2025, 12:57pm

Thank you for your feedback.
This post is meant to abstract away from syntax: I present the pseudo-code as the minimal structure you want to implement, provide optional examples on the one hand, and demonstrate the necessity of the xor operator on the other.
I’ve reworked the structure of my post and added bullet points to make it (I hope) more understandable.

pf_moore · January 20, 2025, 1:07pm

Thanks, but you’re missing my point which is that the select_safe function that checks values with explicit conditions as needed is easier to read than an information-dense expression that uses operators that don’t really have any intuitive interpretation.

hprodh · January 20, 2025, 3:39pm

I agree this is dense; this is actually the point and the aim of the PEP: reducing operations to their core, so that we only keep the essential.

On the intuitive aspect, I will try to point something out below.

From the select_safe function, one might actually observe that the value

selected is not None

can be obtained as:

a is not None or (b1 is not None and b2 is not None) or (c is not None and c.d is not None and c.d.e is not None)

which you might want to write like (considering ? as equivalent to is not None for now)

a? or (b1? and b2?) or c? @ d? @ e?

If you get this intuition, you are almost there. But this is not enough to obtain selected with such a one-liner. It requires implementing tracing, which is totally doable, but the instances marked with ? must carry the trace through the logical operators. Once this is done, the trace can be followed back, for example through a get_safe() function, to obtain selected as

selected = get_safe( a? or (b1? and b2?) or c? @ d? @ e? )

There are many ways to write the syntax, of course. The point is that this is a very dense rewriting of the select_safe function that allows maximal versatility within a unified syntax…
There is another way:

selected = ?{ a or b1 and b2 or c at d at e}

And a third and last way (akin to the OP) is to directly provide the safe operators, e.g. or?, and?, at?, xor? (a bit more verbose than the OP), so that you have

selected = a or? (b1 and? b2) or? c at? d at? e

In my opinion, the last three expressions demonstrate the three most intuitive ways of performing “full-powered None-check combinations”.

pf_moore · January 20, 2025, 4:10pm

I’m extremely familiar with Boolean algebra, but thanks for working it through. This is the fundamental reason that we never get consensus, though - some people (yourself included) think that a terse expression using punctuation is readable, whereas others think that it’s not. I’m personally ambivalent - IMO, terseness can be more readable, but excessive use of punctuation-style operators is hard for people to understand^[1]. The only solution that stands any chance of reaching consensus is a compromise. And it’s hard to get people to compromise.

Personally, I find that the need to repeat “is not None” over and over is very distracting, and makes it hard to see the underlying logic. But the problem is that “considering ? as equivalent to is not None”, as you used in some of your examples, only solves the problem when None is the only case you need to consider. And unfortunately, it isn’t - the same patterns come up when looking at “evaluates to False”, or “is not empty”, etc., etc. So that’s the first problem - if we decouple the logic from the check being done, there’s much less reason to assume that testing for None is the special case - and in actual fact, the language already has a complete set of logical operators that work on truth values. So testing for None is clearly not the obvious case, it’s a second possibility, and as soon as you allow for a second possibility, you open up the field for even more - emptiness of collections, values being included in a container, etc.

If I cared enough about logical operations based around checking for None, I could write

def nn(val):
    return val is not None

and use it as

nn(a) or (nn(b1) and nn(b2)) or (nn(c) and nn(c.d) and nn(c.d.e))

Still a little awkward to read, but logically equivalent to your ? while being possible right now. The fact that no-one bothers to do this suggests that testing for None might not actually be as big of an issue as people claim…

If we try to approach the problem from the other angle, we might look at what specific use cases there are. That’s the approach PEP 505 takes, introducing specific operations for cases that the PEP authors consider important. This doesn’t result in a complete set of logical operations, but that’s by design - the operations need to be justified based on usefulness, not on logical completeness. Unfortunately, PEP 505 has so far failed to argue the case for specific operations - the “None-aware indexing” operations are controversial, and there’s no consensus on whether they should cover “key is not present” as well as “key is present, but value is None”. The slimmed down “pick the first non-None value” and “assign if None” operations seem like the least controversial, but (a) no-one has yet made them into a standalone proposal, and (b) there’s a real possibility that on their own they may not add enough value to be worth having.

Feel free to continue arguing for your logical operators if you want, but be aware that you’re not going to convince me. So I probably won’t respond further. And you’re not actually introducing anything particularly new to the discussion, nor are you really addressing the objections that people have to PEP 505.

[1] And I say that as someone who trained as a mathematician.

hlovatt · January 20, 2025, 11:01pm

I think something like what @hprodh proposed is a better solution. Below is my version (my contribution is the definition of the operators and the typing).

from typing import Self, cast  # needed by the annotations and the tests below

class NL[T]:
    """None Logic (NL) is a boolean logic class; `None` is false, anything else true.

    Python treats `None` as false but many Python classes have other false criteria,
    e.g. `int` treats `0` as false also.
    This class is different, it is a wrapper (a.k.a. box) around any type,
    if its value `v` is `None` it is false, else it is true.
    IE going back to the `int` of `0` example, `NL(0)` is true.

    Operators `+` (or like), `*` (and like), and `@` (xor like) are defined that return an `NL[T]`.

      * Operation `l + r` returns `l` if `l.v` is not `None`, else `r`.
      * Operation `l * r` returns `l` if both `l.v and r.v` are equal, else `NL(cast(T, None))`.
      * Operation `l @ r` returns `l` if `l.v` is not `None` and `r.v` is,
        else `r` if `r.v` is not `None` and `l.v` is, else `NL(cast(T, None))`.

    Operators `+`, `*`, and `@` are defined so that
    one argument, either left or right, is of type `NL[T] | T | None`,
    the other argument is `self` and hence is of type `NL[T]`.
    The advantage is mixed expressions like `0 + NL(1)` are allowed and are thus more compact
    (the example returns `NL(0)`).

    Some other languages define an operator like `??` that acts like `NL`'s `+`;
    this solution is superior because it doesn't require special syntax and
    has a full set of operators.
    """

    def __init__(self, v: Self | T | None):
        self.v = v.v if isinstance(v, NL) else v

    def __repr__(self) -> str:
        return f'NL({repr(self.v)})'

    def __str__(self) -> str:
        return str(self.v)

    def __bool__(self) -> bool:
        return self.v is not None

    def __add__(self, r: Self | T | None) -> Self:
        if not isinstance(r, NL):
            r = NL(r)
        return self if self else r

    def __radd__(self, l: Self | T | None) -> Self:
        if not isinstance(l, NL):
            l = NL(l)
        return l if l else self

print('\n`NL` tests')
# Note `cast` necessary for typechecker so that `NL`'s `T` is of type `int`.
none = NL(cast(int, None))
print(f'{str(none)=}')
print(f'{none=}')
print(f'{bool(none)=}')
zero = NL(0)
print(f'{zero=}')
print(f'{bool(zero)=}')
print(f'{none + zero=}')
print(f'{none + 1=}')
print(f'{none + None=}')
print(f'{2 + none + 1=}')
print(f'{None + none + None=}')

Which produces:

str(none)='None'
none=NL(None)
bool(none)=False
zero=NL(0)
bool(zero)=True
none + zero=NL(0)
none + 1=NL(1)
none + None=NL(None)
2 + none + 1=NL(2)
None + none + None=NL(None)

I think something like this is a better solution than ??, ?and, ?or, and ?xor.

To keep post short I didn’t show implementation of * and @ but I did define them and show +, so it’s obvious how they are implemented.

ntessore · January 20, 2025, 11:13pm

A point that I haven’t seen mentioned in this thread^[1] is that terseness is more often an advantage, IMO, in comprehensions. And comprehensions can be a readability win, again IMO, over explicitly building lists. I don’t think there’s a nicer way to write these in long form^[2]:

a = [f(x ?? x0) for x in things]
b = [f(x ?? x0, y ?? y0) for x in first for y in second]

This is also suggestive of another syntax, which I am aware isn’t even on the table:

c = [f(x, y) for x? in first for y? in second]  # ignore me
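For comparison, the first two comprehensions can be spelled in current Python with a conditional expression or a small helper. The names `coalesce`, `f`, and the sample data below are hypothetical, chosen only to make the sketch runnable. Note that the helper evaluates its default eagerly, whereas the `??` operator proposed in PEP 505 would short-circuit.

```python
def coalesce(value, default):
    """Return value unless it is None, in which case return default."""
    return default if value is None else value


def f(*args):
    # Stand-in for whatever function is applied to each element.
    return args


things = [1, None, 0]
first, second = [None, 1], [None]
x0, y0 = -1, -2

# Spelled with a conditional expression: the tested name is repeated.
a_long = [f(x if x is not None else x0) for x in things]

# Spelled with the helper: shorter, but no short-circuiting of the default.
a = [f(coalesce(x, x0)) for x in things]
b = [f(coalesce(x, x0), coalesce(y, y0)) for x in first for y in second]

print(a)
print(b)
```

A falsy but non-None element such as `0` is kept in both spellings, which is the difference from writing `x or x0`.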

I think the problem here is nuance: like every construct, there are good ways to use it and bad ways to use it. Trying to shoot down examples of one with examples of the other doesn’t lead anywhere. One thing that we should be realistic about, though, is that None is a very special case, and it’s fair to treat it as such.

[1] Apologies if I missed it.
[2] I’m including list, map, filter here.

methane · January 21, 2025, 1:50am

Python programmers may also use other programming languages.

Unless there is a particularly significant advantage, I believe that many Python programmers would benefit from adopting the same syntax as JavaScript/TypeScript rather than inventing their own Python way of doing things.

So, I’m +1 on PEP 505.

P.S. PEP 505 should reflect the latest JavaScript status.

JadenCorr · January 21, 2025, 3:07am

I would spell it like “hard to comprehand”. It’s similar to regular expressions.

I understand each character there… But I’m not sure what exactly is going on without taking a deep breath and turning it over in my head.

hlovatt · January 21, 2025, 3:40am

Using the class NL I posted above your example would become:

a = [f(x + x0) for x in things]

Which, if anything, seems better. x0 is of type NL. x may or may not be an NL.

What is the advantage of the new syntax over the NL class?

hlovatt · January 21, 2025, 3:48am

This is also straightforward:

c = [f(x, y) for x in first for y in second if NL(x) and NL(y)]

Assuming x and y aren’t already NLs; if they are, there is no need to construct NLs.

smurfix · January 21, 2025, 6:08am

Ouch. I’d like addition to please work like addition. NL(1)+2 == 1 seems very counter-intuitive.

As NL has a working __bool__ method: why not simply keep to `and` and `or`, and delegate all the usual dunders to the underlying value?
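A minimal sketch of that idea, assuming a hypothetical wrapper called `Box` with a single `value` attribute (the dunder delegation is left out to keep it short): once `__bool__` tests only for None, the built-in `or` and `and` already select operands the way a None-coalescing operator would, with short-circuiting included.

```python
class Box:
    """Hypothetical wrapper whose truthiness depends only on 'is not None'."""

    def __init__(self, value):
        self.value = value

    def __bool__(self) -> bool:
        return self.value is not None

    def __repr__(self) -> str:
        return f"Box({self.value!r})"


# `or` returns the first operand that is true, i.e. the first non-None box.
first = Box(None) or Box(0)

# `and` returns the last operand when all of them are true.
both = Box(0) and Box("")

# Without the wrapper, a falsy value such as 0 is skipped by plain `or`.
plain = 0 or 5

print(first, both, plain)
```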

Wombat · January 21, 2025, 6:42am

That’s a good example and very common. But personally, I prefer the current way to do it.

if x is None:
    x = []

vs

x ??= []

The proposed way is much less readable and it doesn’t really save much code.

It seems to me that the soul of Python and its main attraction is avoiding this kind of thing. My vote is to not do this.

smurfix · January 21, 2025, 7:12am

That depends. Things like multi-level indices or attributes that might either be missing or return None are really verbose right now. Also, they suffer from unnecessary new objects (s=foo.get("bar",{}).get("baz", None); if s is None: s = default_object_maker()) and cannot be used in expressions – see the walrus operator for why you’d want to be able to have that.

Thus I’d really like a ?? shortcut operator. Semantics: it evaluates its left-hand side and calls the result’s __none__ dunder. If that returns anything but None, that’s the result; otherwise evaluate+return the right-hand side. Left associative of course. The default implementation object.__none__ would simply return self.

That way we could easily write a rather simple NL class that mimics all those ?. and ?[] sigils which nobody wants, can act as a neutral element in addition and multiplication, and which auto-resolves as soon as you use ??.

Bonus: This would solve the thorny issues around ?. and friends (do they raise AttributeError when their left-hand side is not None, or do they return None too?): just use a variant of your NL class that does what you need it to do. No need for special syntax.
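The `??` semantics described above can be emulated today with a function, which may help when judging the idea. This is only a sketch: `__none__` is the hypothetical dunder from this post, and `coalesce` and `Absent` are made-up names. The right-hand side is passed as a callable so that it is evaluated only when needed.

```python
from typing import Any, Callable


def coalesce(lhs: Any, rhs: Callable[[], Any]) -> Any:
    """Emulate `lhs ?? rhs()` using the hypothetical __none__ hook."""
    # Look the hook up on the type, as dunder lookups usually work;
    # objects without it behave as if __none__ returned self.
    hook = getattr(type(lhs), "__none__", None)
    result = lhs if hook is None else hook(lhs)
    return result if result is not None else rhs()


class Absent:
    """Hypothetical placeholder that absorbs attribute and item access."""

    def __none__(self):
        # Resolve to None as soon as the value is coalesced.
        return None

    def __getattr__(self, name):
        return self

    def __getitem__(self, key):
        return self


print(coalesce(None, lambda: "default"))
print(coalesce(0, lambda: "default"))
print(coalesce(Absent().foo["bar"], lambda: "default"))
```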