Linked Boolean Logics (rethinking PEP 505)

I am opening this topic to discuss the generalizability (use-case applicability), practicality (efficiency in code-length reduction), consistency (caveats), and clarity (readability) of a novel conditional-combination method, namely the “linked boolean logics”.

Story: PEP 505 proposes new operators (based on JavaScript syntax) to ease the ‘coalescing’ of boolean assessments of None objects, implementing a “safe or”, “safe getitem”, and “safe getattr” with the following operators:

a ?? b  # "Safe or" : a if a is not None else b
a ?[ b ]  # "Safe getitem" : a[b] if a is not None else None
a ?. b  # "Safe getattr" : a.b if a is not None else None
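For reference, PEP 505's semantics can be sketched as plain functions (function names are hypothetical; note that the PEP propagates None from the left operand only, without further checking the result):

```python
def safe_or(a, b):
    # a ?? b : fall back to b only when a is None (falsy values pass through)
    return a if a is not None else b

def safe_getitem(a, key):
    # a ?[ key ] : None-propagating subscription
    return a[key] if a is not None else None

def safe_getattr(a, name):
    # a ?. name : None-propagating attribute access
    return getattr(a, name) if a is not None else None
```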

This syntax is controversial because the many special characters (?) decrease the readability of the code. Additionally, no consensus has emerged on whether the use cases covered by these operators are worth their introduction into the language.


The linked boolean logics arises from generalizing the aforementioned operations to equivalent operations on an extended form of boolean that keeps track of a referenced object (hence “linked” boolean).
The interest does not reside in instantiating the linked booleans but in returning the result of their combination through the linked boolean logic operators.
Defining an enclosure (say op{...}) within which the linked boolean operations would be performed with the bitwise operators |, &, ^, and with getitem and getattr as a[b], a.b, would provide the following possible syntax:

op{ a | b }  # Safe or
op{ a[b] }   # Safe getitem
op{ a.b }    # Safe getattr

(note that for this to work right now, a and b would have to be linked boolean instances, overriding or, getitem and getattr with the linked boolean equivalents)
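As a minimal sketch of what such a linked boolean instance could look like (class name, attributes and the default None check are all hypothetical, loosely following the description above):

```python
class LinkedBool:
    """A truth value that keeps a reference to the object it was derived from."""

    def __init__(self, obj, check=lambda x: x is not None):
        self.obj = obj
        self.check = check

    def __bool__(self):
        # the 'linked boolean function' decides the truth value
        return self.check(self.obj)

    def __or__(self, other):
        # "Safe or": keep self if it assesses True, otherwise fall back to other
        return self if bool(self) else other
```

For example, `(LinkedBool(None) | LinkedBool(5)).obj` evaluates to 5, while `(LinkedBool(1) | LinkedBool(5)).obj` evaluates to 1.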

Other functionalities would also be provided:

op{ a & b }  # "Safe and" : (a, b) if (a is not None and b is not None) else None
op{ a ^ b }  # "Safe xor" : a if (a is not None and b is None) else b if (a is None and b is not None) else None
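In plain Python, these two combinators might be sketched as follows (hypothetical helper names, mirroring the comments above):

```python
def safe_and(a, b):
    # "Safe and": both operands present, or nothing
    return (a, b) if (a is not None and b is not None) else None

def safe_xor(a, b):
    # "Safe xor": exactly one operand present
    if a is not None and b is None:
        return a
    if a is None and b is not None:
        return b
    return None
```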

One convenient derivative (e.g. in a match statement) could be to return the index of the only element assessing True in a list (a “mutually exclusive (mutex)” operator):

op.mutex_idx{ a, b, c }  # mutex index
#0 if (a is not None and b is None and c is None) else 1 if (a is None and b is not None and c is None) else 2 if (a is None and b is None and c is not None) else None
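A plain-function equivalent of this mutex index (hypothetical sketch, generalized to any number of operands) could look like:

```python
def mutex_idx(*items):
    # index of the single non-None item; None if zero or several are present
    hits = [i for i, x in enumerate(items) if x is not None]
    return hits[0] if len(hits) == 1 else None
```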


In the above examples we assumed the default ‘linked boolean function’ is a None check (lambda x: x is not None). Yet the linked boolean logic can be extended to a wider domain of use cases by considering other ‘linked boolean functions’.
Consider, for example, an alternative function which returns False for None or an empty list:

f_el = lambda x: x not in [None, []]

We might override the default ‘linked boolean function’ for the whole combination of operations:

op(f=f_el){ a | b & c }

We might also want to apply a function to an individual linked boolean (thus allowing different functions on different elements), for example as:

op{ a | f_el >> b & c }

One last possibly useful feature would be to filter out every element assessing False from an input list (a syntax would have to be defined for this).
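Whatever syntax is chosen, the underlying operation is just a comprehension driven by the linked boolean function (sketch, reusing the `f_el` example defined above):

```python
f_el = lambda x: x not in [None, []]

def lb_filter(items, f=f_el):
    # drop every element that the linked boolean function assesses as False
    return [x for x in items if f(x)]
```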


Additional notes :

A reminder on operator precedence: arithmetic operators bind tightest, then the bitwise ones (& above ^ above |), then comparisons, and finally not, and, or. This, as well as the short-circuiting behavior of and (which cannot be overloaded), currently makes a consistent implementation using and difficult.
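The short-circuiting point is worth demonstrating: `and`/`or` have no overloading hook; they only consult `__bool__` and return one of the operands unchanged, so a wrapper class never gets a chance to combine both sides:

```python
class LB:
    def __init__(self, value):
        self.value = value

    def __bool__(self):
        # `and`/`or` call this; there is no dunder for the keywords themselves
        # (__and__/__or__ are only used for the bitwise & and | operators)
        return self.value is not None

lhs = LB(None)
result = lhs and "rhs"   # short-circuits: returns lhs itself, untouched
assert result is lhs
assert (LB(1) and "rhs") == "rhs"
assert (LB(None) or "fallback") == "fallback"
```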

A prototype implementing and testing the linked boolean operators is available here:
“LinkedBool_1 - Pastebin.com”. (EDIT: the prototype misuses “xor” for ‘mutex’; this has not been corrected.)

Hmm, booleans are True/False, and there’s an accommodation for ints, but you seem to want, in part, to extend boolean logic to handle None. I wonder if this could be thought of, in general, as handling Not-a-Number (NaN) in numeric systems. I had the best results by carefully looking at my input data, working out why the NaN likely occurred, and making appropriate changes so the cleaned data removed all NaNs; maybe it was one false reading that could be dropped, or its value could be inferred from other data points.
My point is that if the data is cleaned first, the NaN or None disappears, allowing normal (in this case boolean) processing. That cleaning might also throw exceptions if the input data is too bad.

If you truly want a trinary-based logic system then that is niche, and you probably want to code that per project.

I think this is unnecessary. The only benefit of such a thing would be short-circuiting, and this definition does not require it.
Also, this is not xor; I call this operation is_one_true (next to any / all). The actual expansion of a ^ b ^ c takes a different form.

And for the definition you have here, one could just write a function:

import operator

def first_true_idx(seq, mx=1, pred=lambda x: x is not None, default=None):
    bits = [pred(x) for x in seq]  # materialize: the values are used twice below
    assert sum(bits) <= mx
    it = filter(operator.itemgetter(1), enumerate(bits))
    return next(it, (default, None))[0]

I think this is a very niche case; although I encounter a lot of similar cases, in my experience all of them are slightly different, and generalisation to such an extent is unlikely to be useful.

True, my bad, I did correct it. For my part, I call it a mutex.

This is quite analogous, yes, but there are also dedicated functions, in numpy for example, that usually do what you need with NaNs.

This is quite true, but the use cases I encounter are more about class methods (initialisers, setters, getters) that take optional arguments. A single `if option1 is not None` check is not a problem, but the more numerous these options become, the more care each of them requires (there may also be optional parameters for options, and mutually exclusive options). At some point the code of the class looks more like a None-management class than anything else.

Maybe some universal syntactic tool that solves this niche case could also solve other niche cases, and if the solution generalizes widely enough it might become worth it, and convenient.

With respect, this mini-language is the kind of difficult-to-understand use of punctuation that people are generally glad that Python has avoided. I don’t think it’s likely to gain traction.

There’s also no need for it. You could write this as a wrapping object and overload all of the dunder methods you like:

class Op:
    def __init__(self, value):
        self.value = value

    def __and__(self, other):
        if self.value is None:
            return self
        return self.__class__(self.value and other)

    ...

result = (Op(a) | f_el >> b & c).value

Regarding your prototype, I think https://pypi.org/project/pymaybe/ has done pretty much that via method chaining, which I like a bit better since one doesn’t need to wrap every object.

(nn(a) | nn(b)).get()
vs
(maybe(a) | b).get()   # Not sure if this is implemented in it, but if it was it would look like this

What is not satisfactory about this one and your prototype (from the perspective of PEP 505) is the need to eventually call a retrieval method explicitly.

I can think of one way that would avoid this:

class mby:
    def __init__(self, obj, null=None):
        self.obj = obj
        self.null = null

    def _preprocess(self, other):
        if isinstance(other, type(self)):
            assert self.null is other.null
            other = other.obj
        return self.obj, other

    def __or__(self, other):
        obj, other = self._preprocess(other)
        return obj if obj is not self.null else other

    def __ror__(self, other):
        obj, other = self._preprocess(other)
        return other if other is not self.null else obj

    def __matmul__(self, other):
        obj, other = self._preprocess(other)
        return getattr(obj, other) if obj is not self.null else obj

    def __rmatmul__(self, other):
        obj, other = self._preprocess(other)
        return getattr(other, obj) if other is not self.null else other

    def __lshift__(self, other):
        obj, other = self._preprocess(other)
        return obj[other] if obj is not self.null else obj

    def __rlshift__(self, other):
        obj, other = self._preprocess(other)
        return other[obj] if other is not self.null else other


import types
print(1 | mby('a'))         # 1
print(None | mby(2))        # 2
print()

obj = types.SimpleNamespace(a=1)
print(obj@mby('a'))         # 1
print(None@mby('a'))        # None
print()

obj = {'a': 1}
print(obj<<mby('a'))        # 1
print(None<<mby('a'))       # None
print()

# A bit more complex example
obj = {'a': types.SimpleNamespace(a=1, b=None)}
print((obj<<mby('a'))@mby('a') | mby(2))     # 1
print((obj<<mby('a'))@mby('b') | mby(2))     # 2

But it can already be seen where this road leads back to → PEP505.
It is pretty much the same, except:

  1. inconvenient syntax
  2. but allows for other “None” values

I think 3 operations (coalesce, attribute access and item access) are sufficient. Other, more niche operations can be handled ad hoc or by a specialised function (similar to first_true_idx in my previous post).

And I like PEP505; I think the only bit missing for me is flexibility for alternative “None” values.

If someone came up with an ingenious idea for allowing this flexibility without damaging its neat syntax, it might be a good step forward.

True. We need to get into the “linked bool space”, then do the combination (which can optionally be propagative), then get the value back. Semantically, everything is equivalent, including PEP505.

Note: I also already thought about a possible syntax using <<, >> as opening and closing delimiters and some opener/closer element nn:

nn<< a | b >>nn

Funny, but too hacky to put into working code; I didn’t want to prototype it.

I like the concept of enclosed DSL.

However, for such a thing to be justified the benefit would have to be immense, as implementing it would be of non-trivial complexity.

Maybe it is worth exploring what other applications this might be fitted to serve?

As it currently stands, the way I see it, ?? would be the most useful, and ?. and ?[ combined would hardly amount to the usefulness of ??. And for these 3, a non-DSL implementation would be both simpler and more convenient.

In my scripts, I generally have standalone “pure” functions that do the heavy lifting, and a public API in classes that uses them with resettable parameters and options. → Most of the None checks are done within init or set methods and call or get methods.

A mini-language might fit within methods dedicated to None management, assuming it is made clearer than the current “check everything is not None” way.

I know other use cases exist for JSON, mainly extraction and possibly injection, idk…

PEP505 looks a bit to me like the same mini-language but in separate bricks. And while reading threads about it, seeing these ? everywhere feels easy to write but painful to read, especially when distributed over entire sections of code. It makes the reading heterogeneous.

TLDR: Just from the separation-of-concerns principle, one might say there should be dedicated methods for optional-argument management or JSON file manipulation, etc., and that is where a None-op mini-language would fit.

Conceptually, yes, but from an implementation perspective operators are standard, while an “enclosed DSL” would be a completely new concept. I would guess the implementation would be several times heftier, as relatively few parts could be reused at each layer.

I have been playing around a bit and for the time being I am at:

+(maybe(None) | 2)    # 2
+(maybe(1) | 2)       # 1

# Attr / item / call
+maybe(None).attr[item](*args)    # None

# This is the "one_true". Or "one of"
mby = maybe(None) / 2 / None
+mby    # 2
-mby    # 1 (index)

# Also indexing can work with "OR"
-(maybe(None) | 2)    # 1
-(maybe(1) | 2)       # 0

# Other NULLS
+(maybe(0, null=0) | None)  # None

so __pos__ does get() and __neg__ retrieves the index.

The only drawback is absence of short-circuiting.
But instead of inventing new syntax, I think it might be better to implement overloading for boolean operators, so that users can experiment with / make use of these for different short-circuited constructs without needing to propose syntax additions.

See: PEP 335 – Overloadable Boolean Operators | peps.python.org
It has many more applications as well.

I looked at the pymaybe code; it works similarly to the linked bool logics, with many __magic__ methods overridden (just not exactly the same way).

I also thought we could do something (very bad in practice) using a with statement: wrap every local reference in a linked bool container on entry, perform the operations, and restore the initial locals on exit.
→ This would emulate “performing the ops in a deferred namespace”, which might be a good idea but is impossible right now (unless we pass the instructions as a string, or the operators and operands as arguments, or a bit of both, as numpy.einsum does).
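As a toy illustration of the string-passing alternative (entirely hypothetical, loosely modeled on the numpy.einsum calling convention), a coalescing `op` could take the expression as text and the operands as keywords:

```python
def op(expr, **ns):
    # Toy sketch: only handles 'x | y | z' with None-coalescing semantics,
    # stopping at the first non-None operand. Note the operands are already
    # evaluated when passed in, so this does not truly short-circuit.
    result = None
    for name in (tok.strip() for tok in expr.split('|')):
        result = ns[name]
        if result is not None:
            break
    return result
```

For example, `op('a | b', a=None, b=2)` returns 2, while `op('a | b', a=0, b=2)` returns 0 (falsy but not None).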


Something like?

a = maybe('a[0].attr(arg) or b[0]')

Could get the AST, transform it based on rules, and evaluate it with locals(). Would do short-circuiting and would be very flexible.

Would be nice to have a concept of code-strings, where one could just prepend c, which does nothing except signal IDEs to do syntax highlighting. Such DSLs would be much more attractive.



A context manager would stay within Python without an inner DSL, which I like: no issues with syntax highlighting, and it feels more natural from the user's perspective.

But I am not sure how to do short-circuiting for such a thing. It would be interesting to prototype it to see what is possible.


It might be possible after all.

with maybe(locals()):
    # __enter__: wrap all locals in `LinkedBool`
    # LinkedBool can do short-circuiting as it does not evaluate unless necessary
    # __exit__: `get()` all `LinkedBool` in locals()

My bad, it will not work. locals() cannot be written to inside a function scope; that only works at the global scope.
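A quick demonstration of this CPython behavior: writes through locals() inside a function body do not affect the actual local variables.

```python
def demo():
    x = 1
    locals()['x'] = 99   # ignored: function locals are not backed by this dict
    return x

assert demo() == 1   # the assignment through locals() was lost
```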

I hate to be a party pooper, but why are we having this conversation? The original proposal died because:

  • Many people were squeamish about creating a punctuation ridden syntax
  • Nobody could agree on whether some_dict?["missing-key"] should give None or a KeyError (or the attribute-getting equivalents)
  • There were concerns that APIs would evolve to make the need for constant None-awareness more prominent

To propose another punctuation-heavy syntax is to miss all the reasons why PEP 505 failed.

Maybe we’re doing the following:

  • Being open
  • Being respectful of differing viewpoints and experiences
  • Showing empathy towards other community members
  • Being considerate
  • Being respectful
  • Using welcoming and inclusive language

This would be very practical for the usage of numexpr.evaluate. By the way, the variables named in f-strings have their definitions checked by the linter, but it is not perfectly convenient.

Yes, it might work with globals instead, yet messing with locals or globals is not really a way to go. This approach can also be seen as an alternative way to create an enclosure where some global operators are overridden (just as numexpr does, for example).

So my take is that ?? and ??= are the most useful components, providing the most convenience given they are useful everywhere and not just in a few specific cases.

Furthermore:

And I think the below mostly refers to ?[ and ?.



So my current position is:

  1. Reconsider ?? and ??= for None only. It is straightforward, unambiguous and limits this to one operator.
  2. Continue brainstorming on possible approaches to handle more complex cases. Whatever the outcome, be it an eventual stdlib addition, a 3rd-party library, or just the discovery of an optimal methodology for implementing this, the fact that it keeps coming up suggests that the exploration might not have reached its natural resolution yet.

The biggest issue with this is the following:

def foo():
    a = 1
    def bar():
        # `a` is not referenced directly, so bar has no closure over it;
        # eval cannot see it and this raises NameError
        return eval('a')
    return bar
print(foo()())

It is possible to construct the full namespace stack manually (although I am not sure it can be matched 100%, it comes pretty close), but it becomes quite a messy and expensive procedure.



I think “t-strings” might be a good bet here.

It would make the implementation straightforward, but at the cost that the string would need to look like:

t'{a}[{b}].c'

c-strings appear to me as a convenient way to make it possible to implement self-made ‘parsed’ mini-languages.

It would solve this current topic as well as others: numexpr.evaluate, and I saw a thread about “chaining functions” some time ago whose prototyping snippets fit this paradigm exactly.

So it has great generalizability. Yet the following concern would be raised: “would it encourage bad practices like cryptic coding, or code that does not do what it appears to do?”.