Is there a road towards making bool a fully-fledged Boolean type?

NeilGirdhar · August 30, 2024, 7:50am

This is something I’ve wanted for a long time. Seeing all the support for Paul’s suggestion here, I was wondering whether there is any possibility of making progress towards making bool a non-integer in the long run.

In particular, are any of these feasible in the short or long run?

making bool incompatible with other arithmetic operations like +,-,*,/,%,|,&,^
making int not imply int | bool in type annotations
making bool not be a subclass of either int or numbers.Integral
any others?

The argument for such a change would be that these arithmetic operations:

do not coincide with how numerical libraries treat Boolean values, (e.g., True + True == 2, but the Array API will treat this as or),
exhibit some footguns (e.g., True & 4 == 0 whereas True and 4 == 4),
are rarely used,
and their old behaviour can easily be recovered with an explicit cast to int.
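A minimal sketch of the first two points and of the explicit-cast escape hatch, runnable in a stock CPython interpreter. Each expression is evaluated on its own here so the intermediate values are visible; the variable names are only illustrative.

```python
# Arithmetic on bools treats True as 1 and False as 0.
total = True + True            # integer addition, not a logical "or"

# Bitwise "&" works on the integer values: 1 & 4 share no bits.
bitwise = True & 4

# Logical "and" returns its second operand when the first is truthy.
logical = True and 4

# The integer behaviour stays available through an explicit cast.
explicit = int(True) + int(True)

print(total, bitwise, logical, explicit)
```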

Also, the smaller the bool interface, the more effective static type checkers are at discovering logical errors (for example, a Boolean flag that leaked into an arithmetic operation by accident).

I realize this is a far lower priority than the ~bool deprecation, and more disruptive. I’m more curious about how this balances out.

petercordia · August 30, 2024, 8:27am

I have used +, -, * quite regularly with booleans. I would be sad to see such behaviour lost.
I’m not aware of tidier alternatives for, for example,

x += 4 * condition

and

sum(test(i) for i in range(N))

(edit: that first one does have if condition: x += 4 as an alternative, but I don’t think the point is entirely invalid. 4 * condition is shorthand for (4 if condition else 0) and more readable imo)

storchaka · August 30, 2024, 9:33am

It would be too disruptive. Too much code depends on bool being compatible with int. Some examples from the stdlib code:

d = "rf"[self.isearch_direction == ISEARCH_DIRECTION_FORWARDS]

Could be written as

d = "f" if self.isearch_direction == ISEARCH_DIRECTION_FORWARDS else "r"

return sum(value == entry for entry in self)

Could be written as

return sum(1 for entry in self if value == entry)

ndays = mdays[month] + (month == FEBRUARY and isleap(year))

Could be written as

ndays = mdays[month]
if month == FEBRUARY and isleap(year):
    ndays += 1

Such code is not buggy or error-prone, but you would have to rewrite it if bool became less like int. The only painless way to do this is to borrow a time machine, travel back 34 years, and convince Guido that it needs a distinct boolean type (that was before long integers, the ternary operator, iterators, generators and bytecode).

MegaIng · August 30, 2024, 9:41am

All of these would still be functional by having an extra call to int, and the first one should actually be functional anyway since the new bool can just implement __index__.
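As a rough sketch of that point: `StrictBool` below is a hypothetical stand-in for a bool that does not subclass int (it is not a real or proposed API). It only assumes that such a type would implement `__index__` and `__int__`.

```python
class StrictBool:
    """Hypothetical stand-in for a bool that does not subclass int."""

    def __init__(self, value):
        self._value = True if value else False

    def __bool__(self):
        return self._value

    def __index__(self):
        # Lets the object be used wherever an integer index is required.
        return 1 if self._value else 0

    def __int__(self):
        # Supports the explicit int(...) cast.
        return self.__index__()


forward = StrictBool(True)

# Indexing goes through __index__, so no arithmetic support is needed.
direction = "rf"[forward]

# Counting needs the explicit cast, because 0 + StrictBool is undefined.
flags = [StrictBool(True), StrictBool(False), StrictBool(True)]
count = sum(int(flag) for flag in flags)

try:
    sum(flags)
except TypeError:
    plain_sum_works = False
else:
    plain_sum_works = True

print(direction, count, plain_sum_works)
```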

pf_moore · August 30, 2024, 9:52am

To be clear, I agree with @storchaka here, in spite of my post being quoted as the motivation for this idea. The point I was making was that I didn’t see the benefit of just changing one thing about bool - not that I supported an overhaul of behaviour that’s been around and served us just fine for over 30 years…

If anyone is genuinely serious about this proposal, they’ll need to justify the breakage that would be caused by changing something this basic that’s been in the language for so long. And “you just have to add in an extra call to int” won’t cut it - who’s volunteering to go through the many millions of lines of Python code in existence, a significant portion of which isn’t public, and much of which supports critical business logic, and make all those changes?

oscarbenjamin · August 30, 2024, 10:08am

The simple answer to any question of this form is:

x += 4 * int(condition)

Personally I would write:

if condition:
   x += 4

I prefer this because it makes it clear that x is not being modified when condition is False and I don’t want to actually execute the * and the + if that is the case.

In the other thread I suggested that a non-int bool type could still support arithmetic operations, although others disliked that idea. The reason I suggested it is that while changing isinstance(True, int) is a compatibility break, the fact is that 99.9% (made up number) of the breakage would come from breaking arithmetic. Also, while some people might not like arithmetic with booleans, it is at least unambiguous and is done through the explicit use of arithmetic operators. It can easily just fall into the category of things that are possible but that you prefer not to use in your own code (as it already does for me). I assume that the original motivation for making bool a subclass of int came from wanting to avoid breaking arithmetic with conditions like this.

storchaka · August 30, 2024, 10:23am

We change potentially dangerous things that are not widely used and keep useful and harmless things. I think this is the right approach.

storchaka · August 30, 2024, 10:28am

Agreed, but there is still the 0.1% of code (mostly various kinds of serializers and dispatchers) that depends on bool being a subclass of int. This may not be a large issue. But why? Why spend so much effort imitating the current behavior if the current implementation works pretty well? There is no large issue to be solved.
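To illustrate the kind of dispatcher meant here, a small sketch using functools.singledispatch. The `encode` function and its output format are made up for the example; the dispatch behaviour itself follows from bool being a subclass of int.

```python
from functools import singledispatch


@singledispatch
def encode(value):
    raise TypeError(f"unsupported type: {type(value).__name__}")


@encode.register
def _(value: int):
    return f"int:{value}"


# With only an int handler registered, a bool is dispatched to it,
# because bool's method resolution order is bool -> int -> object.
as_int_handler = encode(True)


@encode.register
def _(value: bool):
    return f"bool:{str(value).lower()}"


# Once a bool handler exists, the more specific type wins.
as_bool_handler = encode(True)

print(as_int_handler, as_bool_handler)
```

Code that relies on the first behaviour (no dedicated bool handler) is the part that would break if bool stopped being a subclass of int.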

NeilGirdhar · August 30, 2024, 11:48am

More inspiration than motivation! I just thought the other thread and your comment had a lot of worthwhile background reading for this thread, which people responding might want to read. Hope you didn’t feel I was pinning this idea on you

Regardless, I appreciate all of the answers I’ve received so far. I always learn a lot.

In general, I’m very idealistic, and I live (and propose ideas) as if I had that time machine that Serhiy was talking about

ilotoki0804 · August 30, 2024, 1:33pm

Multiplying a collection by a boolean is quite useful.

For example, it can handle whether a specific string should be included or not.
```
number = 2
# There are 2 apples.
print(f"There are {number} apple{'s' * (number > 1)}.")
```
It is also useful when deciding whether to include a specific element in a collection.
```
extensive = True
cases = (1, 2, 3, 4) + (5, 6) * extensive
```
XOR has no corresponding boolean operator, thus the bitwise ^ is the only method available[1].
Other bitwise operations are used in ‘eager operation’.

For example, bool1 or bool2 does not evaluate bool2 if bool1 is True, whereas bool1 | bool2 evaluates bool2 regardless of the value of bool1.
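A small sketch of that difference. The `check` helper is hypothetical and only records which operands were actually evaluated.

```python
calls = []


def check(name, result):
    """Record that this check ran, then return its result."""
    calls.append(name)
    return result


# "or" short-circuits: the second check never runs.
calls.clear()
lazy = check("first", True) or check("second", True)
lazy_calls = list(calls)

# "|" evaluates both operands before combining them.
calls.clear()
eager = check("first", True) | check("second", True)
eager_calls = list(calls)

print(lazy_calls, eager_calls)
```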

[1] You could use bool1 + bool2 == 1, but… that’s another operation, isn’t it?

NeilGirdhar · August 30, 2024, 1:42pm

Just FYI, many collections don’t support multiplication by integers or Booleans. set is a collection. Some sequences (tuples, lists, and strings, e.g.) do support it. I think it doesn’t hurt to cast to integer or use a ternary in this case for the sake of readers, but I understand the appeal of multiplying by a Boolean.

I would use x != y personally since it conceptually works with a simple mental model that relies on Boolean values only. You don’t have to imagine that the Boolean values are standing in for integers like you do with ^ or x+y==1.
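As a quick check, the three spellings of exclusive-or mentioned in this thread agree for every pair of genuine bool operands. This is only a sketch over the four possible inputs.

```python
from itertools import product

rows = []
for x, y in product([False, True], repeat=2):
    # Each entry holds the result of one XOR spelling for this pair.
    rows.append((x, y, x != y, x ^ y, x + y == 1))

for row in rows:
    print(row)

all_agree = all(a == b == c for _, _, a, b, c in rows)
```

The equivalence relies on both operands really being bools: for arbitrary truthy values such as 1 and 2, x != y is True while bool(x) ^ bool(y) is False.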

I think this would be extremely confusing to a reader of the code without an accompanying comment. That would open up the potential for subtle bugs if someone were ever to refactor the code to use Boolean operators or some other approach. If you really need both branches to be evaluated, then evaluate them on separate lines with a comment saying that the side effects are necessary. Then use the appropriate Boolean operator.

chepner · August 30, 2024, 2:55pm

That doesn’t require bool be a subclass of int, just mappable to int. You could still write

print(f"There are {number} apple{'s' * int(number > 1)}.")

with the same semantics, but I would argue that even now

print(f"There are {number} apple{'s' if number > 1 else ''}.")

is clearer.

jamestwebber · August 30, 2024, 2:59pm

I would argue that all of these should use number != 1 to properly handle the zero-apple scenario.

Oh, and also that use of “are”…

stoneleaf · August 30, 2024, 5:20pm

It seems like the easy solution to the mathematical issues is to auto-promote bools to ints when those operations are attempted.

ChrisBarker-NOAA · August 30, 2024, 5:45pm

Isn’t that what already happens?

In [1]: x = True

In [2]: x += 5

In [3]: x
Out[3]: 6

In [4]: type(x)
Out[4]: int

Maybe I don’t know what auto-promote means?

Is it:

In [15]: b = True

In [16]: isinstance(b, int)
Out[16]: False

but you could still do math with them?

Which I suppose would help the static typing folks, but I’m a typing-skeptic, so I don’t have an opinion about that.

stoneleaf · August 30, 2024, 5:48pm

It is. We could maintain that behavior even if bool was not a subclass of int.

oscarbenjamin · August 30, 2024, 6:29pm

I would restrict it to +, - and * or, in dunder terms, __pos__, __neg__, __add__, __radd__, __mul__ and __rmul__. These are the only operations where it is useful to use a bool as an int, and each could be something like:

def __add__(self, other):
    return int(self) + other

That would then work for multiplying int, list etc.
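A fuller sketch of that idea, under the assumption that only the six dunder methods listed above are provided. `ArithBool` is a hypothetical name used for illustration; it is not a proposed implementation.

```python
class ArithBool:
    """Hypothetical non-int boolean that promotes to int for + and *."""

    def __init__(self, value):
        self._value = bool(value)

    def __bool__(self):
        return self._value

    def __int__(self):
        return 1 if self._value else 0

    def __pos__(self):
        return int(self)

    def __neg__(self):
        return -int(self)

    def __add__(self, other):
        return int(self) + other

    __radd__ = __add__   # integer addition is commutative

    def __mul__(self, other):
        return int(self) * other

    __rmul__ = __mul__   # covers 4 * flag and (5, 6) * flag


x = 10
x += 4 * ArithBool(True)
cases = (1, 2, 3, 4) + (5, 6) * ArithBool(False)
evens = sum(ArithBool(i % 2 == 0) for i in range(5))

try:
    ArithBool(True) / 2      # division is deliberately not defined
except TypeError:
    division_allowed = False
else:
    division_allowed = True

print(x, cases, evens, division_allowed)
```

The earlier examples from this thread keep working, while isinstance(ArithBool(True), int) is False and operators outside the chosen set raise TypeError.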

Other arithmetic operations like **, /, % could be disallowed although possibly << is reasonable. You could also disallow mixing bool and int in bitwise binary operators like 2 & True while still allowing &, |, ^ when both operands are bools (and of course having ~ work like not).

Dutcho · August 30, 2024, 8:34pm

If we go this way (unlikely), could we also consider (or at least not rule out) extension to 3-state logic as a Boolean subclass?

barry · August 30, 2024, 9:34pm

Can we start with reverting the decision on True and False being introduced in Python 2.2.1? /me ducks!

ncoghlan · August 31, 2024, 3:40am

Less disruptively, static type checkers could behave as if they were independent types regardless of how they’re actually implemented.

So CPython would keep assert issubclass(bool, int) as an implementation detail, but formally bool would be its own type that was only situationally usable as an integer without an explicit cast.