This is something I’ve wanted for a long time. Seeing all the support for Paul’s suggestion here, I was wondering whether there is any possibility of making progress towards making bool a non-integer in the long run.
In particular, are any of these feasible in the short or long run?
making bool incompatible with other arithmetic operations like +,-,*,/,%,|,&,^
making int not imply int | bool in type annotations
making bool not be a subclass of either int or numbers.Integral
any others?
The argument for such a change would be that these arithmetic operations:
do not coincide with how numerical libraries treat Boolean values, (e.g., True + True == 2, but the Array API will treat this as or),
exhibit some footguns (e.g., True & 4 == 0 whereas True and 4 == 4),
are rarely used,
and their old behaviour can easily be recovered with an explicit cast to int.
Also, the smaller the bool interface, the more effective static type checkers are at discovering logical errors (for example, a Boolean flag that leaked into an arithmetic operation by accident).
I realize this is a far lower priority than the ~bool deprecation, and more disruptive. I’m more curious about how this balances out.
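For anyone who wants to check the examples in the list above, here is a minimal sketch of how current CPython evaluates them. The variable names are only for illustration.

```python
# Arithmetic treats True as the integer 1, so the result is an int.
total = True + True

# Bitwise & works on the integer values 1 and 4, which share no set bits.
bitwise = True & 4

# Logical `and` returns its second operand when the first is truthy.
logical = True and 4

# An explicit cast to int gives the same arithmetic result.
explicit = int(True) + int(True)

print(total, bitwise, logical, explicit)
```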
I have used +, -, * quite regularly with booleans. I would be sad to see such behaviour lost.
I’m not aware of tidier alternatives to, for example,
x += 4 * condition
and
sum(test(i) for i in range(N))
(edit: that first one does have if condition: x += 4 as an alternative, but I don’t think the point is entirely invalid. 4 * condition is shorthand for (4 if condition else 0) and more readable imo)
It would be too disruptive. Too much code depends on bool being compatible with int. Some examples from the stdlib code:
d = "rf"[self.isearch_direction == ISEARCH_DIRECTION_FORWARDS]
Could be written as
d = "f" if self.isearch_direction == ISEARCH_DIRECTION_FORWARDS else "r"
return sum(value == entry for entry in self)
Could be written as
return sum(1 for entry in self if value == entry)
ndays = mdays[month] + (month == FEBRUARY and isleap(year))
Could be written as
ndays = mdays[month]
if month == FEBRUARY and isleap(year):
    ndays += 1
Such code is not buggy or error-prone, but you would have to rewrite it if bool became less like int. The only painless way to do this is to borrow a time machine, travel back 34 years, and convince Guido that it needs a distinct boolean type (this was before long integers, the ternary operator, iterators and generators, and bytecode).
All of these would still work with an extra call to int, and the first one should work anyway, since the new bool can just implement __index__.
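A rough sketch of that idea, assuming a made-up class FlagSketch that stands in for a bool that is not an int subclass. Nothing here is a proposed implementation; it only shows that __index__ is enough for indexing and for int(), while sum() would need the extra cast.

```python
class FlagSketch:
    """Hypothetical stand-in for a bool that does not subclass int."""

    def __init__(self, value):
        self._value = True if value else False

    def __bool__(self):
        return self._value

    def __index__(self):
        # Used for sequence indexing; int() also falls back to it
        # (Python 3.8+) when __int__ is not defined.
        return 1 if self._value else 0


forwards = FlagSketch(True)
d = "rf"[forwards]        # indexing goes through __index__
as_int = int(forwards)

# Without arithmetic methods, summing the flags directly fails.
try:
    sum([forwards, forwards])
    sum_works = True
except TypeError:
    sum_works = False

# The extra call to int restores the counting idiom.
total = sum(int(flag) for flag in [forwards, forwards])
```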
To be clear, I agree with @storchaka here, in spite of my post being quoted as the motivation for this idea. The point I was making was that I didn’t see the benefit of just changing one thing about bool - not that I supported an overhaul of behaviour that’s been around and served us just fine for over 30 years…
If anyone is genuinely serious about this proposal, they’ll need to justify the breakage that would be caused by changing something this basic that’s been in the language for so long. And “you just have to add in an extra call to int” won’t cut it - who’s volunteering to go through the many millions of lines of Python code in existence, a significant portion of which isn’t public, and much of which supports critical business logic, and make all those changes?
The simple answer to any question of this form is:
x += 4 * int(condition)
Personally I would write:
if condition:
    x += 4
I prefer this because it makes it clear that x is not being modified when condition is False and I don’t want to actually execute the * and the + if that is the case.
In the other thread I suggested that a non-int bool type could still support arithmetic operations, although others disliked that idea. The reason I suggested it is that while changing isinstance(True, int) is a compatibility break, the fact is that 99.9% (made up number) of the breakage would come from breaking arithmetic. Also, while some people might not like arithmetic with booleans, it is at least unambiguous and is done through the explicit use of arithmetic operators. It can easily just fall into the category of things that are possible but that you prefer not to use in your own code (as it already does for me). I assume that the original motivation for making bool a subclass of int came from wanting to avoid breaking arithmetic with conditions like this.
Agreed, but there is still the 0.1% of code (mostly different kinds of serializers and dispatchers) that depends on bool being a subclass of int. This may not be a large issue. But why? Why spend so much effort imitating the current behavior if the current implementation works pretty well? There is no large issue to be solved.
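As an illustration of the kind of dispatcher that depends on the subclass relationship, here is a small sketch. The functions describe and encode are invented for the example and are not taken from any particular library.

```python
import json
from functools import singledispatch


@singledispatch
def describe(value):
    return "other"


@describe.register
def _(value: int):
    return "int"


# bool has no handler of its own, so dispatch follows its MRO to int.
bool_result = describe(True)


def encode(value):
    # isinstance-based dispatch: bool has to be tested before int,
    # because every bool is currently also an int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    raise TypeError(type(value).__name__)


# The json module makes the same distinction between bools and ints.
encoded = json.dumps([True, 1])
```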
More inspiration than motivation! I just thought the other thread and your comment had a lot of worthwhile background reading for this thread, which people responding might want to read. Hope you didn’t feel I was pinning this idea on you.
Regardless, I appreciate all of the answers I’ve received so far. I always learn a lot.
In general, I’m very idealistic, and I live (and propose ideas) as if I had that time machine that Serhiy was talking about.
Just FYI, many collections don’t support multiplication by integers or Booleans. set is a collection. Some sequences (e.g., tuples, lists, and strings) do support it. I think it doesn’t hurt to cast to integer or use a ternary in this case for the sake of readers, but I understand the appeal of multiplying by a Boolean.
I would use x != y personally since it conceptually works with a simple mental model that relies on Boolean values only. You don’t have to imagine that the Boolean values are standing in for integers like you do with ^ or x+y==1.
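A quick sketch that checks the three spellings against each other over the full truth table. They agree on bool inputs, so the choice is about readability only.

```python
from itertools import product

# One row per input pair: (x, y, x != y, x ^ y, x + y == 1)
rows = [
    (x, y, x != y, x ^ y, x + y == 1)
    for x, y in product([False, True], repeat=2)
]

# True when all three spellings give the same answer on every row.
all_agree = all(ne == xor == one for _, _, ne, xor, one in rows)
```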
I think this would be extremely confusing to a reader of the code without an accompanying comment. That would open up the potential for subtle bugs if someone were ever to refactor the code to use Boolean operators or some other way. If you really need both branches to be evaluated, then evaluate them on separate lines with a comment saying that the side effects are necessary. Then use the appropriate Boolean operator.
I would restrict it to +, - and * or in dunder terms __pos__, __neg__, __add__, __radd__, __mul__ and __rmul__. These are the only operations where it is useful to use a bool as an int and could be something like:
def __add__(self, other):
    return int(self) + other
That would then work for multiplying int, list etc.
Other arithmetic operations like **, /, % could be disallowed although possibly << is reasonable. You could also disallow mixing bool and int in bitwise binary operators like 2 & True while still allowing &, |, ^ when both operands are bools (and of course having ~ work like not).
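Putting those pieces together, a minimal sketch of what such a type might look like. The class name Bool and every detail of it are assumptions for illustration, not a proposed implementation. Other operators (**, /, %, <<) are simply left undefined here.

```python
class Bool:
    """Hypothetical boolean that is not an int subclass."""

    def __init__(self, value):
        self._v = True if value else False

    def __bool__(self):
        return self._v

    def __int__(self):
        return 1 if self._v else 0

    # Only +, - and * treat the value as an integer.
    def __pos__(self):
        return +int(self)

    def __neg__(self):
        return -int(self)

    def __add__(self, other):
        return int(self) + other

    def __radd__(self, other):
        return other + int(self)

    def __mul__(self, other):
        return int(self) * other

    def __rmul__(self, other):
        return other * int(self)

    # Bitwise operators are allowed only when both operands are Bool.
    def __and__(self, other):
        if not isinstance(other, Bool):
            return NotImplemented
        return Bool(self._v and other._v)

    def __or__(self, other):
        if not isinstance(other, Bool):
            return NotImplemented
        return Bool(self._v or other._v)

    def __xor__(self, other):
        if not isinstance(other, Bool):
            return NotImplemented
        return Bool(self._v != other._v)

    def __invert__(self):
        # ~ behaves like `not`.
        return Bool(not self._v)


t, f = Bool(True), Bool(False)
count = sum([t, f, t])    # works through __radd__ and __add__
scaled = 4 * t            # works through __rmul__
repeated = t * [1, 2]     # int(self) * list
```

Mixing the two kinds of operand, as in 2 & t, raises TypeError in this sketch because neither side knows how to handle the other.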
Less disruptively, static type checkers could behave as if they were independent types regardless of how they’re actually implemented.
So CPython would keep assert issubclass(bool, int) as an implementation detail, but formally bool would be its own type that was only situationally usable as an integer without an explicit cast.