Mismatch between assert's semantics and how it's used (-O, -OO, disable)

TL;DR Python’s assert statement in a NOP in optimized mode (python -O or python -OO). There exists code that assumes this, and much much more code that is ignorant of this and broken when running with -O. Lets do something about this, trying not to break existing code.

This was discussed here before at length without getting anywhere, so I’ll try to make a summary of the discussion and then offer some thoughts and a few ways forward.

  • -O strips asserts and defines __debug__ as False (and if False: stuff is always stripped by the compiler), -OO also strips docstrings
  • Guido is against changing the language, prefers to deprecate -O and -OO
  • This gotcha is featured in lists of python security pitfalls
  • The fact that it’s often not the same people who write the code and who run it (especially in the case of dependencies) makes it hard to push the idea that “it’s alright because you can always choose not to use -O”, and this alone steers people to use neither -O nor assert.
  • There is at least one library (click) that relies on assertion stripping to run slow checks in performance-critical sections
  • There are multiple volunteers to write a PEP if a core dev would sponsor it
More things that were discussed
  • Python’s assert was modeled after C’s assert() macro, which is stripped by -DNDEBUG (which is a separate argument than -O)
    • Traditionally it was stripped from release builds [Editor’s Note: however, these days, at least commercial C/C++ projects tend to keep assertions in release, to the point that I consider it a best practice. CPython strips them)
    • The idea comes from Eiffel, where they serve a very specific purpose and have very specific semantics [EN: and some of them are kept in release by default and others stripped]
  • Theoretically, if someone writes an assert that depends on user input and not just on the programmer not being wrong, they’re “using it wrong”
    • In practice…
    • Many have a dislike for code behaving differently in test and production, as it creates hard to debug issues
    • Some people think AssertionError should have been a BaseException and any code that handles it is suspicious, because if programmer assumptions have been broken, any bad thing might happen. In practice, web servers catch it and log an error and it’s probably wise
  • The non-stripped assertion behavior can’t be easily replicated with a function, because assert only evaluates the second parameter if the assertion fails and functions always evaluate all arguments, also because an assert is very self-documenting in a traceback (when the bottom line of a traceback is assert author is not None, there’s no need for a message “author is None”)
  • Rust has assert! and debug_assert!
  • Surprisingly, pytest assertions are not stripped with -O because of their AST magic

I think @Tinche said it best:

As a person in charge of operating a large-scale Python deployment, am I expected to run with -O in production or not? If the answer is ‘yes’, I’ve been doing it wrong for a large number of years and we absolutely need to communicate this better, and it complicates the deployment story further. If the answer is ‘no’, asserts should always work.

My own novel thoughts:

assert is my favorite python statement, and in 15 years of python the only place I’ve encountered -O until today was in github issues asking to replace it with the cumbersome if not condition: raise AssertionError(msg)

Maybe if assertions are not “used wrong” we don’t need to run them in production, but I’ve yet to meet a CTO that would prefer the perf gains to the loud crashing behavior when the impossible does happen. This is Python, not C, if we cared about the performance tax of one more statement and a few variable lookups and attribute lookups, we’d be writing that function in Rust or something. Assertions that call a costly function are very rare.

However, assertions that are “using it wrong” are very common, which means -O is completely broken and anyone using it does it at their own risk. stdlib is full of assertions that check for things determined by the non-stdlib caller, which means they’re broken in -O. See e.g. the Threading module, GH-106236.

There are rare cases where we do care about them being optimized out in production, so may I propose that we add new syntax for that… how about:

assert not __debug__ or condition, message

(Joking, this is not new syntax, it’s an idiom that achieves exactly the same behavior and is more explicit about it)

So my proposal is:

  1. Deprecate -O and -OO, replace with -O=1 and -O=2 that don’t strip asserts
  2. Recommend assert not __debug__ or condition, message as an alternative in the deprecation message and in the docs (so if you google __debug__ the first result will be an explanation of the idiom)

I volunteer to personally pore over the standard library and fix all uses where performance matters, though I couldn’t find any in a preliminary search. And also over click and any other open source library who’s maintainer chimes in that they care about this.

So we have a deprecation that fixes many bugs in the stdlib and in user code and requires (only) affected users to change how they call python to indicate they’re aware of the change, an explicit way to opt-in to the surprising behavior that already works, and even if someone misses the deprecation, in theory the only change they’ll experience is a performance regression, and in practice they might also get their bacon saved by an explicit crash instead of silent misbehavior in production.

Will someone sponsor this?

6 Likes

If most people don’t use -OO, and if a language change were acceptable, how about prefixing with try, i.e. try assert <condition>, <message>, the implication being that it’s tried only if you’re debugging?

What about keeping assert’s semantics the same, but having a new ensure (which might only be a builtin, not a keyword), which has the semantics people expect of assert? I don’t like the idea that identical code will have subtly different behaviour on different versions, but perhaps it’s sufficient to tell people “hey, if you’ve been using assert like this, you probably should switch to ensure instead”.

Why do a language change with all the involved pain when we can achieve exactly the same result with existing constructs? assert __debug__ and condition, message does exactly what we want.

This has the disadvantage of not doing anything about the huge amounts of code that is out there, much of it in the python standard library, that “uses assertions wrong”.

True, but even if you do “do something” about that code, it will continue to be wrong on existing Python versions, AND it will have the perceived blessing of now being “correct code”. That means that code will be sanctioned as “good code” while being subtly wrong, which IMO is worse than the status quo of it being subtly wrong but acknowledged as bad code.

I don’t see why? Anyone could still go and fix it for existing versions on a case by case basis by replacing assert x, m with either assert not __debug__ or x, m or if not x: raise AssertionError(m). This fix is even forward-compatible. I just don’t see anyone actually doing it, and an ensure builtin won’t improve the situation.

So -O remains unusable in practice on existing versions, just like it is right now, but at least its replacement -O1 becomes usable in the next version (and more importantly, assert becomes usable, and existing subtly-wrong uses are retconned to be correct).

The reason they won’t do it is because there’s no need to - if the code is correct, why change it? But if the code is wrong and there’s an easy solution (“use ensure instead of assert”), there is a stronger incentive to change it.

But maybe I’m the wrong person to be discussing this, since I have never used assert for anything that I wouldn’t have been willing to write as a comment.

I don’t think it does do what you want. You’d expect the assert to always pass if __debug__ is False, but instead it would always fail every time.

1 Like

Yeah…If the idea is to replace -O and -OO with versions that don’t strip asserts, this changes subtly wrong code into crashing code. I think it should be assert (not __debug__) or condition, and maybe the optimizing flags are primed to notice that construction and skip it entirely.

IMO that’s exactly the right way to look at assertions. I do the same. And I object to this proposal because it penalises people who do the right thing, while rewarding people who do the wrong thing (by changing the language to accommodate them).

5 Likes

Wow, what an embarrassing mistake. Edited. Thank you.

I object to this, I think making it easy to write assertions that run in production and possible but a bit harder to write assertions that only run in debug is the opposite of penalizing people. It’s making python more pythonic. We don’t have to count cycles here, this is not the 80s :slight_smile:

But also, I think that ship has sailed. All those people who do the right thing can’t use -O today because the stdlib and their other dependencies are full of assertions that we would mind replacing with comments. This proposal fixes that so they can start using -O fearlessly.

2 Likes

In cases where it’s PURELY a matter of performance, you aren’t entirely wrong - for example, repeated addition onto a string can be made faster under some circumstances, and CPython has an optimization for this. But the difference between repeated addition and "".join(items) is not one of correctness; they’ll both reach the same result, just that one might be horribly suboptimal.

Where in the stdlib are assertions used that mean that we can’t trust a -O run? Those should simply be bugs to be fixed.

It means expecting people whose code is already correct to make changes. That’s penalising them when their code is currently correct.

Precisely. And more generally, if anyone has code that does the wrong thing when run under -O, then either that’s a bug they should fix, or they should state that they don’t support running their code under -O (and be prepared for the fact that other people may ask them to fix their code because "won’t run under -O isn’t a reasonable limitation to have…).

I’m honestly rather baffled that people think it’s reasonable to change the language because some people failed to understand how an existing statement works.

4 Likes

This feature is more often misunderstood than not, to a point that there are at least dozens of offenders in the stdlib alone. See e.g. most uses of assert in lib/Threading.py.

So we’re proposing to deprecate it (specifically, -O) and replace it with a differently-named and better-designed feature that won’t get misused in this way. As a nice side-effect, we retcon almost all existing usages to be correct, making -O (in its new form, -O1) usable again. We’re making things more explicit and failures less silent, two Python principles. The only negative changes we introduce are very rare performance regressions (the only material one I’ve ever heard of is click CLI loading times, and I’ll personally fix that one) that are easy to fix.

1 Like

Another data point, some enterprise “security scanners” flag every assert primarily because of this issue/confusion. OTOH other tools of the same nature get upset when you don’t turn off __debug__.

We recently got flagged by 11k “violations” in our code for asserts.

So it isn’t just devs misunderstanding, but also enterprise security tools.

I agree with OP that the current state of things isn’t ideal.

1 Like

Okay. Let’s do a survey. Line numbers will be from a 3.12 pre-alpha, taken from a fresh pull as of today (2023-07-09), but it shouldn’t make a material difference.

  • Line 718, Barrier._enter():
    assert self._state == 0
    Looks correct. It’s asserting that an internal attribute has the one remaining possibility, after waiting until it’s in neither two and then testing if it’s below zero. Presumably the
  • Line 743, Barrier._wait():
    assert self._state == 1
    Just like the previous one.
  • Line 884, Thread.__init__():
    assert group is None, "group argument must be None for now"
    This one’s borderline, since it’s using an assert to test a function argument, but it’s an argument that has no meaning currently, so nobody should ever be passing it at all. And if you DO pass that argument, it won’t do anything, so even with the assertion ignored, it’s not going to cause any problems.
  • Line 939, Thread.__repr__():
    assert self._initialized, "Thread.__init__() was not called"
    Unsure on this. Under what circumstances can you get the repr of something that hasn’t been initialized?
  • Line 1077, Thread._stop():
    assert not lock.locked()
    Internal method that is ensuring that internal state is correct (the lock here came from self._tstate_lock).
  • Line 1142, Thread._wait_for_tstate_lock():
    assert self._is_stopped
    Another internal method, another test of internal state.
  • Lines 1167, 1172, 1184, 1196, 1207, 1225:
    assert self._initialized, "Thread.__init__() not called"
    I suppose you could argue that this is wrong, but if you’re not calling an object’s initializer and then get peculiar AttributeErrors, it’s on you.
  • Lines 1454 and 1458, DummyThread methods:
    Basic checks for stuff you can’t actually do with dummy threads.
  • Line 1579-80, _shutdown():
    assert tlock is not None
    assert tlock.locked()
    Internal method testing internal state.
  • Line 1676, _after_fork():
    assert len(_active) == 1
    Internal method testing internal state.

So there are a couple of dubious ones, but I absolutely would NOT agree that “most” uses of assert are incorrect here. Earlier, you claimed:

which, if it is indeed true, certainly isn’t supported by what I can find here. There are really only two ways that this could be triggered by the caller are (a) passing a parameter that shouldn’t be passed, and which currently does nothing; and (b) subclassing and failing to call init. Or, of course, messing with internal state, but then all bets are off anyway.

2 Likes

Well, if that’s what you have to work with, there’s not a lot of point changing the assert statement, since it wouldn’t be accepted anyway. But you will please permit me for considering this rule ridiculous.

It is ridiculous, I almost laughed myself out of my chair.

But Theresa nugget of reality in it. The thinking is that a lot of people don’t know that asserts can be turned off (and by a flag with a name that doesn’t lend itself to knowing it turns off asserts) and so they might use asserts for actual validation. It’s extremely plausible, and likely comes from having seen it happen before.

So, as dumb as “don’t use asserts” is. It can be a realistic nugget of wisdom in the greater community. It’s dumb to you and I, but not so dumb for the layperson

1 Like

I did a quick scan. None of the asserts I saw seemed wrong. They state that a certain condition is expected to be true (in the sense that it’s a bug in the code if it’s not - not in the sense of validating user input or anything like that). These can be safely omitted under -O because they are not affecting the code’s behaviour under supported conditions.

I dispute your assertions that these are “offenders”. Please provide a specific example of an assertion in the stdlib that is incorrect, in the sense that it checks a condition that can actually happen, and where the AssertionError is an expected result of the code in that case. Or, to put it another way, please describe exactly how the assertion can be triggered by supported usage of the relevant code.

To be precise, you’re proposing to deprecate code that is used correctly by a lot of people. That’s a major compatibility breakage, and you don’t seem to appreciate that Python takes such breakage very, very seriously.

That is absolutely what I’m describing as “punishing the people who use the feature correctly”. You give an amnesty to people who don’t read the documentation and don’t use the language correctly, and expect people who do use the language correctly to change their code.

Then they are wrong. Either report a bug to them, ignore their reports, or treat the alert as a warning - review the code and if it’s incorrect, fix it. If it’s correct, do nothing (and be glad your developers don’t make at least one common error).

I agree the current state isn’t ideal. We apparently aren’t documenting the assert statement well enough to stop people making mistakes when using it. We should fix that, by teaching correct usage better and helping people who are currently using assert incorrectly to fix their code.

4 Likes