Mismatch between assert's semantics and how it's used (-O, -OO, disable)

And in fact, the test suite is regularly run both with and without -O, so it’s definitely not the case that the code is broken under -O (unless the OP wants to try to claim that the test suite isn’t a sufficient check that the code is correct, in which case the first step would be to suggest additional tests that improve the test suite!).

I’d like to point out that 0 of the cases in threading.py (or anywhere else I looked in the stdlib) are asserts that we would prefer not to run in production. All of them have negligible overhead relative to their surrounding code, and not running them in production creates hard-to-troubleshoot issues if their conditions ever do trigger.

I wrote this post because of the issue “`_DummyThread`s can be joined in `-OO` mode” (Issue #106236 in python/cpython on GitHub), which I’m guessing submitter @solobevn was burned by: _DummyThreads can’t be joined in dev (they raise), and the docs say the same, but they don’t complain in production. You could argue that it isn’t “broken” in one sense, but I could just as easily argue that it is in a different sense, and I think a more important sense: it’s very bad UX, and specifically very unpythonic behavior (Although practicality beats purity, Errors should never pass silently). We can do better than that.

(To be clear: I am backtracking from my use of the words “broken code” about threading.py. It does indeed seem that all assertions there are about internal state or things that it’s documented that you’re not allowed to do. I did encounter examples elsewhere in the stdlib where an assert is clearly testing user input, e.g. something about XML parsing. I’ll try to post references after work.)

Can you please give more examples where people who write asserts would be punished by the performance regression, if you know any off the top of your head? So far the only one that came forward, in the year during which the previous discussion was active, is click. I’m not asking to gain points in the discussion, I’m asking in case there is some dev community where it’s a common thing, so I can become convinced that I was wrong.

Can that bug be triggered without calling private methods? The tracker issue seems to require the inspection of threading._active. Without that, my understanding is that you would have to have a thread that is NOT started via the threading module, which calls threading.current_thread(), and then which passes that to another thread to attempt to join it. That seems like a pretty bizarre thing to do, and sure, maybe it should raise a different exception instead of asserting, but it’s hardly normal operation. Oh, and all this has to happen in production despite never having been run in development mode, otherwise assertions would be active.
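The scenario described above can be sketched in a few lines. This is only an illustration, assuming CPython: `handoff` and `worker` are names made up for the example, and which exception you see depends on the Python version (where `join` is guarded by an assert, nothing is raised under -O, which is what the tracker issue is about).

```python
import _thread
import queue
import threading

handoff = queue.Queue()

def worker():
    # This thread was not started via threading.Thread, so
    # current_thread() hands back a _DummyThread placeholder.
    handoff.put(threading.current_thread())

_thread.start_new_thread(worker, ())
dummy = handoff.get(timeout=5)

try:
    dummy.join()
    outcome = "no error"
except (AssertionError, RuntimeError) as exc:
    # Which of the two is raised depends on the interpreter version.
    outcome = type(exc).__name__

print(type(dummy).__name__, outcome)
```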

So, yes, technically a problem (and I may have been a little too dismissive in my quick summary above), but hardly evidence that the stdlib is full of faulty assertions.

Side point: Was this PR actually the result of someone being burned by this problem, or was it that they found it in the source code and wanted to fix it? You’re guessing the former, but do you have any evidence to support that? It looks to me like more of a “huh, this looks a bit odd, we should fix it” issue rather than a “wow, my code isn’t working, I wonder why” one.

Please do. I flipped through Lib/xml/ and didn’t find any assertions that looked bad to me. Again, there might be some that I missed, but by no means the majority. There’s one assert systemId is not None that might be better done differently, but I can’t find any documentation for the xml.dom.xmlbuilder module so it’s possible that this is an entirely internal class anyway.

It’s not about the performance regression, it’s about the idea that the “correct” way to write an assert is assert (not __debug__) or condition. And if you genuinely think that projects wouldn’t get drive-by issues demanding that they “fix” their assertions to use the recommended form, then you clearly haven’t maintained a popular project :slightly_frowning_face:
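For readers unfamiliar with the `(not __debug__) or condition` form: it relies on short-circuiting, so the condition is only evaluated when debugging is on. A rough sketch, with the flag passed in explicitly (the `debug` parameter and the helper names are assumptions for illustration, not part of any proposal) so that both modes can be seen in one process:

```python
def make_condition():
    """Return a call log and a stand-in for an expensive check."""
    calls = []

    def expensive_condition():
        calls.append("evaluated")
        return True

    return calls, expensive_condition

def debug_only_check(debug, condition):
    # Same shape as `(not __debug__) or condition()`: when debug is
    # false, `or` short-circuits and condition() is never called.
    return (not debug) or condition()

calls, condition = make_condition()

debug_only_check(debug=False, condition=condition)
calls_when_off = len(calls)

debug_only_check(debug=True, condition=condition)
calls_when_on = len(calls)

print(calls_when_off, calls_when_on)
```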

Good. Thank you for accepting that point. If you do find cases where asserts are being used to validate user input, they would be genuine bugs, and it’s reasonable to report the issue and ideally offer a PR fixing it - although given the comments in this discussion, I’d strongly encourage you to be careful that the issues you are raising are actually bugs. At this point, you’ve unfortunately established something of a reputation for being too quick to claim something is a bug…

I don’t think you are wrong that the performance benefit of stripping asserts in -O mode is normally pretty small. But the key there is “normally”. I’m sure there are cases where the performance matters, and expensive to compute asserts are a problem. I’m equally sure that there are some people who want to squeeze every last bit of performance they can from Python, and for whom stopping -O from stripping asserts would be a major regression. You can argue that not enough people would be affected to block your proposal, but unless you are able to back up such a claim (and that’s really hard to do!) the default is to play it safe.

To put it another way, the responsibility is on you, as the proposer, to demonstrate that the impact is sufficiently small, and the benefit sufficiently large, to justify the change - taking into account Python’s policy on backward compatibility and historical caution when it comes to breaking changes.

I see only an educational problem here. For some reason, some beginners in programming (for which Python is the first programming language) believe that assert is intended for runtime checks. They are wrong. If the problem is with the documentation, we should improve the documentation.

Perhaps making Python run with optimization level 1 by default would help to prevent this issue. Or maybe we should introduce an optimization level 0.5, at which assert would just emit a warning. We could even do this depending on whether the assert is in the __main__ code or in a module imported from the current directory (this would cover most beginners’ code) or in other code (the stdlib, other libraries, and production-level programs).
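To make the “assert just emits a warning” idea concrete, here is a rough sketch of what such behavior could look like as an ordinary function. `soft_assert` is a hypothetical helper invented for this example; no such level or function exists in Python today.

```python
import warnings

def soft_assert(condition, message="assertion failed"):
    """Hypothetical helper: warn instead of raising when a check fails."""
    if not condition:
        # stacklevel=2 attributes the warning to the caller's line.
        warnings.warn(message, RuntimeWarning, stacklevel=2)
    return bool(condition)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    passed = soft_assert(1 + 1 == 3, "arithmetic is broken")

print(passed, [str(w.message) for w in caught])
```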

I think this misses the larger point (possibly because I didn’t call out the larger point until a separate follow-up message), and focuses on the ideal world. It misses that the security policy does have roots in reality (they just took the painful, don’t-trust-anyone approach). It misses that companies have security policies that individual engineers might not have the power to override. Nor could any one person, or even a team, look at each of the 11k reports to review its behavior.

Ideally, I completely agree with everything you suggest. But I don’t think the data point itself can be so easily shrugged off. It highlights this is something that real engineers have to deal with. And it stems from behavior of Python itself. Maybe you or I could push back on our security teams, especially considering we understand the nuance. But does everyone else who received a similar report to mine?

I try to be really cognizant of “The Curse of Knowledge” in my own ways of navigating these kinds of discussions regarding “newbie” behavior. We’re armed with knowledge we can use for our own benefit. But are most Python developers Pythonistas? Should they have to be?

I’m not sure the core of the issue here is documentation. I agree people should understand the code they’re writing. However, I don’t think users should have to read https://docs.python.org/ before they crack open their editor. They copy code from StackOverflow. They copy code from ChatGPT. They copy code from other libraries.

They make assumptions based on their past experiences in Python, and in other languages. None of these lend themselves well to the highlighted issue, which is that sometimes Python just skips your statement based on a flag. That’s not normal behavior. In fact, when they run their code they see it execute. When they run their tests they see it execute. When they run their console_scripts they see it execute.
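For anyone who hasn’t seen the flag in action, a small sketch that runs the same two-line program with and without -O. It assumes the PYTHONOPTIMIZE environment variable is not set; `SNIPPET` and `run` are just names for this example.

```python
import subprocess
import sys

# The same source is run twice; only the interpreter flag differs.
SNIPPET = "assert False, 'boom'\nprint('still running')"

def run(*flags):
    return subprocess.run(
        [sys.executable, *flags, "-c", SNIPPET],
        capture_output=True,
        text=True,
    )

normal = run()         # the assert fires, the print is never reached
optimized = run("-O")  # the assert is stripped, the print runs

print(normal.returncode, optimized.returncode, optimized.stdout.strip())
```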


In an ideal world, where everyone understands the ins and outs of Python and how to run it, I’m right there with you. But… in an alternate ideal world, where Python follows the principle of least surprise, I’m going to advocate for the Python newbie who made a very sound assumption, and say that maybe we should consider what it would take to make the reasonable assumption correct.

Depending on which other language, this could be entirely unsurprising. C has a number of ways that you can compile out code, making it so something simply doesn’t exist in the binary even though it was in the source.

But none of this changes the fact that, even if Python’s assert IS considered to be a problem, it’s wrong to punish the people who’ve used it correctly in order to reward those who’ve been using it wrongly. There needs to be a solution that has working code continue to work.

Yeah, I compared this with C/C++ in my head. But even there, something at the source-level opts into skipping the code. #if blocks or using a macro both are source-level. You can see the decision point the compiler makes (even if the obfuscation is high in the case of macros). In this case, you can’t see a decision point being made by the interpreter. You just see the statement.

An aside on why I think asserts are also prevalent

It leads to one less branch in your source code. assert thingamabob is easier on the brain than if not thingamabob: raise SomeError("...")

So some people gravitate towards the less-code approach.
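A short sketch of the two spellings side by side, compiled at optimization levels 0 and 1 via the `optimize` argument of the built-in `compile()`. The function names and the choice of ValueError are made up for illustration.

```python
SOURCE = '''
def with_assert(value):
    assert value is not None, "value is required"
    return "accepted"

def with_raise(value):
    if value is None:
        raise ValueError("value is required")
    return "accepted"
'''

def load(optimize):
    # optimize=0 keeps asserts; optimize=1 behaves like running with -O.
    namespace = {}
    exec(compile(SOURCE, "<sketch>", "exec", optimize=optimize), namespace)
    return namespace

def outcome(func, value):
    try:
        return func(value)
    except (AssertionError, ValueError) as exc:
        return type(exc).__name__

results = {}
for level in (0, 1):
    ns = load(level)
    results[level] = (outcome(ns["with_assert"], None),
                      outcome(ns["with_raise"], None))

print(results)
```

The shorter spelling is only equivalent at level 0; at level 1 the assert version silently accepts None.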

IIUC it isn’t that assert is a problem. It’s that it may not be executed. Unless I misunderstood the proposal, assert wouldn’t change, and therefore users of it (right or wrong) wouldn’t be punished. Rather Python would shift to a model where asserts aren’t stripped.

I don’t understand what you are trying to say: C and C++ have an assert() macro that can be compiled out depending on the compilation options. How is that different from the assert statement in Python?

It’s a bit of a nuanced argument since we’re comparing compilers to interpreters, but it’s a macro which is still evaluated by the compiler. It always gets expanded. And then based on that expansion the inner bits may or may not get compiled in.

But it isn’t like the compiler just skipped the macro completely. It still had to consider its existence.

It’s akin to having:

def assert_(truthy):  # hypothetical name; `assert` itself is a keyword
    if not __debug__:
        return
    ...

If we had assert_(False), the Python interpreter evaluates the call, even if the inner bits may not get interpreted. But assert False is different: under -O the interpreter just skips it.

I hope that makes sense?

Again, maybe it’s a weak comparison at best, because we’re comparing compilers to interpreters. But in that case, we really shouldn’t be comparing Python to C and the earlier point stands that assumptions based on interpreted languages likely are that the interpreter doesn’t sometimes skip statements.
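The difference between a function call and the assert statement can be observed with a side effect. This is a sketch using `compile()` with `optimize=1` to stand in for -O; `check` and `side_effect` are hypothetical names.

```python
SOURCE = '''
log = []

def side_effect():
    log.append("evaluated")
    return True

def check(value):
    # Ordinary function: does nothing when __debug__ is false.
    if not __debug__:
        return
    if not value:
        raise AssertionError("check failed")

check(side_effect())   # the argument is evaluated at every level
assert side_effect()   # the whole statement is dropped when optimized
'''

def evaluations(optimize):
    namespace = {}
    exec(compile(SOURCE, "<sketch>", "exec", optimize=optimize), namespace)
    return len(namespace["log"])

normal_count = evaluations(0)
optimized_count = evaluations(1)

print(normal_count, optimized_count)
```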

That’s not how assert() works. When compiled with -DNDEBUG, assert() is expanded to be empty. This code compiles just fine when compiled with -DNDEBUG:

#include <assert.h>
#include <stdio.h>

int main(int argc, char **argv) {
  assert(bla bla);
  return 1;
}

I actually think asserts are harder on your brain than if-raise, and I felt this especially when refactoring a codebase that was using asserts to if-raise (not in Python; those asserts were guaranteed to run in production). Here’s why:

if expr:
    print("this code runs if expr is True")

while expr:
    print("this code runs while expr is True")

assert expr, "this is raised as an exception if expr is _False_"

This negative logic makes it harder to parse, compared to other cases of branches in the code. You also need to remember that and negate expr while writing a good error message for the assertion (and you do want to write a good error message, so that there’s a chance of figuring out what went wrong without a traceback).

There’s always going to be something that some people, newbies or not, may consider surprising/unintuitive/broken (mutable default arguments), or things that you need to read docs to understand (else clause in for and while loops). Some language features are just more advanced, even though they might look simple, and for those features, educating people would lead to better outcomes.

How is that different? The assert statement in Python is a decision point! Consider:

x = 1
def foo():
    while ():
        x = 2
    print(x)

The entire while statement gets optimized out, but the only source code hint that this happens is that you can see that the condition is false. How is that different?

But you’re telling people that they need to change their code in order to retain the same behaviour, which has been correct all this time. In order to express the exact same semantics, programmers will need to add extra boilerplate into their assertions. This is punishing to maintainers of software that has been using assertions correctly, all for the benefit of people who’ve been using them wrongly.

I don’t see why you think it’s less surprising that asserts are retained in production. When I started using Python, coming from C, Python’s assert worked exactly as I expected, because -O was exactly equivalent to C’s -DNDEBUG.

I was going to say “… and it’s like Rust’s assert! macro”. But I was surprised to find that Rust’s equivalent is debug_assert!, and that assert! macros are checked in production code.

So who’s to say that Python is “less surprising” rather than Rust? I concede that a person coming from Rust may be surprised by Python’s behaviour, but someone coming from C would find Python’s behaviour entirely natural, and would be surprised by Rust.

You don’t get to pick your newcomer’s background, sorry. And I don’t consider the assumption that asserts remain in production to be “very sound” in any objective sense.

I think the reason is more cultural than technical. Not only is it the default that Python keeps asserts at runtime, it’s overwhelmingly the case in practice: while most C and Rust programmers use both debug and release builds, I’ve yet to work with a Python programmer who uses -O regularly.

If, like Rust, Python programmers typically ran tests with debug and ran in production with optimised, we wouldn’t be having this conversation. As it stands, I’d venture most people don’t know about -O, which is not the ideal situation for a flag that can make statements disappear from your program.

I’m much more likely to be bitten by the removal of docstrings, since I’ve had setups that use the docstring for machine-readable information (eg configuring argparse with a decorator and a docstring). But I’ve never done that in a library (to my knowledge), only in the top-level app, so it simply means that that app can’t be run -OO.
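The docstring case can be sketched the same way, using `compile()` with `optimize=2` to stand in for -OO. The `main` function and its docstring are invented for the example; the point is only that an app which feeds `__doc__` to argparse gets None at that level.

```python
import argparse

SOURCE = '''
def main():
    """Frobnicate the input files."""
    return 0
'''

def doc_at(optimize):
    # optimize=2 behaves like -OO: asserts and docstrings are removed.
    namespace = {}
    exec(compile(SOURCE, "<sketch>", "exec", optimize=optimize), namespace)
    return namespace["main"].__doc__

normal_doc = doc_at(0)
stripped_doc = doc_at(2)

# A parser configured from the docstring loses its description.
parser = argparse.ArgumentParser(description=stripped_doc)

print(repr(normal_doc), repr(stripped_doc), repr(parser.description))
```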

According to assert - cppreference.com it is a macro which does expand to #if .... Although I think we’ve crossed the line into being pedantic.

I still don’t follow. As I understood the OP, normal asserts stay the same. It’s only if you truly don’t want them to fire when __debug__ is off that you would need to edit (and then you’d edit to be assert (not __debug__) or XYZ). And those cases should be rare. At least that’s my understanding. I certainly wouldn’t +1 if this meant changing every assert, but I don’t think that’s the intent here.

I think it boils down to who we’re designing Python for. Here’s a proposal where people who made bad assumptions don’t get penalized. And that’s great, to me at least, because Python certainly attracts people across the spectrum of programming. On the other hand, the status quo is that some assumptions are correct, some aren’t. And if you’re on the wrong end (and you’d really never know), whoopsie!

… which circles back to my point that this kind of decision does play out in many ways that should be considered.

I’ll tap out, though. I’ve stated the data point and also argued my personal perspective. I don’t wanna be too noisy.

But those cases are the only cases that are currently correct. So what you’re saying is that the vast majority of all assertions are currently buggy, or alternatively, that you want to change the behaviour of all current assertions. Do you see how this is a problem?

And where people who made good assumptions (or who actually read the documentation) do.

I think a disconnect in this thread is the intuition about the current state of code. Those who support this change think that yes indeed, the vast majority of assertions are currently “buggy”, in that the statements were intended to always execute and the author wasn’t thinking (or didn’t know) about -O.

Whether this is even “buggy” feels a bit nebulous to me. One answer is “this code wasn’t written for running under -O” and another answer is “it runs fine, but you might see other problems when not guarded by the assertions”. I can’t imagine code that actually needs assertions to go off for it to function properly [1].

An alternate proposal in the old thread was to go the other way: turn off assertions by default and add them to -X dev, so that people using assertions incorrectly have to change, rather than whatever unknown % were doing things right the whole time. I think that makes sense, and I count myself among those who were Doing it Wrong™ [2].


  1. but I’m sure some cursed library is doing it somewhere

  2. not that I have ever run python with -O in my life

Absolutely. But the #if preprocessor guard is conditional on the NDEBUG preprocessor symbol, not on the parameter passed to the assert() macro. Thus, the parameter passed to the assert() macro is never seen by the compiler when the compilation is done with the -DNDEBUG flag. And this contradicts what you wrote a few messages above:

When NDEBUG is defined, the assert() macro disappears from the code compiled by the compiler.

I’m writing this not to be pedantic, but because one of the arguments you are using to support the assertion that Python’s assert statement works in a counterintuitive way is that it works differently than in other programming languages. I’m trying to show that it works in exactly the same way in C and C++.
