Extend type hints to cover exceptions

kevna · February 11, 2023, 9:40pm

PEP 484 introduced type hints; at the time, documenting exceptions was left to docstrings. I would like to suggest why this feature might be desirable and how it might be used.

Error handling in Python does an excellent job of keeping the error path out of the way when writing the normal flow of logic. However, for larger code bases it is not always clear what exceptions may be caused by calling existing code. Since these cases are easily missed, they may reach a higher level than intended before being handled, or even cause the entire program to end prematurely. A developer who doesn’t spot these isn’t likely to catch them with unit testing. A sufficiently rare exception may even make it to production before being discovered.

I propose adding an annotation to document exceptions that are explicitly raised within the function or called code to facilitate static-analysis to check that exceptions are handled.

Consider this contrived example:

class DbAdapter:
    def get_records(self, query: DbQuery) -> List['DbRecord'] ^ ResourceNotFoundException: ...

Here ^ serves the same purpose as the Java “throws” keyword.

A static-analysis tool such as mypy would check the type of exceptions against this annotation as it does for return types. It would check both raise statements and the signatures of called functions, warning about any exception that isn’t caught or annotated.

Rosuav · February 11, 2023, 9:52pm

No, please no. We do NOT need checked exceptions.

Code should be written to catch what it understands, and ignore everything else. Checked exceptions mean that it’s a breaking API change to raise a new exception - even indirectly - because every caller has to wrap up all exceptions to keep everything safe.

That kind of concept works fine when it’s error return values (make sure you handle every error or pass it up the line), but the entire point of exceptions is that you do not have to do that.

kevna · February 11, 2023, 10:11pm

Type hints are already entirely opt-in, as is the static analysis that checks it. Tools like mypy have configurations and comments to turn off checks project-wide; by module, function and line.

It could offer a lot for maintainability for people running production services. Indeed, in those cases raising a new exception already is a breaking change, just one you don’t find until the error is raised.

There must be some middle ground between returning errors and never seeing them until they bite you in the heinie. I’d love to hear your suggestions.

steven.daprano · February 12, 2023, 1:51am

Checked exceptions are widely (but not universally) recognised by the programming community as a terrible idea that causes more problems than it solves.

Very few languages have copied Java by introducing checked exceptions. Probably the most well known one is Nim, and they recommend against using exceptions at all.

If you have never heard of Nim, consider that checked exception languages only get more obscure from there.

Popular mainstream languages influenced by Java such as C#, Scala and Kotlin have left checked exceptions out. For example, one of the C# designers explained why.

Even in the Java community, I have never come across anyone who actively loves checked exceptions, attitudes tend to vary from “well they’re not so bad, you’re just using them wrong” to dislike and worse:

I find it telling that even defenders of checked exceptions will often complain that they are hard to use and often misused. This Stack Overflow comment gives Haskell as a good example of checked exceptions done right, with the minor complication that they aren’t exceptions at all. :-/

In fairness, not everyone thinks they are an outright mistake but even the supporters acknowledge that they are very difficult to use correctly.

steven.daprano · February 12, 2023, 2:36am

The fundamental problem with checked exceptions in Python specifically is that in the general case, no guarantees can be made about what exceptions are raised.

A static type checker would be reduced to checking whether or not every method and function is declared to raise BaseException, and if not, complain.

Think about a simple example like checking len(obj) for the exception. What guarantees can we make about it? Precious few.

Sure, we can guarantee that if obj is an int or a float, it will raise TypeError; but if obj has a __len__ method, then it could raise any exception at all.
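To make the len(obj) point concrete, here is a minimal sketch. The Flaky class and failing_source function are hypothetical names invented for the illustration; only the built-in len() behaviour is standard.

```python
class Flaky:
    """Hypothetical class whose __len__ does arbitrary work."""

    def __init__(self, source):
        self.source = source

    def __len__(self):
        # Any exception raised by the underlying work escapes through len().
        return len(self.source())


def failing_source():
    # Stand-in for a network or database call that goes wrong.
    raise ConnectionError("backend unavailable")


def describe_len_failure(obj):
    """Return the name of the exception len(obj) raises, or None."""
    try:
        len(obj)
    except Exception as exc:
        return type(exc).__name__
    return None
```

With these definitions, describe_len_failure(5) reports TypeError, while describe_len_failure(Flaky(failing_source)) reports ConnectionError, an exception that nothing in the signature of len() hints at.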

Here is your contrived example again:

class DbAdapter:
    def get_records(self, query: DbQuery) -> List['DbRecord'] ^ ResourceNotFoundException: ...

How can you, as the author of this DbAdapter class, guarantee that ResourceNotFoundException is the only exception that method raises? You can’t, not easily. You would need to wrap the entire method in a try…except, catch every exception, and convert it into a ResourceNotFoundException – and that makes debugging much harder.

(Perhaps a little less so now that Python has chained exceptions, but even so.)

Even if you do this successfully, without any bugs that might leak another exception:

The boilerplate needed is tedious and would need to be repeated over and over again.
It’s a terrible API for the developer.

This turns fine-grained, detailed exceptions that allow you to rapidly find and fix bugs, like AttributeError or IndexError, into generic “ResourceNotFound” exceptions which lie about the nature of the bug.

Anything which encourages the proliferation of the catch everything anti-pattern into every method and function should be resisted.
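The wrap-and-convert pattern described above can be sketched roughly as follows. This is only an illustration: _run_query is a hypothetical helper containing a deliberate bug, and the exception class is a stand-in for the one named in the example.

```python
class ResourceNotFoundException(Exception):
    """Stand-in for the exception named in the example above."""


class DbAdapter:
    def _run_query(self, query):
        # Hypothetical helper with an ordinary bug: indexing an empty list.
        rows = []
        return rows[0]

    def get_records(self, query):
        # Catch everything and convert it, so that only the declared
        # exception type can escape this method.
        try:
            return self._run_query(query)
        except Exception as exc:
            raise ResourceNotFoundException("query failed") from exc


def find_failure(adapter, query):
    """Return (visible exception, original cause) as type names."""
    try:
        adapter.get_records(query)
    except ResourceNotFoundException as exc:
        return type(exc).__name__, type(exc.__cause__).__name__
    return None
```

The caller sees ResourceNotFoundException even though the real problem is an IndexError. Chaining with "raise ... from exc" keeps the original available as __cause__, but anyone handling the declared type has to dig for it.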

storchaka · February 12, 2023, 7:19am

KeyboardInterrupt and MemoryError can be raised at virtually any place.

Also, exceptions like OverflowError, UnicodeEncodeError and UnicodeDecodeError are common when passing int, float, str or bytes arguments. Static type checking controls types of arguments, but not their values.
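A small sketch of value-dependent exceptions, using only standard library calls. The exception_name helper and the dictionary keys are made up for the illustration.

```python
import math


def exception_name(func, *args):
    """Call func(*args) and return the raised exception's name, or None."""
    try:
        func(*args)
    except Exception as exc:
        return type(exc).__name__
    return None


# In each pair the argument types are identical; only the values differ.
results = {
    "exp_small": exception_name(math.exp, 1.0),
    "exp_large": exception_name(math.exp, 1000.0),
    "encode_ascii_text": exception_name("cafe".encode, "ascii"),
    "encode_accented_text": exception_name("caf\u00e9".encode, "ascii"),
    "decode_valid_bytes": exception_name(b"ok".decode, "utf-8"),
    "decode_invalid_bytes": exception_name(b"\xff".decode, "utf-8"),
}
```

A type checker sees float, str and bytes in both halves of each pair, so it has no basis for telling the failing call from the succeeding one.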

kevna · February 12, 2023, 10:21am

Indeed, Python may implicitly raise any exception at any time, but I’m talking primarily about what is explicitly raised in the codebase.
Though it occurs to me that if you were particularly likely to implicitly raise, you could annotate it - such as a loop on input() which is likely to receive KeyboardInterrupt.

I could also declare every function as -> Any to defeat the purpose of type hinting. Many of the arguments in this thread I’ve also heard against type hinting.

Doesn’t mypy already warn if a type doesn’t implement a __method__ you’re requiring of it? It could therefore check the signature of that method. If you know your type is likely to be unsupported you’d typically be wrapping it in a catch as closely as possible.

Not currently, but the static analysis I described would do exactly this (for explicit exceptions). Currently any check of this kind can’t be done at all.

An actual database adapter would likely have a bunch of possible errors. As with any other type hint, you could use a union to describe multiple possible errors. This could get long, but existing type aliases already solve for that.

Exactly, this is what I too am trying to avoid. The codebase I have the good fortune to maintain is littered with catch-everything where it should be catching one or two specific exception types.
This is because, in order to know what errors are explicitly raised, the developer would have had to read tens, perhaps hundreds of thousands of lines of code, including dependencies. Since this is impractical and raising out is unacceptable, they use the only safe option.
Annotating the exception means you know exactly which ones propagate to your new code, allowing you to handle only those.

I’m trying to present an ability to do this in a way that interferes as little as possible with the freedom Python affords its developers, just as the opt-in nature of type hinting does for typing.

Rosuav · February 12, 2023, 12:20pm

Polymorphism would like to say hello.

def english_join(items: list) -> str:
    n = len(items)
    # if 1, as-is; if 2, "x and y"; if 3+, "a, b, c, and d"
    # Left as exercise to reader; remember that your goal is
    # to abuse match/case in the most amusing way possible.
    # Yes, I know this would usually be annotated as taking
    # a Sequence, but even with a concrete annotation, this
    # won't do what you want it to.
    return "TODO"

class DatabaseList(list):
    def __init__(self, query): self.query = query
    def __len__(self):
        with database:
            return database.do_query(self.query, count_only=True)
    def __iter__(self):
        with database:
            data = database.do_query(self.query)
        return iter(data)

print(english_join(["lions", "tigers", "bears"]))
print(english_join(DatabaseList("select name from users")))

How can DatabaseList report issues with the query? list.__len__ is not annotated as raising DatabaseDisconnectedError, so either a subclass can add exceptions (which breaks checked exceptions), or it can’t (which forces DatabaseList.__len__ to swallow the exception).

Or run the code, see a problem, and then you know what you’re trying to handle. In general, exception handling falls into one of two categories:

Catch something you know about and can actually handle. You have specific code for a specific exception. Find out which exception(s) to handle by simply NOT handling them, letting the exception crash the app, and then reading the traceback.
Catch everything, log it, and continue. Good for boundary conditions (eg an HTTP request handler). Catch something very high in the tree (Exception or even BaseException) and do the same thing with every exception: log it (with traceback), maybe report a failure (like an HTTP 500), and return to some main loop.
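The two categories above might look something like this in practice. It is a minimal sketch: parse_port, handle_request and the logger name are hypothetical, and the status codes simply mirror the HTTP example.

```python
import logging

logger = logging.getLogger("app")


def parse_port(text, default=8080):
    # Category 1: a specific exception with a specific, meaningful recovery.
    try:
        return int(text)
    except ValueError:
        return default


def handle_request(handler, request):
    # Category 2: a boundary. Every exception gets the same treatment:
    # log it with its traceback, report a generic failure, keep serving.
    try:
        return 200, handler(request)
    except Exception:
        logger.exception("unhandled error while processing %r", request)
        return 500, "Internal Server Error"
```

Note that the boundary handler never tries to guess what went wrong; logger.exception records the full traceback so the specific exception can be given its own handler later if it turns out to deserve one.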

I’ve never needed to wade through someone else’s source code to figure these things out. Common exceptions may be documented (or incredibly obvious); uncommon exceptions are found when they happen.

If your codebase is full of catch-everythings that don’t log their exceptions, then I’m very sorry for you, but the solution isn’t checked exceptions - it’s better exception-handling policies.

Numerlor · February 12, 2023, 5:19pm

While I wouldn’t want checked exceptions for forcing the handling, I think providing some way to attach exceptions information into annotations could be helpful.

I’m not sure whether type checkers could make any use of the information, but other tools could benefit from it, e.g. for providing suggestions. Exceptions have also been a major pain for me as they’re largely undocumented even when raised directly (for both the stdlib and third party packages), so it’s necessary to either go through Exception and possibly go through recoverable exceptions, or look into the code to see what can be done.

Some standardized way of specifying what should at least directly be expected from a function (and is intended to be a part of its interface) would help in documenting the exceptions and making the practice more prevalent.

Rosuav · February 12, 2023, 5:24pm

That’s a far far smaller set than the exceptions that could be raised from that function. Usually that sort of thing is in the docstring; no idea how valuable it would be to have it in a machine-readable form. For example, list.remove says that it raises ValueError if the value is not present; OTOH set.add doesn’t say that it can raise TypeError if the value isn’t hashable.

So by definition, this is always going to be incomplete. That doesn’t mean it won’t be useful, but I do doubt the value of dedicating syntax to it.

Numerlor · February 12, 2023, 5:33pm

Yes, specifying everything would be an already lost battle, but at least imo the partial information could be useful. Like you said it’s usually in the docstring, but I’ve run into a lot of apis that simply skip documenting the exceptions even if they have their own ones so having something dedicated could help in spreading the practice.

For the syntax I think even something like a Raises type could be added, that’d be used similarly to Annotated. For example def try_something() -> Raises[ReturnedType, ExceptionType1, ExceptionType2]
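Something close to this can already be approximated with the existing Annotated type. The sketch below is only an assumption about how it might look: Raises is a hypothetical marker class (nothing by that name exists in the typing module), and parse_ratio and documented_exceptions are invented for the example. Annotated, get_type_hints, get_origin and get_args are real typing APIs (Python 3.9+).

```python
from typing import Annotated, get_args, get_origin, get_type_hints


class Raises:
    """Hypothetical marker; not part of the typing module."""

    def __init__(self, *exceptions):
        self.exceptions = exceptions


def parse_ratio(text: str) -> Annotated[float, Raises(ValueError, ZeroDivisionError)]:
    numerator, denominator = text.split("/")
    return float(numerator) / float(denominator)


def documented_exceptions(func):
    """Collect exception types from Raises markers on the return annotation."""
    hints = get_type_hints(func, include_extras=True)
    annotation = hints.get("return")
    if get_origin(annotation) is not Annotated:
        return ()
    found = []
    for extra in get_args(annotation)[1:]:
        if isinstance(extra, Raises):
            found.extend(extra.exceptions)
    return tuple(found)
```

Here documented_exceptions(parse_ratio) returns (ValueError, ZeroDivisionError). Type checkers treat the return type as plain float and ignore the metadata, so this documents exceptions for other tools without checking them.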

apalala · February 12, 2023, 7:28pm

There are strong arguments made by the designers of C# against declaring raisable exceptions. Other language designers and software engineers have had similar findings.

https://www.artima.com/articles/the-trouble-with-checked-exceptions

As others have explained on this topic, checked exceptions lead to programmers catching exceptions not to properly handle them but just to get rid of the warnings.

Known exceptions emanating from a module called directly may be handled if there’s something reasonable to do about them. Otherwise all exceptions should be let through.

kevna · February 13, 2023, 12:56am

On the other hand, there are other new (and popular) languages where the designers have implemented returned errors that force you to handle them, such as Go and Rust.

As it happens I’m a big fan of the Rust style: returned errors with helpers for common handlings such as returning it on up the stack. But Rust is very static and strict and different from Python; clearly that model wouldn’t have fit in the Python design.

But the opposite extreme of having errors completely hidden can cause problems too. The zen says “explicit is better than implicit” and I think that a little application of that here is at least worth considering.

I see a lot of people saying every un-anticipated exception should be let through but this seems idealistic - downtime isn’t free and these costs are avoidable. Certainly documentation plays a part in the solution, but I think this offers more than that alone.

steven.daprano · February 13, 2023, 1:35am

Annotations are used for type-checking. Usually statically, but some third-party libraries also use them dynamically. The only reason to put exceptions into annotations is if you intend to check them, i.e. checked exceptions, which you and I agree is not desirable.

Not every problem is a nail to be hammered with annotations. If you want to document exceptions, document them.

That would be checked exceptions which you just agreed would not be desirable.

Dear gawds please tell me you don’t actually do that.

If only we had some standard place to document functions. Something like a documentation string or similar. Perhaps we could make it a dunder, like __strdoc__ perhaps.

Numerlor · February 13, 2023, 2:13am

Not every problem is a nail to be hammered with annotations. If you want to document exceptions, document them.

It’s a standardized way of attaching information to functions, which the exceptions would be. How is it any different from the other uses of annotations? The Annotated type I mentioned already presents a type specifically made to not be type checked.

Dear gawds please tell me you don’t actually do that.

There is code where you can’t afford to let an exception bubble up to the top. Logging is nice and all, but also needs some way of making that data available to you if it’s not used only in your control. So it’s either relying on user reports, or on an internet connection being available and consent from users to send the data over.

Then there’s exceptions that could be nicely handled if the programmer knew about them, but they won’t because they’re not documented, or were skipped over in the text docstring. As an example, if the application presents information about encountered errors to the user, knowing more about them allows you to present the information to the user if it’s something their input caused.

I personally have run into a situation (I think from the os module?) where I was surprised by an exception from a function call where I was already handling OSError and only expecting that.

If only we had some standard place to document functions. Something like a documentation string or similar. Perhaps we could make it a dunder, like __strdoc__ perhaps.

Again, making it an explicit feature could more widely promote the idea of letting users know what you’re raising from the function and what can be expected. A lot of docstrings are also done in their own style which is only useful to human readers and can cause the information to be overlooked.

Rosuav · February 13, 2023, 2:23am

And when that exception happened, it would have been caught by a generic handler (possibly the one that terminates the app), which tells you what the exception is. That is the job of generic handlers. It is then up to the programmer to decide whether the exception needs other sorts of handling.

The novice thinks that his job is to stop the program from crashing.

Numerlor · February 13, 2023, 2:41am

And when that exception happened, it would have been caught by a generic handler (possibly the one that terminates the app), which tells you what the exception is. That is the job of generic handlers. It is then up to the programmer to decide whether the exception needs other sorts of handling.

It’s up to the programmer, but in the first place the programmer has to know about it. By the time the generic handler triggers, some action that was expected to succeed and possibly could’ve continued on has been stopped. A user will also be left clueless as the best they can get from it is a “An error has occurred” because the exception information is only useful to developers.

Rosuav · February 13, 2023, 2:51am

If it could have continued, isn’t that what the generic handler is for?

steven.daprano · February 13, 2023, 7:28am

Who says that the user isn’t a developer? I hear a rumour that some software developers have been known to use software themselves.

Even if they are an end-user, a traceback is still useful: they can google it, they can report it as a bug, they can ask Reddit or their local admin.

The only way they get a “An error has occurred” is if the application or library developer suppresses the actual exception (possibly in order to satisfy some type checker) and raises a generic runtime error with a content-free message.

That is precisely one of the problems with checked exceptions – it encourages developers to suppress the actual exception (in order to satisfy the type checker) and raise a generic “An error has occurred” error in its place.

Do you want meaningless, pointless generic runtime errors that are nearly impossible to debug? This is how you get them.

steven.daprano · February 13, 2023, 8:37am

Exceptions which are explicitly raised in the source code are (in general) only a small fraction of the exceptions that can be raised. It is only a small exaggeration to say that the only line of Python code which cannot raise an arbitrary exception is pass. An exception checker that only considered explicit raise statements would be as useful as a type checker that only tracked the types of variables and functions starting with “Q”.

If all it took to satisfy the type checker was to not raise an exception, they wouldn’t be checked exceptions!

You misunderstand my objection. I’m not saying that people would declare they raise BaseException to defeat the type checker. I’m saying that, in general, they would be forced to in order to satisfy the type checker.

Suppose you call a function that explicitly raises (let’s say) CoffeePotOutOfCoffeeError. Your function can’t deal with that, only the main application function can (by reporting it to the user), so your function should just let the exception pass.

But it can’t, because the type-checker sees you are calling a function that raises CoffeePotOutOfCoffeeError, so you have to satisfy the type-checker by either catching it and dealing with it (which you can’t do!) or by declaring that your function may raise CoffeePotOutOfCoffeeError.

Now every function that calls your function has to do the same. That CoffeePotOutOfCoffeeError declaration will spread virally from function to function, as every function which calls yours directly or indirectly will need to declare they can implicitly raise it in order to satisfy the checker.

Unless you catch the exception and throw it away, thus leading to your end-users suffering acute caffeine withdrawal when they try to make a coffee and discover too late that the coffee pot is out of coffee.

And of course it’s not just CoffeePotOutOfCoffeeError; the same argument applies to every exception which you cannot deal with and would like to just pass through to your function’s caller to deal with (which will be the majority of exceptions). You either declare it, or catch and suppress it. Now that might not literally be BaseException, but it will typically be a pretty large subset of exceptions.
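The call chain in the coffee pot scenario can be sketched in ordinary Python. All function names here are hypothetical; the comments mark where a checked-exception scheme would demand a declaration that today's Python does not need.

```python
class CoffeePotOutOfCoffeeError(Exception):
    pass


def pour(cups_left):
    if cups_left == 0:
        raise CoffeePotOutOfCoffeeError("the pot is empty")
    return cups_left - 1


def serve_guest(cups_left):
    # Cannot do anything useful about an empty pot, so it does not catch.
    # Under a checked-exception scheme this function would have to declare
    # CoffeePotOutOfCoffeeError as well.
    return pour(cups_left)


def serve_table(cups_left, guests):
    # Same again: one more declaration, one level further from the raise.
    for _ in range(guests):
        cups_left = serve_guest(cups_left)
    return cups_left


def main(cups_left, guests):
    # The only place that can act on the problem: report it to the user.
    try:
        remaining = serve_table(cups_left, guests)
    except CoffeePotOutOfCoffeeError as exc:
        return f"Sorry: {exc}"
    return f"{remaining} cups left"
```

As written, only pour and main mention the exception at all. The intermediate functions stay oblivious, which is exactly the property a checker would take away.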

I’m not talking about the case where an object doesn’t define __len__. Of course mypy and other type checkers should be able to recognise that and flag it. I’m talking about objects which do define __len__. In that case that method almost surely could raise anything.

Remember, it’s not just what is explicitly raised inside the method. It is also what could be implicitly raised by any function or expression called inside the method, and anything called by those functions, and so on. E.g. if anything in your method directly or indirectly divides, then your method can raise ZeroDivisionError.

You have my sympathy, but I think you have the wrong solution.

We know from the experience of Java, and the refusal of many, many smart language designers to follow in Java’s footsteps and introduce checked exceptions, that static checking of exceptions is an anti-pattern.

We also know that the right way to find out what undocumented exceptions can be raised is by extensive testing. Tests, tests and more tests.

If you disagree (as is your right of course) I think you need to demonstrate practical examples of languages where checked exceptions work well, without the controversy of Java checked exceptions. What do these languages do differently?