PEP 678: Enriching Exceptions with Notes

FYI, the helper function discussion took place here, between myself and (I assume) Zac. Thanks for writing out an implementation of the helper function.

From the PEP:

code which retries an operation on error may wish to associate an iteration, timestamp, or other explanation with each of several errors - especially if re-raising them in an ExceptionGroup.

I mentioned this in the context of __note__ for EG’s but this makes me think of a use case for an object implementing a __str__(). Say I wanted to capture the number of retries in a note. My error message might be something like:

"Tried to commit this change 7 times"

Now imagine I need to update this note. I can’t carry around the number of iterations previously tried, so to increment this I’d have to parse the string to extract the number, increment it, then store the new string on __note__.

Instead, a better way to do this would be to store an object on __note__ that has an API to increment the count, and a __str__() that returns a formatted string containing the count.
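As a rough sketch of this thought experiment: the class name RetryNote and its methods are hypothetical, and the PEP as drafted would only accept a string or None on __note__, so this shows the alternative being imagined rather than anything the PEP provides.

```python
class RetryNote:
    """Hypothetical note object that tracks a retry count."""

    def __init__(self, action, count=0):
        self.action = action
        self.count = count

    def increment(self):
        # Update the count directly instead of parsing it out of a string.
        self.count += 1

    def __str__(self):
        # The message is only formatted when the note is displayed.
        return f"Tried to {self.action} {self.count} times"


note = RetryNote("commit this change")
for _ in range(7):
    note.increment()
print(note)
```

The trade-off is that formatting is deferred until display time, so a bug in __str__() would only surface while the traceback is being rendered.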

This is just a thought experiment based on this comment in the PEP. I don’t have a specific use case in mind, although I can think of a couple of places where this might enhance exception information.

The PEP does not dive into the cost of yet another layer of code around handling exceptions, and what to do if that code itself raises an exception. The last comment hints at the possible problems this could entail: what to do about format strings in the __note__ and other ways this feature could be extended/exploited.

In the rejected ideas section, there is the possibility of subclassing Exception in pure Python. It seems like this would be something to try out before adding another layer of presentation to exceptions. Has a project implemented this as an interim fix, and how did it go?

@quantzur, yep, that was me - I’ll add a link to that discussion in the PEP.

@barry, the idiom here would be to raise an ExceptionGroup where the inner exceptions had notes “First attempt to commit this change”, “Second attempt…”, and so on.

The attempt count can be tracked as a local variable, or if you want to use it elsewhere you can assign to e.g. an err._n_attempted_commits = 7 attribute on the exception object and avoid parsing!
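A minimal sketch of that idiom, assuming a hypothetical try_commit helper and a commit callable that may fail. Plain attribute assignment is used so the example runs on interpreters without ExceptionGroup or built-in note display.

```python
def try_commit(commit, max_attempts=3):
    """Call commit() up to max_attempts times; return (result, errors)."""
    errors = []
    for attempt in range(1, max_attempts + 1):
        try:
            return commit(), errors
        except Exception as err:
            # The note is a finished string; the count lives in its own
            # attribute, so nothing ever has to parse the note.
            err.__note__ = f"Attempt {attempt} of {max_attempts} to commit this change"
            err._n_attempted_commits = attempt
            errors.append(err)
    return None, errors


def always_fails():
    # Stand-in for a real commit operation (hypothetical).
    raise ConnectionError("database unavailable")


result, errors = try_commit(always_fails)
print([err.__note__ for err in errors])
```

On an interpreter that provides ExceptionGroup, the collected errors could then be raised together, for example as ExceptionGroup("commit failed", errors).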

@mattip, avoiding the risk of delayed formatting errors is another good reason to insist that __note__ must be a string or None! This doesn’t introduce any additional concerns about format strings, because the formatting of the note itself must be finalised before it can be added to the exception, and the traceback code only needs to deal with strings as format arguments (which is safe).

The runtime cost amounts to an attribute assignment and typecheck if a note is assigned, and one branch in the traceback display code to show it. This is pretty cheap.
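To make that cost concrete, here is a pure-Python approximation. It is a sketch only: NotedError and format_exception_with_note are hypothetical names, and the real change would live in the interpreter's exception and traceback code rather than in a subclass.

```python
import traceback


class NotedError(Exception):
    """Hypothetical exception that enforces a str-or-None note."""

    _note = None

    @property
    def __note__(self):
        return self._note

    @__note__.setter
    def __note__(self, value):
        # The typecheck happens once, at assignment time.
        if value is not None and not isinstance(value, str):
            raise TypeError("__note__ must be a string or None")
        self._note = value


def format_exception_with_note(exc):
    """Format the exception line, then append the note if there is one."""
    lines = traceback.format_exception_only(type(exc), exc)
    note = getattr(exc, "__note__", None)
    if isinstance(note, str):  # the single extra branch at display time
        lines.append(note + "\n")
    return "".join(lines)


err = NotedError("boom")
err.__note__ = "while committing"
text = format_exception_with_note(err)
print(text, end="")
```

Because the note is validated when it is assigned, the display code never has to call arbitrary user code while rendering a traceback.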

My personal motivation for writing this PEP was that I found it was impossible to support satisfactory error reporting in Hypothesis with ExceptionGroup. Each exception has additional data (such as the minimal failing example), which should remain closely associated. Currently we just print this information and each respective traceback before raising a MultipleFailures exception, but printing some data and then raising an ExceptionGroup splits up the relevant logs even if none of the inner exceptions are handled without being displayed to the user.

I would like backports.exceptiongroup to display __note__ for inner exceptions, but for the note to be displayed at all, this has to be part of either the exceptiongroup class or the traceback-printing code. The latter seems much more elegant to me, and thus this PEP.

The PEP was submitted to the SC with a note I don’t see discussed:

Why require that __note__ may only be a string or None, instead of any object? Restricting to string-or-None simplifies and stabilizes the API, easing interop and avoiding the risk of delayed errors when calling __str__(). We can also loosen, but not tighten, this constraint without breaking backwards compatibility.

Once libraries start depending on __note__ being either str or None, I’m afraid that loosening won’t be much easier than tightening. See the add_exc_note function in the PEP: projects that adopt it would start failing if the constraint is loosened.
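To illustrate the concern, here is a sketch of a helper in the spirit of add_exc_note. The name append_note and its body are assumptions for this example, not the PEP's code. It concatenates onto an existing note, which works only while that note is a string.

```python
def append_note(err, note):
    """Append a note to err, assuming any existing note is a string."""
    existing = getattr(err, "__note__", None)
    if existing is None:
        err.__note__ = note
    else:
        # Relies on the str-or-None guarantee for the existing note.
        err.__note__ = existing + "\n\n" + note


ok = ValueError("bad value")
ok.__note__ = "first note"
append_note(ok, "second note")

# If the constraint were loosened, a callable note would break the helper.
loosened = ValueError("bad value")
loosened.__note__ = lambda: "deferred note"
try:
    append_note(loosened, "second note")
except TypeError as exc:
    failure = exc
else:
    failure = None
print(type(failure).__name__)
```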

(I see that as a minor detail, the other arguments are pretty convincing.)

Since friendly-traceback is mentioned in the PEP, I can say that it would be much more useful for friendly-traceback if we could have something like:

__note__ = lambda: _("Translatable text")

instead of limiting __note__ to be a string or None.

In my experience (GNU Mailman was one of the first i18n’d Python applications), translating exception strings isn’t a good idea. They aren’t generally useful for end users and it makes it much harder to search the internet for help. To me, __note__ would fall under the same guidelines.

The original motivation for creating friendly-traceback [1] was to provide translations for error messages. That is a main reason why it got the support of the PSF [2]. It is true that friendly-traceback does much more than provide translations, but that is an important part of it.

The only reason I mention all this is that the PEP authors deemed it worthwhile to mention friendly-traceback as a possible use case in the PEP.

[1] friendly-traceback 0.5.19

[2] Python Software Foundation News: Grants Awarded for Python in Education

Fair enough. I can see its utility for educational purposes, but I’d be wary of releasing software that included translatable exceptions. I think it would be difficult to understand bug reports with translated exception messages.

I actually agree with you for exceptions from Python itself and code in the standard library. I could however see the use for packages like Pygame, and perhaps a third-party version of the Turtle module.

With Friendly-traceback, it is already possible for third-party projects to effectively add translatable information to exceptions, much more effectively than the proposed __note__ field. However, very few people know about it. The addition of a __note__ field might motivate people to look into providing translations for some important information, especially for packages directed at beginners.

I’m not familiar with how the translation works. Is there a reason why the translation cannot happen before the string is assigned to the __note__?

(Apologies for what is becoming a derailment of the original topic.) Your question is a very good one, which I have not seen discussed before. Usually, when people translate strings, the translation is done “all at once” before the information is shown to a user. The most common example is that of a website: you switch language and you effectively see the entire content in the new language. However, in some other situations, it makes sense to allow the language to be changed easily.

The approach I use in friendly-traceback is “on demand” translation. This allows a user to change language during an interactive session. Sometimes the translation of a particular phrase is not available in their preferred language (say Italian, which is currently only 10% complete) but might be available in a closely related language (like Spanish or perhaps French). An actual example of a missing translation:

Did you forget a colon `:`?
>>> set_lang('it')  # changing to Italian
>>> hint()
Did you forget a colon `:`?
>>> set_lang('fr')
>>> hint()
Avez-vous oublié d'ajouter les deux points `:` ?

This type of dynamic translation is not possible when a string is set at the source. And a user would not want to first switch language, then recreate the traceback entirely to be able to understand what was written in the note if the original did not make sense.
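A toy sketch of the difference between a note translated at the source and one translated on demand. The catalog, set_lang, and _ defined here are stand-ins written for this example and are not friendly-traceback's implementation.

```python
# Toy catalog: French has the phrase, Italian does not (yet).
_TRANSLATIONS = {
    "fr": {"Did you forget a colon `:`?": "Avez-vous oublié d'ajouter les deux points `:` ?"},
    "it": {},
}
_current_lang = "en"


def set_lang(lang):
    global _current_lang
    _current_lang = lang


def _(message):
    """Look the message up at call time; fall back to the original text."""
    return _TRANSLATIONS.get(_current_lang, {}).get(message, message)


eager_note = _("Did you forget a colon `:`?")         # frozen at creation
lazy_note = lambda: _("Did you forget a colon `:`?")  # translated on demand

set_lang("it")
italian = lazy_note()  # no Italian entry, so the English text comes back
set_lang("fr")
french = lazy_note()
print(italian)
print(french)
print(eager_note)
```

The eager string stays in whatever language was active when it was created, while the callable follows later language changes.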

Commercial products usually only make complete translations available: either your language is entirely supported, or not at all. Open source projects rely on volunteers to provide translations, and often partial translations are better than none at all; having the ability to look at another language closely related to your preferred one (if you don’t understand the original English message) can be helpful.

= = =
To go back to one of Barry Warsaw’s points: if I set French (my native language) as the default and get an explanation that doesn’t help me enough, I can switch language to English, type in a command to see the note content in English, and then do an Internet search with the English version of the error message.

= = =
Finally, I do note that this PEP was primarily motivated by adding explanations for chained (or re-raised) exceptions, in a standard concept. However, since friendly-traceback was mentioned in the PEP, I couldn’t leave the impression that having the note field be a simple string was really the best option for friendly-traceback.

Could friendly traceback save the original note in another field on the exception (say e._original_note) and then re-translate when a different language is selected as needed? How does it handle other parts of the traceback, like the exception str()?

(No need to apologize, you’re bringing up a valid point).

Currently, friendly-traceback shows the original error message untranslated. It then uses the exception type and the original message to figure out what other information to give to the user; all this additional information is translatable. Often, this information will include a rephrasing of the original error message that can be translated.

The way I do this is to painstakingly add one example at a time of different error messages, which I then use as a starting point. For runtime errors (as opposed to SyntaxError), I do some frame inspection to determine what objects are in play and try to figure out additional information from there. For SyntaxError, I try to see if I can change one or a few tokens to transform it into valid syntax.

The idea I have is to give people a mechanism to add translations to their own exceptions (via the note field). I could then probe to see if there might be some translatable content via something like:

if hasattr(exc, '__note__') and callable(exc.__note__):
    info = exc.__note__()
    # add info to what is given by friendly's why() function
...

The typical pattern would be something like:

Exception raised (during interactive session, including IPython, Jupyter, etc.)
>>> from friendly import why
>>> set_lang('fr')
>>> why()  # includes the information from note in French if possible.
[explanation given here]

If they don’t provide translation, then the information is shown as is - unless it is a case that I have already added to the known cases.

Since the exception was raised before friendly was imported, the content of the note field has already been set. If it is a function, then it can be potentially translated.

Thank you for explaining all this.

Is the info that is returned from the note function in your example supposed to be displayed by python’s built in traceback? If not, can you design your mechanism to use any other field name on the exception?

Yes, I would think that the information in the note function should definitely be shown by Python’s built-in traceback. Forgetting about friendly-traceback for a moment, I think that this addition of a note field is an excellent idea.

As for friendly-traceback, it could make use of any other field name added to the exception. However, I doubt that there would be very much support to add yet more “approved” attributes to the standard Python exception.

I don’t think you need the new field to be approved in any way, you just define it as the friendly traceback api.

The only reason we need the note to be approved as part of the language is because we want to change the interpreter’s built in traceback display code to display it.

A couple of other comments about translations. flufl.i18n is a library I’ve maintained for many years, and which was born from a refactoring of the original GNU Mailman code. There are basically two ways to think about translations. There is a “simple” API which supports translating simple applications like CLIs, where there’s only one language context in effect at a time. There’s another API which is a little more complicated (but I’d argue still elegant) for situations where you can have multiple translation contexts in play at any one time. In the case of Mailman, imagine that you need to translate a page for someone taking an action on a web page, and that action generates a notification to two list owners and a list member. Now, imagine that all four actors of that action have different preferred languages. The service is crafting the notification in one place and then needs to send or display that notice in four different languages. flufl.i18n can and does handle this just fine.

Remember too that translations often have placeholders where runtime data needs to get interpolated into the post-translation string and that languages can and do reorder those placeholders. Maybe that data is only available at the point at which the translated string is being used to notify the user. So in general you have to delay both the translation and the substitution as late as possible to the point where you are actually going to use the final translated string.
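A small sketch of why both steps have to be deferred, using the standard library's string.Template. The catalog, addresses, and render_notice function are made up for illustration and do not reflect flufl.i18n's API.

```python
from string import Template

# The same notice in two languages; the placeholders appear in a
# different order in each translation.
TEMPLATES = {
    "en": Template("$member posted to $list_name"),
    "fr": Template("La liste $list_name a reçu un message de $member"),
}


def render_notice(lang, **data):
    """Pick the template for lang, then interpolate the runtime data."""
    template = TEMPLATES.get(lang, TEMPLATES["en"])
    return template.substitute(**data)


# One action, several recipients, each with their own preferred language.
recipients = {"owner@example.com": "en", "membre@example.com": "fr"}
notices = {
    address: render_notice(lang, member="Sam", list_name="python-dev")
    for address, lang in recipients.items()
}
for address, text in notices.items():
    print(address, "->", text)
```

Named placeholders let each translation reorder the data freely, which is only possible if substitution happens after the language has been chosen.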

This is a real-world example. I have no idea whether __note__ as it’s currently defined, or friendly-traceback can or even needs to handle cases like this.

@encukou: Once libraries start depending on __note__ being either str or None , I’m afraid that loosening won’t be much easier than tightening. See the add_exc_note function in the PEP: projects that adopt it would start failing if the constraint is loosened.

You’re correct of course, Hyrum’s Law is a harsh master :sweat:


@aroberge, thank you for your detailed comments on translation. I really appreciate your (and @barry’s) input, since I know relatively little about this.

Would it make sense to attach the __note__ as a string with the then-current language settings, and an additional e.g. _friendly_note attribute with the translatable string? This would show the best-guess translated note, but on set_lang() you could clear the current note and either re-translate it, or wait for the user to call why(). I don’t have much sense of whether this convenience-vs-rigidity would be an improvement overall though.
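Roughly, I am imagining something like the following sketch. Every name here (translate, attach_note, the live_exceptions parameter, and the toy catalog) is hypothetical, and I do not know how well it would fit friendly-traceback's internals.

```python
_CATALOG = {"fr": {"Check the file name": "Vérifiez le nom du fichier"}}
_lang = "en"


def translate(message):
    """Translate with the current language, falling back to the source."""
    return _CATALOG.get(_lang, {}).get(message, message)


def attach_note(err, message):
    err._friendly_note = message       # untranslated source string
    err.__note__ = translate(message)  # best guess at raise time


def set_lang(lang, live_exceptions=()):
    """Switch language and refresh the notes of any known exceptions."""
    global _lang
    _lang = lang
    for err in live_exceptions:
        source = getattr(err, "_friendly_note", None)
        if source is not None:
            err.__note__ = translate(source)


err = FileNotFoundError("data.csv")
attach_note(err, "Check the file name")
note_before = err.__note__
set_lang("fr", [err])
note_after = err.__note__
print(note_before)
print(note_after)
```

The built-in traceback would always see a plain string, while the extra attribute keeps enough information to re-translate later.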

You also mention that it is “already possible for third-party projects to effectively add translatable information to exceptions, much more effectively than the proposed __note__ field. However, very few people know about it.” I’d love to hear more! If you mean “Registering custom error types - friendly-traceback 0.5.19”, I don’t think this can replace __note__ without an equivalent way to attach extra information beyond the exception type.

However, since friendly-traceback was mentioned in the PEP, I couldn’t leave the impression that having the note field be a simple string was really the best option for friendly-traceback.

If you would prefer, I’m happy to remove the mention of friendly-traceback or to add a caveat such as “(e.g. friendly-traceback, albeit without translations)”.


Overall, I’d prefer not to incur the complexity of deferring note evaluation to support translations, on the basis that this matches the status quo for exception messages and __note__ can be translated in the same way.

If the steering council would prefer a PEP which proposes built-in support for translation of error messages I’d be happy to support this, but do not have the expertise to lead such an effort.