PEP 678: Enriching Exceptions with Notes

encukou · February 10, 2022, 10:16am

I can’t speak for friendly-traceback, but it seems that it would currently have trouble identifying where one note starts and another ends, so it wouldn’t be able to translate individual pieces (and leave unknown ones alone).
Or a GUI would have trouble hiding/collapsing individual notes.

Was a tuple of strings considered? The way to add a note could become something like:

import contextlib

@contextlib.contextmanager
def add_exc_note(note):
    try:
        yield
    except Exception as err:
        # Append to the existing tuple of notes instead of overwriting it.
        err.__notes__ += (note, )
        raise

That would have an additional benefit: writing a composable note-adding library (one that plays well with others) would be easier to do than one that removes other notes.
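
To illustrate the composability point, here is a minimal sketch of two independent layers each adding a note through such a context manager. It assumes the tuple-valued `__notes__` attribute suggested above, read with a `getattr` default so that the first note has something to extend; `collect_notes` is a hypothetical helper that exists only for the demonstration.

```python
import contextlib


@contextlib.contextmanager
def add_exc_note(note):
    try:
        yield
    except Exception as err:
        # Assumption: notes live in a tuple on the exception; start from an
        # empty tuple when no note has been attached yet.
        existing = tuple(getattr(err, "__notes__", ()))
        err.__notes__ = existing + (note,)
        raise


def collect_notes():
    """Raise through two nested note-adding layers and return the notes."""
    try:
        with add_exc_note("outer"):
            with add_exc_note("inner"):
                raise ValueError("boom")
    except ValueError as err:
        return err.__notes__


notes = collect_notes()
```

Neither layer needs to know about the other: each one only appends, so the innermost note comes first and nothing is lost.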

IMO, the expectation of what libraries should do when adding notes is pretty important to get right, even if the PEP says it’s not “a core part of the proposal”. Where else should it be specified? I don’t think interoperation with other libraries should be left to each individual library. Today’s libraries that want to set notes might be “top-level” – you probably won’t combine asyncio, trio and Hypothesis in unpredictable ways – but that’ll change if exception groups are successful.

aroberge · February 10, 2022, 11:52am

I think adding such a caveat might be appropriate for this PEP. From my point of view, something like the following might be the best:

programming environments for novices can provide more detailed descriptions of various errors, and tips for resolving them (e.g. friendly-traceback, albeit without translation; see [3]).

And, at the bottom:

[3] While outside the scope of this PEP, if desired by a particular project, support for translation could be added by an additional custom field that would not be shown by the standard Python traceback mechanism; perhaps something like

_translatable_note = lambda: _("Some translatable text.")

===
Granted, this would not permit composing notes (as Petr Viktorin pointed out). However, this would allow the user to focus on solving one problem at a time, without being overwhelmed by too much information.

iritkatriel · February 10, 2022, 1:31pm

@aroberge
I’m still confused about the translation problem. I understand that the note added to the standard traceback would not be translated, only the outputs of various friendly-traceback functions would be (is this correct?). In that case, why does the text need to be repeated in a lambda attached to another attribute on the exception? Could the friendly traceback code not invoke translation on e.__note__, by calling _(e.__note__) or something along those lines?

@encukou
I think a tuple of strings might be a good idea, to keep notes separated until it comes time to display them (I think you’re right that people will add more notes than we expect, and not only with exception groups).
Though, it would be just as easy to wipe out the existing notes with err.__notes__ = (note, ) . Should we consider an add_note() function on BaseException? Then __note__ can be a read-only attribute that returns a tuple.

encukou · February 10, 2022, 2:29pm

To me, that doesn’t sound like something to guard against.
add_note might be easier to implement if strict type checking for str is needed at the C level… but then again, that check might be deferred to when the note is displayed.
But that’s getting too deep into the implementation, you’re the expert there.

aroberge · February 10, 2022, 2:36pm

My initial response was to write “No, and here’s why…” … but as I wrote and edited the detailed answer below, I realized that, in a realistic scenario, it’s more like “Actually, something like this might work, but in a very slightly different way.”

And perhaps, this is enough to justify not adding the end note I mentioned in a previous response to Zac. This is likely the main point of this reply, and most people can stop reading here.

Apologies to those who were inundated by my previous replies, which were not as well thought out as I had hoped.

===

I was blinded by the fact that, in friendly-traceback, I keep adding more and more cases (more than 600 translated strings so far, with many more to come), and different translations are updated at an irregular rate: so, I consider it important to give multilingual users the possibility to change their preferred language at any time so that they can find the translation that is the most helpful to them.

In this more realistic scenario, for a third-party library, the number of strings to translate would likely be much smaller, and would not have translators struggle in trying to keep up with a constantly changing number of strings to translate. In such a scenario, e.__note__ could be a string and everything would work fine.

However, your suggestion of having friendly_traceback invoke _(e.__note__) would definitely not work. To understand it all, I need to describe how gettext works, probably going into way too much detail.

Imagine that you create a simplified Turtle library for an international audience. You write custom error messages and you and your team intend to add translations of these error messages. [Suppose one error message is "A red turtle cannot turn left."]

You surround every string to be translated by a function call _(). You then use pygettext to scan your files so that every translatable string can be added to a template file (extension .pot). For gettext, a translatable string is one that is an argument of _(). So, e.__note__ in the source code is not a string that is translatable. However, e.__note__ = _("A red turtle cannot turn left.") does contain a string that is translatable within your library.

From the template file, you and your team create a corresponding translation file (extension .po) for each language that you support. Besides serving as the marker that gettext uses to identify the strings to be translated, the _() function is one you define within your own package so that it can find your translation files.
You package everything together (python files and language files [.po and also .mo]) and upload it to PyPI.

In the description of your library you would likely say “our package includes English, French, and Italian versions”. You also give your users the possibility of choosing their preferred language for your own module/package. This is something that they have to set themselves, most likely as the very first instruction. A Spanish speaker might decide they prefer Italian over the English default.

friendly-traceback defines its own _() function. When it is invoked, it looks in its own collection of translations to find the appropriate string. I can assure you that its own collection definitely does not and will not include the string "A red turtle cannot turn left.". So, it cannot use its own _() function to take the content of an untranslated e.__note__ and provide a translation.
However, hopefully your library would already have created the appropriate translation of e.__note__ by this point.
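
The per-package lookup described above can be sketched roughly as follows. This is only an illustration: plain dictionaries stand in for the compiled .mo catalogs, and `make_translator`, both catalogs, and the Italian strings are hypothetical. The one behaviour borrowed from gettext is that a message with no entry in the catalog comes back unchanged.

```python
def make_translator(catalog):
    """Return a _() function bound to one package's own catalog."""
    def _(message):
        # Like gettext, fall back to the original string when the catalog
        # has no translation for it.
        return catalog.get(message, message)
    return _


# Hypothetical Italian catalog shipped with the turtle library.
turtle_gettext = make_translator({
    "A red turtle cannot turn left.":
        "Una tartaruga rossa non può girare a sinistra.",
})

# Hypothetical catalog shipped with a different package; it knows nothing
# about the turtle library's messages.
friendly_gettext = make_translator({
    "Some translatable text.": "Un testo traducibile.",
})

note = "A red turtle cannot turn left."
translated_by_library = turtle_gettext(note)    # found in its own catalog
translated_by_friendly = friendly_gettext(note)  # not found, unchanged
```

Only the function bound to the turtle library's catalog can translate the turtle library's message, which is why the translation has to happen before the string is stored in e.__note__.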

iritkatriel · February 10, 2022, 6:33pm

I don’t think it’s harder to type-check with/without add_note, it’s really a matter of which API works better for users of notes.

barry · February 10, 2022, 6:47pm

This is actually an important point. If you’re building an i18n’d application, you must only include full sentences as translatable strings. There are two reasons: some languages change the order of placeholders, and some languages essentially cannot translate sentence fragments. So unless you were really careful about how you concatenate notes, you might just end up with untranslatable strings there.

barry · February 10, 2022, 6:50pm

This is why I suggested that an object implementing __str__() should/could/would be allowed. If you’re going to complicate things enough to handle the case of anything more than just a str, then I think it makes sense to think of a better mechanism.

That said, if the PEP accepts not just concrete str objects, but also subclasses of str, then maybe that would be enough to implement more complicated applications?

guido · February 10, 2022, 6:55pm

I would rather revert the feature than make it as complicated as some folks propose. A lambda returning str, something with a __str__ method, a tuple of strings… Horror!

barry · February 10, 2022, 7:01pm

At least in my experience, that would most likely be something like:

def set_note(e: Exception, color: str, direction: str):
    e.__note__ = _('A $color turtle cannot turn $direction')

The string that translators would have to translate is "A $color turtle cannot turn $direction" and translators know that $color and $direction are placeholders and they are free to change the order of those placeholders based on their language’s grammar etc.
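
As a small illustration of why whole sentences with named placeholders matter, here is a sketch using the standard library's `string.Template`. The `catalog` dictionary and the `render` helper are hypothetical, and the "translation" is just an English rewording that stands in for a language whose grammar puts the direction first.

```python
from string import Template

source = "A $color turtle cannot turn $direction"

# Stand-in for a real translation: the placeholders appear in a different
# order than in the source sentence.
catalog = {source: "Turning $direction is impossible for a $color turtle"}


def render(msgid, **values):
    """Look up the full sentence first, then fill in the placeholders."""
    translated = catalog.get(msgid, msgid)
    return Template(translated).substitute(values)


message = render(source, color="red", direction="left")
```

Because substitution happens after the lookup, the translator decides where each placeholder goes. Concatenating separately translated fragments would fix the order in the code instead.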

FWIW, the flufl.i18n library grabs its substitutions at runtime from the surrounding scopes, globals then locals, so you don’t have to repeat yourself. Yay for sys._getframe()!
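
For readers unfamiliar with the trick, here is a rough sketch of the general idea. It is not flufl.i18n's actual code: the catalog lookup is omitted entirely, and this `_` only demonstrates pulling placeholder values from the caller's globals and then locals via `sys._getframe()`.

```python
import sys
from string import Template


def _(template):
    """Fill $placeholders from the caller's scope (translation omitted)."""
    caller = sys._getframe(1)
    # Globals first, then locals, so local names win on conflicts.
    namespace = dict(caller.f_globals)
    namespace.update(caller.f_locals)
    # safe_substitute leaves unknown placeholders in place instead of raising.
    return Template(template).safe_substitute(namespace)


def describe_turtle(color, direction):
    # No need to pass color and direction explicitly: _() finds them in
    # this function's local variables.
    return _("A $color turtle cannot turn $direction")


result = describe_turtle("red", "left")
```

Note that the values come from the scope where `_()` is called, so a string stored untranslated on an exception and expanded later elsewhere would no longer see these names.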

iritkatriel · February 10, 2022, 7:21pm

Is that the scope where the note is created or the scope where the traceback is being rendered?

Remember that notes have the sole purpose of adding something to the interpreter’s builtin traceback display. They are not intended to be something that applications use for other purposes. Perhaps friendly-traceback should be removed from the use cases for notes altogether, since it is replacing the default traceback by something else.

I don’t think that allowing any-object-with-str is a solution re Petr’s concerns. The issue he raised is how different libraries would coordinate the use of note between them. So this calls for a very prescribed scheme, rather than a very permissive one.

aroberge · February 10, 2022, 8:30pm

Sorry, friendly-traceback adds information to the default traceback.

Edit: one of its goals is to enable users to learn how to understand all the information contained in a standard traceback.

barry · February 10, 2022, 8:38pm

It’s generally the scope in which the _() function is called. There are some corner cases where that needs to be deferred to other scopes, but those aren’t common (e.g. when you want to mark a module global as translatable string for gettext but you need to translate and expand it later in some other call).

barry · February 10, 2022, 8:40pm

Personally I think so, but as I’ve said I’m wary of translatable exceptions for anything other than educational purposes.

iritkatriel · February 10, 2022, 11:06pm

Right, but you build your own representation of the augmented traceback. You’re probably not going to replace all of this with notes, whether we make them strings or tuples of strings or lambdas.

Zac-HD · February 11, 2022, 7:39am

__note__ is designed to have no semantics beyond “this should be shown to the user”. A string expresses this well, whereas a tuple is less ergonomic for the core use of displaying a string and invites complicated parsing schemas. If you want to extract part of a note later, stick a copy on another attribute! (both to translate, and for targeting str.replace)

As PEP author, I’m ruling translation out of scope for my proposal and will remove the example of friendly-traceback. While I see enormous value in localization, especially for beginners, it’s also clearly outside my expertise and more complicated than is tenable for __note__. Unless we have specific reason to think that __note__ will make future translation support more difficult (beyond the obvious “here’s another message to handle”), I will continue with the current proposal.

IMO, the expectation of what libraries should do when adding notes is pretty important to get right, even if the PEP says it’s not “a core part of the proposal”. Where else should it be specified? I don’t think interoperation with other libraries should be left to each individual library. Today’s libraries that want to set notes might be “top-level” – you probably won’t combine asyncio, trio and Hypothesis in unpredictable ways – but that’ll change if exception groups are successful.

To be more precise, adding a utility function such as contextlib.add_exc_note() is not a core part of the proposal. Interoperation between libraries is a core part of the proposal - I say “We have not identified any scenario where libraries would want to do anything but either concatenate or replace notes”.

Concatenation vs replacement seems like a decision for each library, though if you wanted it we could add a new method BaseException.with_note(note, *, replace=False) as a nudge one way or the other - so long as we don’t nudge people away from exception chaining!

encukou · February 11, 2022, 9:42am

OK
But, do we agree that there can be multiple notes added to a single exception, and this PEP is the place to specify/recommend how that should work?

Why is it displaying a (single) string?
Let’s say that in my GUI, I’d like to separate individual notes visually, with a horizontal line.
Should I parse the note to look for \n\n, and hope all libraries use that as a separator?

“Single string” is not the shape of the underlying data. I even think the opposite of what you say: serialization to a single string invites parsing schemas.
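
A quick sketch of the problem, assuming libraries agree to join notes with a blank line (the separator and the example notes are made up for illustration):

```python
# Two notes, the second of which happens to contain a blank line itself.
notes = ("Failing example: x=0", "Traceback context:\n\nsee line 3")

# Serializing to a single string for display ...
display_text = "\n\n".join(notes)

# ... and then trying to recover the individual notes by parsing.
recovered = display_text.split("\n\n")
```

Here `recovered` has three entries although only two notes were added, so a GUI that draws a separator per entry would split the second note in half. Keeping the notes as a sequence avoids the guessing.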

As for being ergonomic – a loop sounds like one more line (and indent level). In a function that is currently drawing ASCII diagrams, is that really an issue?

I can’t really do that if a third-party library is raising the exception.

If we have the user make a decision, the user should be likely to have some info that helps them make it. In this case, the right course of action depends on… what?
I can only think of things related to the pre-existing note or the intended display – nothing the author of asyncio or hypothesis would know.

iritkatriel · February 12, 2022, 9:46am

If we add this then ordinary users don’t need to interact with the dunder attribute anymore - they call this function and the string appears in the traceback. So we can make it read-only, rename it to __notes__ and let its value be a tuple of the strings. Specialized users (like ones requiring translations) can then get the individual notes from __notes__.
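
A minimal sketch of how that could look, written as an ordinary subclass so it does not depend on interpreter support. `NotedError` and the private `_notes` attribute are hypothetical names; in the proposal the method would live on BaseException itself.

```python
class NotedError(Exception):
    """Hypothetical exception illustrating add_note() with read-only notes."""

    def add_note(self, note):
        if not isinstance(note, str):
            raise TypeError("note must be a str")
        # Always append; there is no way to drop earlier notes via this API.
        self._notes = getattr(self, "_notes", ()) + (note,)

    @property
    def __notes__(self):
        # Read-only view: a tuple of the individual notes, in order added.
        return getattr(self, "_notes", ())


err = NotedError("boom")
err.add_note("first note")
err.add_note("second note")

try:
    err.__notes__ = ("replaced",)
    assignment_allowed = True
except AttributeError:
    # The property has no setter, so plain assignment is rejected.
    assignment_allowed = False
```

Ordinary users only ever call `add_note()`, while tools that need the individual notes can iterate over `__notes__`.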

barry · February 12, 2022, 9:05pm

ObBikeshed: If this appends to the __notes__ tuple then it should probably be .add_note().

Zac-HD · February 13, 2022, 6:46am

Do we agree that there can be multiple notes added to a single exception, and this PEP is the place to specify/recommend how that should work? … I even think the opposite of what you say: serialization to a single string invites parsing

Yes, we agree. On reflection you’re also right about the data structure - “a sequence of zero or more notes” (which can be "\n\n".joined for display) is a better representation, which incidentally settles the concat/replace question as “you just add a note”, i.e. always concatenate.

I’m comfortable going with Irit’s proposal, if that resolves your concerns?

ObBikeshed: If this appends to the __notes__ tuple then it should probably be .add_note()

I’d agree, except for consistency with the .with_traceback(tb) method - which mutates and returns the exception.
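
To make the naming trade-off concrete, here is a sketch of the mutate-and-return style, modelled on how `.with_traceback(tb)` sets the traceback and returns the exception. `ChainableError` is a hypothetical subclass, and the `replace` flag from the earlier signature is left out on the assumption that notes are always appended.

```python
class ChainableError(Exception):
    """Hypothetical exception whose with_note() mutates and returns self."""

    def with_note(self, note):
        # Append to the tuple of notes, then hand back the same object so
        # the call can sit directly in a raise statement.
        self.__notes__ = tuple(getattr(self, "__notes__", ())) + (note,)
        return self


def fail():
    raise ChainableError("boom").with_note("while parsing config")


try:
    fail()
except ChainableError as err:
    caught_notes = err.__notes__
    same_object = err.with_note("second") is err
```

An `.add_note()` that returns None would need a separate statement before the `raise`, which is the ergonomic difference being weighed here.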