Add attributes to tomllib.TOMLDecodeError

hukkinj1 · October 29, 2024, 10:13am

I propose giving tomllib.TOMLDecodeError the same attributes that json.JSONDecodeError has, namely: msg, doc, pos, lineno, colno.
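For reference, here is a small sketch of what those attributes look like on json.JSONDecodeError today. The helper name `json_error_details` is made up for this illustration; only the five attributes come from the json module.

```python
import json


def json_error_details(text):
    """Return the JSONDecodeError attributes for `text`, or None if it parses.

    Hypothetical helper, used only to show the attributes in one place.
    """
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return {
            "msg": exc.msg,        # unformatted error message
            "doc": exc.doc,        # the document being parsed
            "pos": exc.pos,        # 0-based index into doc
            "lineno": exc.lineno,  # 1-based line number
            "colno": exc.colno,    # 1-based column number
        }
    return None


details = json_error_details('{"a": }')
print(details)
```

The proposal is for tomllib.TOMLDecodeError to expose the same five names.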

Similar to json, tomllib currently formats the error’s location in the source document into the error message. It doesn’t, however, allow easy access to the location or to the unformatted error message without failure-prone parsing of the error message first. Error messages from libraries are generally geared towards developers, so user-facing apps may want to show a customized error to users.

This PR demonstrates the proposed changes when applied to the Tomli backport library.

Unfortunately, this changes the signature of TOMLDecodeError, so it breaks any users who instantiate the error themselves. Not sure why anyone would do that though? Some less breaking options are using keyword arguments, or dynamically injecting the attributes after instantiation.

encukou · October 30, 2024, 10:57am

IMO, making the __init__ arguments optional is the way, with the attributes defaulting to None.
That does mean users of TOMLDecodeError need to deal with the attributes being None, which means the API will still be different from JSONDecodeError. I don’t think that’s a huge issue, but if you think it is, it’s possible to deprecate using the default values, and make the arguments required after several releases.
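A minimal sketch of that idea, assuming a made-up class name `SketchDecodeError` standing in for TOMLDecodeError. This is not the actual tomllib code, and the deprecation warning is left out.

```python
from typing import Optional


class SketchDecodeError(ValueError):
    """Hypothetical stand-in for TOMLDecodeError, for illustration only."""

    def __init__(
        self,
        msg: Optional[str] = None,
        doc: Optional[str] = None,
        pos: Optional[int] = None,
    ) -> None:
        self.msg = msg
        self.doc = doc
        self.pos = pos
        self.lineno: Optional[int] = None
        self.colno: Optional[int] = None
        if msg is None or doc is None or pos is None:
            # Legacy call style such as SketchDecodeError("oops"):
            # keep working, but leave the location attributes as None.
            super().__init__(*([] if msg is None else [msg]))
            return
        # 1-based line and column, same arithmetic as json.JSONDecodeError.
        self.lineno = doc.count("\n", 0, pos) + 1
        self.colno = pos - doc.rfind("\n", 0, pos)
        super().__init__(f"{msg} (at line {self.lineno}, column {self.colno})")


legacy = SketchDecodeError("oops")
located = SketchDecodeError("Invalid value", "a = 1\nb = @", 10)
print(str(legacy), legacy.lineno)
print(str(located), located.lineno, located.colno)
```

Existing code that raises the error with a single message keeps working, and only errors built with all three arguments carry a location.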

“Not sure why anyone would do that though?” is not a good question to ask when you’re in stdlib, unfortunately. There are more users, and if they have an issue, they can’t just pin a dependency while it’s resolved.

hukkinj1 · October 30, 2024, 11:12am

Thanks Petr, that makes sense!

I don’t think that’s a huge issue, but if you think it is, it’s possible to deprecate using default values, and make the arguments required after several releases.

I don’t think that’s a huge issue either, but it is mildly annoying in that we cannot guarantee the attributes are not None, even though tomllib itself never sets them to None. So users who type check their code will forever have to prove to type checkers that the attributes are not None. I’ll gladly add the deprecation if you think it’s acceptable.
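Roughly, this is the extra check that type-checked code would need. `LocatedError` and `user_message` are hypothetical names for this sketch, with the attributes typed as Optional[int] the way the optional-argument approach implies.

```python
from typing import Optional


class LocatedError(ValueError):
    """Hypothetical error whose location attributes may be None."""

    def __init__(
        self,
        message: str,
        lineno: Optional[int] = None,
        colno: Optional[int] = None,
    ) -> None:
        super().__init__(message)
        self.lineno = lineno
        self.colno = colno


def user_message(exc: LocatedError) -> str:
    # A type checker only accepts integer use of the attributes after this
    # check, even if the parser itself never leaves them as None.
    if exc.lineno is None or exc.colno is None:
        return "The configuration file is invalid."
    return f"The configuration file is invalid near line {exc.lineno + 0}, column {exc.colno + 0}."


print(user_message(LocatedError("Invalid value", 2, 5)))
print(user_message(LocatedError("Invalid value")))
```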

bschubert · October 30, 2024, 12:33pm

Would it be an option to add a subclass of TOMLDecodeError with the new attributes? That way, existing code that instantiates TOMLDecodeError won’t be broken, and new code that wants to use the new attributes can check whether they all exist (and are not None) with a single isinstance check.
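As a sketch of what I mean, with made-up names (`BaseDecodeError` stands in for the existing TOMLDecodeError, and `LocatedDecodeError` for the new subclass):

```python
class BaseDecodeError(ValueError):
    """Hypothetical stand-in for the existing TOMLDecodeError."""


class LocatedDecodeError(BaseDecodeError):
    """Hypothetical subclass whose location attributes are always set."""

    def __init__(self, msg: str, doc: str, pos: int) -> None:
        self.msg = msg
        self.doc = doc
        self.pos = pos
        # 1-based line and column derived from the 0-based position.
        self.lineno = doc.count("\n", 0, pos) + 1
        self.colno = pos - doc.rfind("\n", 0, pos)
        super().__init__(f"{msg} (at line {self.lineno}, column {self.colno})")


def report(exc: BaseDecodeError) -> str:
    # One isinstance check tells both the reader and a type checker that
    # all the new attributes exist and are not None.
    if isinstance(exc, LocatedDecodeError):
        return f"{exc.msg} on line {exc.lineno}, column {exc.colno}"
    return str(exc)


print(report(LocatedDecodeError("Invalid value", "a = 1\nb = @", 10)))
print(report(BaseDecodeError("old-style error")))
```

Code that catches the base class keeps working unchanged, since the subclass is still an instance of it.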

hukkinj1 · October 30, 2024, 12:54pm

Definitely possible, but then we need to:

Come up with a name for the subclass, which is likely to be a worse name than TOMLDecodeError.
Document both exceptions in the docs. Document that loads and load always raise the subclass, and people will forever wonder why the base class exists if it’s never really used.

So perhaps not as elegant in the long term.

Nineteendo · October 30, 2024, 1:20pm

We could name it TOMLSyntaxError, but that might cause problems with a future proposal of mine… adding the attributes of SyntaxError, and maybe [end_]colno.

I guess it’s easiest to just add a # type: ignore comment.