I’ve often felt that the current CustomEncoder approach to adding custom JSON serialization is a bit of an anti-pattern. The separation of class and serialization logic means that most libraries that wish to output JSON must also expose a parameter for overriding the default encoder, which can be complicated if the library already uses a custom JSONEncoder internally. In cases where your application needs to serialize more than two or three custom classes within the same context, managing your custom encoder can become a non-trivial task. In the worst case, I’ve seen developers resort to monkey patching library functionality in order to pass a custom encoder to one of the library’s dependencies.
To solve this problem, I’d like to propose a __json__ dunder method. This would allow classes to describe how they should be serialized in much the same way that __repr__ allows them to describe how they should be represented in text format. This way it would become trivial to pass custom classes to other libraries. The change would be quite simple: rather than JSONEncoder.default() immediately raising a TypeError, it would first check whether the object has a callable __json__ attribute. If so, it returns the result of calling it, like so:
    def default(self, o):
        if callable(getattr(o, "__json__", False)):
            return o.__json__()
        raise TypeError(f'Object of type {o.__class__.__name__} '
                        f'is not JSON serializable')
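
As a rough sketch of how the proposed fallback could be tried today without touching the standard library, the same check can live in a JSONEncoder subclass. The names DunderJSONEncoder and Point are hypothetical and only serve the illustration; __json__ is the proposed hook, not something the json module recognizes on its own.

```python
import json


class DunderJSONEncoder(json.JSONEncoder):
    """Hypothetical stand-in for the proposed JSONEncoder.default() behaviour."""

    def default(self, o):
        # Only reached for objects the encoder cannot serialize natively.
        if callable(getattr(o, "__json__", None)):
            return o.__json__()
        # Fall back to the stock behaviour, which raises TypeError.
        return super().default(o)


class Point:
    """Illustrative class that describes its own JSON form."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __json__(self):
        return {"x": self.x, "y": self.y}


encoded = json.dumps({"origin": Point(0, 0)}, cls=DunderJSONEncoder)
print(encoded)  # {"origin": {"x": 0, "y": 0}}
```

Objects without a __json__ method still end in the usual TypeError, so nothing changes for code that relies on that error.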
This ensures backwards compatibility for libraries that are already using a custom encoder, as the custom encoder logic will have been executed before we get to this point. Additionally, the docstring of default already instructs programmers to call JSONEncoder.default(self, o) in the event that their custom logic cannot handle the provided input, so any custom encoder that was implemented according to the official guidelines would still automatically make use of this new functionality. Lastly, this change should have minimal performance impact, as it would only affect cases where the program would otherwise have raised a serialization error. In my experience, such errors are not usually recovered from, so it seems unlikely that high-performance applications are out in the wild churning through such cases often enough that the additional callable(getattr(o, "__json__", False)) would noticeably impact performance.
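
To illustrate the compatibility argument, here is a small sketch. ProposedJSONEncoder is a hypothetical stand-in for the standard library encoder with the change applied, and DecimalEncoder and Money are made-up names; the only assumption about DecimalEncoder is that it follows the documented pattern of delegating to the base class for anything it does not handle.

```python
import json
from decimal import Decimal


class ProposedJSONEncoder(json.JSONEncoder):
    """Hypothetical stand-in for JSONEncoder with the __json__ fallback."""

    def default(self, o):
        if callable(getattr(o, "__json__", None)):
            return o.__json__()
        return super().default(o)


class DecimalEncoder(ProposedJSONEncoder):
    """Existing-style custom encoder: its own logic runs first."""

    def default(self, o):
        if isinstance(o, Decimal):
            return str(o)
        # Documented pattern: let the base class handle everything else.
        return super().default(o)


class Money:
    """Illustrative class that knows nothing about DecimalEncoder."""

    def __init__(self, amount, currency):
        self.amount = amount
        self.currency = currency

    def __json__(self):
        # The Decimal is left as-is; whichever encoder is in use decides
        # how to render it.
        return {"amount": self.amount, "currency": self.currency}


encoded = json.dumps(Money(Decimal("9.99"), "EUR"), cls=DecimalEncoder)
print(encoded)  # {"amount": "9.99", "currency": "EUR"}
```

The custom encoder needed no changes to pick up Money, and the value returned by __json__ is itself passed back through the encoder, so the Decimal inside it still goes through DecimalEncoder's own logic.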
I should note that when searching to see if this had already been suggested I found this topic:
In one of the comments, a __json__ protocol was mentioned offhand, but I thought it still merited a separate topic. I agree with many of the comments that a global registry should be avoided, and I’m also not as concerned with de-serialization (after all, JSON is not pickle). This approach doesn’t cause weird side effects for libraries, and it still allows for serialization-time customization (e.g. it’s not uncommon to have two different custom encoders for datetime objects depending on the serialization context).
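
The datetime case could look roughly like the sketch below. Everything here is illustrative: ProposedJSONEncoder stands in for the standard library encoder with the proposed fallback, and IsoEncoder, EpochEncoder, and Event are hypothetical names.

```python
import json
from datetime import datetime, timezone


class ProposedJSONEncoder(json.JSONEncoder):
    """Hypothetical stand-in for JSONEncoder with the __json__ fallback."""

    def default(self, o):
        if callable(getattr(o, "__json__", None)):
            return o.__json__()
        return super().default(o)


class IsoEncoder(ProposedJSONEncoder):
    """Context A: datetimes as ISO 8601 strings."""

    def default(self, o):
        if isinstance(o, datetime):
            return o.isoformat()
        return super().default(o)


class EpochEncoder(ProposedJSONEncoder):
    """Context B: datetimes as integer Unix timestamps."""

    def default(self, o):
        if isinstance(o, datetime):
            return int(o.timestamp())
        return super().default(o)


class Event:
    """Illustrative class; it does not decide how datetimes are rendered."""

    def __init__(self, name, when):
        self.name = name
        self.when = when

    def __json__(self):
        return {"name": self.name, "when": self.when}


event = Event("launch", datetime(2024, 1, 1, tzinfo=timezone.utc))
as_iso = json.dumps(event, cls=IsoEncoder)
as_epoch = json.dumps(event, cls=EpochEncoder)
print(as_iso)
print(as_epoch)
```

The class describes its structure once through __json__, while the choice of datetime format stays with whichever encoder the caller passes at serialization time.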