Because it’s global state, and the alternative, as I’ve shown, is straightforward. I also (as a library author) don’t like the idea that a simple json.dumps() call might do something I wasn’t expecting because of something someone added to a registry (even though there’s plenty of other ways people can cause unexpected effects in my code).
Why are you so insistent that a registry is such a good idea? Is it that important to you that you can alter the meaning of json.dumps() calls that other people’s code makes?
Not so straightforward: you always have to specify the cls parameter, and that's annoying.
I’m not insistent, I just proposed it and I’m discussing it. I’m open to changing my mind.
I think it’s a good idea because in my programming life I’ve written a lot of custom classes that needed custom JSON serializers, and I found it very annoying to pass them every time. So usually I create a to_json() method, which feels quite hackish to me, but it works. I believe this would be much easier if you could simply register your own serializer.
I have to agree with @steven.daprano about the (very bad) issue for library authors. The only way this could move up to “marginally acceptable” is if it came with a mechanism for library authors to forcibly disregard a registry entirely when they use the APIs internally (but that’s also pretty annoying to have to spend effort worrying about, to be honest).
Perhaps a compromise is to add a registry class that the user can register encoders to, but not create global state. Instead, one can pass the desired registry (stored locally) in as a parameter, which only needs to be done in one place thanks to functools.partial.
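The compromise above can be sketched roughly as follows. `RegistryEncoder` and its `registry` keyword are illustrative names, not an existing `json` API; this just shows how a locally held registry could be bound once with `functools.partial` so call sites stay simple:

```python
import datetime
import functools
import json

# Hypothetical sketch: a JSONEncoder that consults a registry passed
# in explicitly, instead of module-level global state.
class RegistryEncoder(json.JSONEncoder):
    def __init__(self, *args, registry=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.registry = registry or {}

    def default(self, o):
        # Dispatch on the first matching registered type.
        for typ, encode in self.registry.items():
            if isinstance(o, typ):
                return encode(o)
        return super().default(o)

# The registry is ordinary local state, owned by whoever builds it.
registry = {datetime.date: lambda d: d.isoformat()}

# Bind the registry in one place; the rest of the code calls dumps().
dumps = functools.partial(json.dumps, cls=RegistryEncoder, registry=registry)

print(dumps({"when": datetime.date(2020, 1, 2)}))
# → {"when": "2020-01-02"}
```

Extra keyword arguments to `json.dumps` are forwarded to the `cls` constructor, which is what makes the `registry=` parameter work without any change to the module.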
But how will you deserialize? I’m skeptical of making extensive customisation to json.dumps without having a concept for how json.loads will get an object of the original type back.
The same way we deserialize right now: by supplying a custom decoder to json.loads.
How do we know which decoder to use? (A registry won’t help.) Same way we do now: by tracking the information out of band.
I think this deserialization problem is a red herring. It is no worse than the deserialization problem for json.dumps(cls=Encoder), which has the solution json.loads(cls=Decoder).
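For concreteness, here is a minimal sketch of that existing symmetric mechanism: a custom encoder tags a non-JSON type, and a matching decoder restores it. The `"__complex__"` tag is an illustrative convention, not a standard:

```python
import json

# Encoder: represent complex numbers as a tagged dict.
class ComplexEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, complex):
            return {"__complex__": [o.real, o.imag]}
        return super().default(o)

# Decoder: recognise the tag and rebuild the original object.
class ComplexDecoder(json.JSONDecoder):
    def __init__(self, **kwargs):
        super().__init__(object_hook=self._hook, **kwargs)

    @staticmethod
    def _hook(d):
        if "__complex__" in d:
            re, im = d["__complex__"]
            return complex(re, im)
        return d

text = json.dumps({"z": 1 + 2j}, cls=ComplexEncoder)
assert json.loads(text, cls=ComplexDecoder) == {"z": 1 + 2j}
```

The out-of-band knowledge here is "this document was written with ComplexEncoder, so read it with ComplexDecoder", which is exactly the tracking problem a dumps-side registry leaves untouched.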
TatSu uses asjson() (which should probably be as_json_compatible()), and honors def __json__() when converting the argument to something compatible with json.dumps().
There’s no support for deserialization there.
I’ve used the same strategy elsewhere with support for deserialization.
I don’t think that the json module should be burdened with what seems as program-specific requirements. Developers can write their customized to_json() and from_json() functions.
Well, calling a simple registration an “extensive” customization seems like an exaggeration to me.
As for json.loads: as we already said, it can’t have a registry, for the simple reason that JSON has no default field in which you can store the object type, as pickle does. As I already said, json.dumps and json.loads are not symmetric, and my proposal is only for json.dump(s).
No, but the plain json.register(type, encoder) suggested uses a single global registry for all JSON encoding. A disaster.
The existing JSONEncoder class has a variety of hooks for special encoding of things. Maybe it should be easier to use somehow, but at least its use is overt and not shared in an uncoordinated way between multiple unrelated JSON-using libraries.
With __reduce__() the class is following a protocol with effect on a limited context (instances of the class). The same could be had for JSON with a __json__() protocol.
A global registry risks unintended effects on any library used by a program (you could register object).
For reference, this is what is currently done in TatSu:
```python
import enum
import uuid
import weakref
from collections.abc import Mapping

# isiter(), is_namedtuple(), and debug() are TatSu's own helpers.

def asjson(obj, seen=None):  # pylint: disable=too-many-return-statements,too-many-branches
    if isinstance(obj, Mapping) or isiter(obj):
        # prevent traversal of recursive structures
        if seen is None:
            seen = set()
        elif id(obj) in seen:
            return '__RECURSIVE__'
        seen.add(id(obj))

    if isinstance(obj, (weakref.ReferenceType, weakref.ProxyType)):
        return f'@0x{hex(id(obj)).upper()[2:]}'
    elif hasattr(obj, '__json__'):
        return obj.__json__()
    elif is_namedtuple(obj):
        return asjson(obj._asdict(), seen=seen)
    elif isinstance(obj, Mapping):
        result = {}
        for k, v in obj.items():
            try:
                result[k] = asjson(v, seen)
            except TypeError:
                debug('Unhashable key?', type(k), str(k))
                raise
        return result
    elif isinstance(obj, uuid.UUID):
        return str(obj)
    elif isinstance(obj, enum.Enum):
        return obj.value
    elif isiter(obj):
        return [asjson(e, seen) for e in obj]
    else:
        return obj
```