Json.register()?

Well, I just realized that no one stops me from doing this horrible thing:

>>> original_json_dumps = json.dumps
>>> _sentinel = object()
>>> 
>>> def my_json_dumps(*args, **kwargs):
...     if args:
...         obj = args[0]
...         newargs = args[1:]
...     else:
...         obj = kwargs.pop('obj', _sentinel)
...         newargs = ()
...         
...         if obj is _sentinel:
...             return original_json_dumps(**kwargs)
...     
...     if isinstance(obj, CustomTuple):
...         return original_json_dumps(tuple(obj), *newargs, **kwargs)
...     elif [...]
...     
...     return original_json_dumps(obj, *newargs, **kwargs)
... 
>>> json.dumps = my_json_dumps

:stuck_out_tongue:

Because it’s global state, and the alternative, as I’ve shown, is straightforward. I also (as a library author) don’t like the idea that a simple json.dumps() call might do something I wasn’t expecting because of something someone added to a registry (even though there’s plenty of other ways people can cause unexpected effects in my code).

Why are you so insistent that a registry is such a good idea? Is it that important to you that you can alter the meaning of json.dumps() calls that other people’s code makes?

3 Likes

Not so straightforward: you always have to specify the cls parameter, and that is annoying.
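For illustration (Point and PointEncoder are made-up names), the repetition looks something like this:

import json

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

class PointEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, Point):
            return {'x': o.x, 'y': o.y}
        return super().default(o)   # raises TypeError for anything else

# cls= has to be repeated at every call site
json.dumps(Point(1, 2), cls=PointEncoder)
json.dumps({'start': Point(0, 0)}, cls=PointEncoder)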

I’m not insistent, I just proposed it and I’m discussing it. I’m open to changing my mind.

I think it’s a good idea because in my programming life I’ve written a lot of custom classes that needed a custom JSON serializer, and I found it very annoying to pass them every time. So usually I create a to_json() method, which feels quite hackish to me, but it works. I believe this would be much easier if you could simply register your own serializer.

I have to agree with @steven.daprano about the (very bad) issue for library authors. The only way this could move up to “marginally acceptable” is if it came with a mechanism for library authors to forcibly disregard a registry entirely when they use the APIs internally (but that’s also pretty annoying to have to spend effort worrying about, to be honest).

Perhaps a compromise is to add a registry class to which the user can register encoders, without creating global state. Instead, one can pass the desired registry (stored locally) in as a parameter, which only needs to be done in one place thanks to functools.partial.

Example, in a utils module:

import functools
import json

registry = json.Registry()
registry.register(SpamType, SpamEncoder)
registry.register(EggsType, EggsEncoder)

jsondump = functools.partial(json.dumps, registry=registry)

And then in another module:

import utils  # the module above

...

def myfunc(spam):
    ...
    json = utils.jsondump(spam)
    ...
1 Like

No need to add a registry= option. json.dumps() already has the default option. A json.Registry() object can work as the default callable.
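Roughly like this, as a sketch; Registry here is an ordinary user-written class (there is no json.Registry in the stdlib), and its instances are just callables suitable for default=:

import functools
import json

class Registry:
    # Maps types to encoder functions; instances are usable as default=.
    def __init__(self):
        self._encoders = {}

    def register(self, tp, encoder):
        self._encoders[tp] = encoder

    def __call__(self, obj):
        for tp, encoder in self._encoders.items():
            if isinstance(obj, tp):
                return encoder(obj)
        raise TypeError(f'Object of type {type(obj).__name__} '
                        f'is not JSON serializable')

registry = Registry()
registry.register(set, sorted)   # encode sets as sorted lists

jsondump = functools.partial(json.dumps, default=registry)
jsondump({'spam': {3, 1, 2}})    # '{"spam": [1, 2, 3]}'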

1 Like

But how will you deserialize? I’m skeptical of making extensive customisation to json.dumps without having a concept for how json.loads will get an object of the original type back.

The same way we deserialize right now: by supplying a custom decoder to json.loads.

How do we know which decoder to use? (A registry won’t help.) Same way we do now: by tracking the information out of band.

I think this deserialization problem is a red herring. It is no worse than the deserialization problem for json.dumps(cls=Encoder), which has the solution json.loads(cls=Decoder).
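A minimal round-trip sketch of that, using a made-up '__complex__' tag that the application itself chooses (the out-of-band part):

import json

class ComplexEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, complex):
            return {'__complex__': [o.real, o.imag]}   # app-chosen tag
        return super().default(o)

class ComplexDecoder(json.JSONDecoder):
    def __init__(self, **kwargs):
        super().__init__(object_hook=self._hook, **kwargs)

    @staticmethod
    def _hook(d):
        if '__complex__' in d:
            return complex(*d['__complex__'])
        return d

s = json.dumps(1 + 2j, cls=ComplexEncoder)   # '{"__complex__": [1.0, 2.0]}'
json.loads(s, cls=ComplexDecoder)            # (1+2j)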

TatSu uses asjson() (which should probably be as_json_compatible()), and honors def __json__() when converting the argument to something compatible with json.dumps().

There’s no support for deserialization there.

I’ve used the same strategy elsewhere with support for deserialization.

I don’t think that the json module should be burdened with what seem to be program-specific requirements. Developers can write their customized to_json() and from_json() functions.
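Something along these lines, presumably; the names and the to_json attribute convention are just one way to spell the “write it yourself” approach:

import json

def _app_default(obj):
    # One place for the application's serialization rules.
    if hasattr(obj, 'to_json'):
        return obj.to_json()
    raise TypeError(f'Object of type {type(obj).__name__} '
                    f'is not JSON serializable')

def to_json(obj, **kwargs):
    return json.dumps(obj, default=_app_default, **kwargs)

def from_json(s, **kwargs):
    return json.loads(s, **kwargs)   # plug in object_hook= here when needed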

1 Like

The problem is always that you have to specify the default argument every time. It’s the same problem that you have with cls.

Well, calling a simple register an “extensive” customization seems like an exaggeration to me.

About json.loads: as we already said, it can’t have a registry, for the simple fact that JSON does not have a standard field in which you can store the object type, as pickle does. As I already said, json.dumps and json.loads are not symmetric, and my proposal is only for json.dump(s).

And pickle? Does __reduce__ make it program-specific?

No, but the plain json.register(type, encoder) suggested uses a single global registry for all JSON encoding. A disaster.

The existing JSONEncoder class has a variety of hooks for special encoding of things. Maybe it should be easier to use somehow, but at least its use is overt and not shared in an uncoordinated way between multiple unrelated JSON-using libraries.

Cheers,
Cameron Simpson cs@cskk.id.au

3 Likes

With __reduce__(), the class is following a protocol whose effect is limited in scope (to instances of the class). The same could be had for JSON with a __json__() protocol.

A global registry risks unintended effects on any library used by a program (you could register object).
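A minimal sketch of how such a __json__() protocol could be honoured through the existing default= hook (nothing here is in the stdlib; the names are illustrative):

import functools
import json

def _json_default(obj):
    # Honour a hypothetical __json__() protocol; the effect is limited
    # to classes that opt in, everything else fails as usual.
    if hasattr(obj, '__json__'):
        return obj.__json__()
    raise TypeError(f'Object of type {type(obj).__name__} '
                    f'is not JSON serializable')

dumps = functools.partial(json.dumps, default=_json_default)

class Money:
    def __init__(self, amount, currency):
        self.amount, self.currency = amount, currency

    def __json__(self):
        return {'amount': self.amount, 'currency': self.currency}

dumps({'price': Money(10, 'EUR')})   # '{"price": {"amount": 10, "currency": "EUR"}}'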

For reference, this is what is currently done in TatSu:

import enum
import uuid
import weakref
from collections.abc import Mapping

# isiter(), is_namedtuple(), and debug() are helpers defined elsewhere in TatSu.
def asjson(obj, seen=None):  # pylint: disable=too-many-return-statements,too-many-branches
    if isinstance(obj, Mapping) or isiter(obj):
        # prevent traversal of recursive structures
        if seen is None:
            seen = set()
        elif id(obj) in seen:
            return '__RECURSIVE__'
        seen.add(id(obj))

    if isinstance(obj, (weakref.ReferenceType, weakref.ProxyType)):
        return f'@0x{hex(id(obj)).upper()[2:]}'
    elif hasattr(obj, '__json__'):
        return obj.__json__()
    elif is_namedtuple(obj):
        return asjson(obj._asdict(), seen=seen)
    elif isinstance(obj, Mapping):
        result = {}
        for k, v in obj.items():
            try:
                result[k] = asjson(v, seen)
            except TypeError:
                debug('Unhashable key?', type(k), str(k))
                raise
        return result
    elif isinstance(obj, uuid.UUID):
        return str(obj)
    elif isinstance(obj, enum.Enum):
        return obj.value
    elif isiter(obj):
        return [asjson(e, seen) for e in obj]
    else:
        return obj
1 Like

pickle has a global registry for encoders. You can register your own encoder with copyreg:
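For reference, the copyreg hook works roughly like this (Point is an illustrative class):

import copyreg
import pickle

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

def pickle_point(p):
    # Same contract as __reduce__: (callable, args) used to rebuild the object.
    return Point, (p.x, p.y)

# Registered once, globally, for every pickle.dumps() in the process.
copyreg.pickle(Point, pickle_point)

restored = pickle.loads(pickle.dumps(Point(1, 2)))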

So it seems to me that a global registry is not a disaster at all.

Third-party libraries are, well, not json? If they don’t use json, why would they be affected by a global registry for json?

Unintended effects: if you fear unintended effects from third-party code, don’t use Python:

json magic method: If we introduce __json__(), why not __xml__()? And __yaml__()? And so on.

global registry: see my previous post.