Json does not support Mapping and MutableMapping

I created a class that implements Mapping. Using json.dumps(mymap) I get

TypeError: Object of type 'MyMap' is not JSON serializable

I can do json.dumps(dict(mymap)), but it's slower. I can do json.dumps(mymap._dict), but it's not elegant. So I think I'll implement a MyMap.jsonload.

Anyway, is this behaviour normal? Why can't the json module serialize Mapping classes by default?

Yes, I believe this behaviour is normal. Only Python dicts can be directly translated to JSON objects by the json module; see the conversion table in the json.JSONEncoder documentation. I'm not sure if there are any technical limitations/reasons as to why Mapping and MutableMapping wouldn't be supported, though.

This might be worth opening an issue on bugs.python.org (check for duplicates first, though). If you do, be sure to add the currently active core developers who maintain the module to the nosy list: "rhettinger" and "ezio.melotti".

Don't worry about it. Iterating a Mapping is terribly slow compared to iterating a dict anyway.

This is the best way when you care about performance.
Or you can have some public method (e.g. as_dict()) which returns self._dict.copy().
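A minimal sketch of that idea, assuming the wrapped attribute is called _dict as in the posts above (the class and method names are just for illustration):

import json
from collections.abc import Mapping

class MyMap(Mapping):
    def __init__(self, data):
        self._dict = dict(data)

    def __getitem__(self, key):
        return self._dict[key]

    def __iter__(self):
        return iter(self._dict)

    def __len__(self):
        return len(self._dict)

    def as_dict(self):
        # Hand out a copy so callers cannot mutate the internal dict.
        return self._dict.copy()

print(json.dumps(MyMap({"a": 1}).as_dict()))  # prints {"a": 1}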

Using an ABC instead of an actual type is not as easy as people think.

1. Since an ABC is abstract, the subclass relationship is more complicated.

If json.dumps supported Mapping, why not support Sequence too?
Then strings are Sequences as well. How should a subclass of str be serialized?

2. There is a default option

json.dumps has the default option to customize how user types are serialized.
If we serialized all classes which are subtypes of Mapping or Sequence, users could no longer customize how to serialize them.
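For illustration, a minimal sketch of the kind of customization that would be lost; the Registry class and its fields are made up for this example:

import json
from collections.abc import Mapping

class Registry(Mapping):
    # A user Mapping that also carries extra state (a version number).
    def __init__(self, data, version):
        self._data = dict(data)
        self.version = version

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

def encode(obj):
    # The user chooses the JSON representation, including the extra state.
    if isinstance(obj, Registry):
        return {"version": obj.version, "entries": dict(obj)}
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

print(json.dumps(Registry({"a": 1}, version=2), default=encode))
# {"version": 2, "entries": {"a": 1}}

If json serialized every Mapping automatically, default would never be called for Registry and the version field would silently be dropped.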

Custom Mappings can have more state than just what’s exported by the Mapping interface. Serializing as a dict would leave that out.

Why? Usually whoever creates a Mapping uses a private dict attribute, and __iter__ usually returns iter(self._dict).

dict.copy() is the same as dict(mymap)… I ended up with this method:

def jsonDump(self, fp=None, *args, **kwargs):
    # Serialize the wrapped dict directly, avoiding dict(self).
    mydict = self._dict

    if fp is None:
        # No target given: return the JSON string.
        return json.dumps(mydict, *args, **kwargs)

    kwargs.setdefault("indent", 4)

    if hasattr(fp, "write"):
        # File-like object: dump into it.
        return json.dump(mydict, fp, *args, **kwargs)

    # Otherwise treat fp as a path.
    with open(fp, "w") as f:
        f.write(json.dumps(mydict, *args, **kwargs))
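Roughly, the intended usage would be (assuming mymap is an instance of the class defining this method):

s = mymap.jsonDump()            # no fp: return the JSON string
with open("out.json", "w") as f:
    mymap.jsonDump(f)           # file-like object: json.dump into it (indent=4 by default)
mymap.jsonDump("other.json")    # plain path: the file is opened and written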

json already serializes any str.

…why? o___0

So why does dict.update(Mapping) work?

When iterating a dict, the json module iterates the internal structure of the dict directly (e.g. with PyDict_Next). It is very fast.
On the other hand, when iterating a Mapping, it does something like:

for key in m.keys():
    value = m[key]
    serialize_map(key, value)

Since m.__getitem__ is implemented in Python, we need one Python call per item iterated.

Calling a Python method is much slower than iterating a C array.

$ pyperf timeit -s 'd = dict.fromkeys(range(1000))' -- 'dict(d)'
.....................
Mean +- std dev: 18.8 us +- 0.2 us

$ pyperf timeit -s 'import collections; d = collections.UserDict.fromkeys(range(1000))' -- 'dict(d)'
.....................
Mean +- std dev: 208 us +- 1 us

$ pyperf timeit -s 'import collections; d = collections.UserDict.fromkeys(range(1000))' -- 'dict(d.data)'
.....................
Mean +- std dev: 18.8 us +- 0.2 us

See this example. If json serialized arbitrary Mappings by default, this example would not work.

import json
from collections.abc import Mapping

class MyMapping(Mapping):
    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

def custom_serialize(obj):
    if isinstance(obj, MyMapping):
        # Tag the object so it can be reconstructed when loading.
        d = {"__type__": "MyMapping"}
        d.update(obj._data)
        return d
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

# someobj is any structure that may contain MyMapping instances
J = json.dumps(someobj, default=custom_serialize)

...

def custom_deserialize(d):
    if d.get("__type__") == "MyMapping":
        del d["__type__"]
        return MyMapping(d)
    return d

...

json.loads(J, object_hook=custom_deserialize)

And I said that in 99% of custom Mappings, __iter__() returns iter(self._dict), so it could use the fast dict iteration (PyDict_Next) anyway. It would just have to check that tp_iter has not been changed, as PyDict_Merge does.
And if tp_iter has been changed, PyDict_Merge uses a slower method, but in C, not in Python.

Anyway, I repeat: dict.update(Mapping) works, even if tp_iter has been changed. We don't have to do dict.update(dict(mapping)). And furthermore, I'm not sure that would be faster…

Furthermore, there are third-party JSON (de)serialization packages for Python that are way faster, such as orjson. So it does not seem to me that speed is a priority for json.

I still don't understand… obviously I can't test your example, since Mapping is not supported by json… but are you really saying that, for data types that json supports directly, custom (de)serialization can't be done? It seems very strange to me.

No, PyDict_Merge cannot use the fast dict iteration if other is not a subclass of dict.

Remember, you cited performance as a reason for supporting Mapping in your first comment.
"Anyway, I repeat…" and "speed is not a priority" do not make sense together.
OK, let's stop talking about performance. It cannot be a reason to support Mapping.

You can use the default option to customize the serialization of types which are not supported by json. Read the manual; there are enough examples.

You cannot customize the serialization of types json already supports (e.g. str, int, list, dict) with default, because the default callback is not called for them.
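A quick way to see that, using a made-up str subclass:

import json

class MyStr(str):
    pass

# MyStr is already a str, so the encoder serializes it directly and
# the default callback below is never invoked.
print(json.dumps(MyStr("hi"), default=lambda o: "custom"))  # "hi"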

And this is IMHO wrong, but, as I said, even when the fast method cannot be used, the slow method is used instead, without forcing the coder to convert the object to a dict first.

Performance is not secondary, but the first reason is elegance and practicality. If json also supported collections.abc subclasses, it would be more elegant and practical to write json.dumps(mymap) instead of json.dumps(dict(mymap)). And it could also be as fast as json.dumps(dict) if json simply checked whether the object is a subclass of Mapping and its tp_iter is equal to dict_iter.

So your proposal is that json should not support any other type natively anymore, I suppose.

@Marco_Sulla please hold on and take some time to meditate on @methane's answers.
He is completely correct; I share his opinion 100%.

If meditation is not enough, I suggest making a patch implementing your proposal. I expect you'll see a lot of failing tests, but it can be an excellent exercise.
After getting the tests green, you can bring up your proposal again if you still want to get it done.


Ok, you're right. He is completely correct, so please remove the support for dict.update(Mapping) and any other function or method that directly supports Mapping, MutableMapping, Sequence, or any other collections.abc subclass without first converting it to the respective builtin type, because they are not consistent with json's behavior.

Please don't ask us volunteers to do the work.
Please don't hesitate to make a prototype demonstrating your awesome idea instead; a working example helps enormously in defending an idea.
A proven proposal weighs much more than an abstract thought, doesn't it?


@asvetlov First of all, I'm busy. Secondly, if I have time, I'll spend it finishing the implementation of frozendict. Thirdly, why should I implement something that you and @methane think is wrong?

If it's wrong, please change all the other CPython behavior involving collections.abc classes to be consistent with the json package.

Furthermore, maybe you have a short memory, but I tried in the past to contribute to your aiohttp project. And I don't remember you being so sarcastic when I opened bug reports.

When a Mapping is not a subclass of dict, how can its tp_iter be the same as dict_iter?

I am not proposing anything. Please read your first post again. I just answered your question:

I think json cannot support any other types natively by default.
But it could add opt-in options, like mapping_to_object or datetime_to_iso8601. I don't have any opinion about that.
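Something like that opt-in behaviour can already be approximated in user code today; a rough sketch (the helper name is made up):

import json
from collections.abc import Mapping

def dumps_with_mappings(obj, **kwargs):
    # Opt-in: convert any Mapping to a plain dict before encoding.
    # The returned dict is encoded recursively, so nested Mappings are
    # converted too.
    def to_dict(o):
        if isinstance(o, Mapping):
            return dict(o)
        raise TypeError(f"Object of type {type(o).__name__} is not JSON serializable")
    return json.dumps(obj, default=to_dict, **kwargs)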

For the third time:

def __iter__(self):
    return iter(self._dict)

Anyway, as I have already said many times, it does not matter, since other functions in CPython, like PyDict_Merge(), fall back to a slow but generic iteration to get the keys and values of a generic mapping. json could do the same without any problem.

Ok. Why?

No, its tp_iter just calls your __iter__ method (i.e. slot_tp_iter, not dict_iter).

If this is the third time, then I could not understand you because you misunderstood tp_iter.

I have already described it several times in this thread. It breaks backward compatibility. It means users can no longer use the default option to customize the serialization of user types which are Mappings.

If you feel I am being sarcastic, I'm sorry about that. I am not good at English, so it is difficult to use the right nuance. (I am not a maintainer of the aiohttp project anyway.)

I just wanted to answer your questions and to ask about the points where I cannot understand what you mean.

Oooook, but as I already said 4 times :smiley:, this does not matter anyway, since json could just use the "slow" method of PyDict_Merge().

I suppose it would break nothing; on the contrary, default would simply be ignored and the object would be serialized faster.

I was not replying to you, but to Svetlov, who it seems has a very short memory.