Handling duplicate keys in JSON

I’m trying to find a reasonable way to handle duplicate keys in JSON. It was surprising to me, but apparently the JSON specification doesn’t disallow them. And certain tools (e.g. HTTPie) may encounter such JSON and can’t reasonably ignore it. (For an example scenario, see this HTTPie feature request I raised.)

Parsing is easy. We can use the multidict package and pass object_pairs_hook=MultiDict to json.loads.
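To illustrate the parsing side without a third-party dependency, here is a minimal sketch that uses a plain list subclass as the hook instead of MultiDict. The names PairList and loads_keep_duplicates are made up for this example; with multidict installed, the same object_pairs_hook parameter is where MultiDict would go.

```python
import json


class PairList(list):
    """Hypothetical marker type: a JSON object kept as ordered (key, value) pairs."""


def loads_keep_duplicates(text):
    # object_pairs_hook receives each JSON object as a list of (key, value)
    # tuples, in document order, before any duplicates are collapsed.
    return json.loads(text, object_pairs_hook=PairList)


doc = loads_keep_duplicates('{"a": 1, "a": 2, "b": {"c": 3, "c": 4}}')
print(doc)  # [('a', 1), ('a', 2), ('b', [('c', 3), ('c', 4)])]

# For comparison, the default behaviour keeps only the last value per key.
print(json.loads('{"a": 1, "a": 2}'))  # {'a': 2}
```

Using a dedicated subclass rather than a bare list keeps parsed objects distinguishable from parsed arrays, which matters for an empty object versus an empty array.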

However, I can’t see any reasonable way to serialize it. We can’t implement JSONEncoder.default because it has to return a supported basic type, and converting to a dict would obviously throw away the duplicate keys. And there appears to be strong resistance* to making the json module support Mapping in general in place of dict.
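A small sketch of the problem with JSONEncoder.default, assuming a hypothetical Pairs container (not part of any library) that holds the duplicate keys:

```python
import json


class Pairs:
    """Hypothetical container holding (key, value) pairs, duplicates included."""

    def __init__(self, pairs):
        self.pairs = list(pairs)


class PairsEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, Pairs):
            # default() has to return something the encoder already supports.
            # dict is the only object-shaped choice, and it keeps one value per key.
            return dict(o.pairs)
        return super().default(o)


text = json.dumps(Pairs([("a", 1), ("a", 2)]), cls=PairsEncoder)
print(text)  # {"a": 2}
```

The first value for "a" is gone before the encoder ever sees it.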

It looks to me like there’s no reasonable way to successfully format a JSON object with duplicate keys without writing a complete JSON formatter from scratch. Am I wrong?
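For a sense of scale, here is a rough sketch of what the from-scratch route might look like for compact output only. It assumes objects are stored as a hypothetical PairList of (key, value) tuples with string keys, and it hands scalars to json.dumps so that string escaping and number formatting stay with the standard library. PairList and dumps_pairs are names invented for this example, and indent, sort_keys, and the other json.dumps options are not handled.

```python
import json


class PairList(list):
    """Hypothetical marker type: a JSON object stored as ordered (key, value) pairs."""


def dumps_pairs(value):
    """Serialize nested PairList / dict / list / scalar values to compact JSON text."""
    if isinstance(value, dict):
        # Treat ordinary dicts the same way so they may contain PairList values.
        value = PairList(value.items())
    if isinstance(value, PairList):
        members = []
        for key, item in value:
            if not isinstance(key, str):
                raise TypeError(f"object keys must be str, got {type(key).__name__}")
            members.append(f"{json.dumps(key)}: {dumps_pairs(item)}")
        return "{" + ", ".join(members) + "}"
    if isinstance(value, (list, tuple)):
        return "[" + ", ".join(dumps_pairs(item) for item in value) + "]"
    # Strings, numbers, booleans and None go through the standard encoder.
    return json.dumps(value)


doc = PairList([("a", 1), ("a", 2), ("b", [True, None, "x"])])
print(dumps_pairs(doc))  # {"a": 1, "a": 2, "b": [true, null, "x"]}
```

Parsing the result with object_pairs_hook=PairList should give back the same structure, so the pair at least round-trips.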

* - Apparently I am not allowed to include more than two links. This was intended to be a link to the conversation “Json does not support Mapping and MutableMapping” on this site, with path t/json-does-not-support-mapping-and-mutablemapping/2829

I think the short answer is no, you cannot. There is a package on PyPI for it, though. There is also a solution for reading them in, but not writing them out, in this Stack Overflow post.

Just because the JSON spec allows duplicate keys does not mean it’s a good idea. Is this a customer request and you just have to deal with the odd file they produced?

Sometimes we get odd files from the customer, and sometimes we have to protect them and say “We cannot use this. Do this instead.” Part of our job is protecting them from bad data habits. We can only polish a turd so far to make it shiny.