I recently came across a use case where I needed to create nested keys: I had to convert a web-form-like data structure with keys like “customer[profile][name]” into a deeply nested JSON structure.
In the end it was easier to pull in a specific library to handle this rather than hand roll the nesting logic.
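For context, here is a minimal sketch of what hand-rolling that conversion might look like when only dict nesting is needed. The function names `split_form_key` and `unflatten_form` are made up for illustration, and the sketch assumes every key is a plain `a[b][c]` path (no list-style `items[]` keys, no conflicting leaf and branch values).

```python
import re
from typing import Any


def split_form_key(key: str) -> list[str]:
    """Split 'customer[profile][name]' into ['customer', 'profile', 'name']."""
    # Every run of characters that is not a bracket is one path segment.
    return re.findall(r"[^\[\]]+", key)


def unflatten_form(form: dict[str, Any]) -> dict[str, Any]:
    """Turn bracket-style form keys into a nested dict (hypothetical helper)."""
    tree: dict[str, Any] = {}
    for key, value in form.items():
        *parents, leaf = split_form_key(key)
        node = tree
        for part in parents:
            # Create the intermediate dict if it is missing, then descend.
            node = node.setdefault(part, {})
        node[leaf] = value
    return tree


form = {"customer[profile][name]": "James", "customer[id]": "42"}
print(unflatten_form(form))
```

The edge cases this ignores (arrays, a key that is both a leaf and a branch, empty segments) are where a dedicated library starts to pay off.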
But it would have been great to have an os.makedirs-style API for dictionaries. In an ideal world the API would look something like this:
tree = {}
tree.set(('customer', 'id', 'name'), {})
This would give us a dict that would look like this
tree = {
    'customer': {
        'id': {
            'name': {}
        }
    }
}
This would throw an exception when the path passes through a key that is not a leaf node and is set to anything other than a dict, i.e. if “profile” were an array or a string.
The recursive defaultdict above should probably work fine for most cases where you control the creation of the objects in question.
Even if you don’t though, I’m not sure this belongs in the standard library (I am in favor of the std library’s json module gaining jsonpath support as functions but not as methods on dicts) as you’ve got a few requirements here that likely don’t apply broadly, namely:
Which would prevent the use of such a standard library inclusion to update existing data.
As it stands, for the nested dict-only case you have, the code below should work. You can tweak it to also raise an error on the last line if d[last] exists prior to setting it.
from typing import Any


def nested_set(d: dict[str, Any], path: tuple[str, ...], value: Any):
    if not path:
        raise ValueError("some message about needing to provide keys here")
    *most, last = path
    # Walk (and create) the intermediate dicts, then set the final key.
    for k in most:
        d = d.setdefault(k, {})
    d[last] = value
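As a rough sketch of the tweak mentioned above, the variant below also refuses to overwrite an existing leaf and raises if an intermediate key holds something other than a dict. The name `nested_set_strict` and the choice of exception types are assumptions made for illustration, not an established API.

```python
from typing import Any


def nested_set_strict(d: dict[str, Any], path: tuple[str, ...], value: Any) -> None:
    """Set a nested key, erroring on non-dict intermediates and existing leaves."""
    if not path:
        raise ValueError("path must contain at least one key")
    *most, last = path
    for k in most:
        d = d.setdefault(k, {})
        if not isinstance(d, dict):
            # e.g. "profile" is a string or a list, so we cannot descend into it.
            raise TypeError(f"{k!r} is set to a non-dict value")
    if last in d:
        raise ValueError(f"{last!r} is already set")
    d[last] = value


tree: dict[str, Any] = {}
nested_set_strict(tree, ("customer", "id", "name"), {})
print(tree)  # {'customer': {'id': {'name': {}}}}
```

Note that intermediate dicts created before an error is raised are left in place; rolling them back would need extra bookkeeping.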
Similar to @blhsing's solution, you could also use the __missing__ method if you want to maintain a dict-like repr, and add some of your own utility methods, e.g. raising an error when trying to get non-existent keys:
from functools import reduce
from operator import getitem


def _get_if_exists(tree, k):
    if k in tree:
        return tree[k]
    raise KeyError(k)


class nesteddict(dict):
    def __missing__(self, key):
        return self.setdefault(key, nesteddict())

    def deepset(self, *keys, value):
        *path, key = keys
        reduce(getitem, path, self)[key] = value

    def deepget(self, *keys):
        return reduce(_get_if_exists, keys, self)


d = nesteddict()
d["a"]["b"] = 5  # same as d.deepset("a", "b", value=5)
print(d.deepget("a", "b"))  # 5
d.deepset("a", "b", value=6)
print(d.deepget("a", "b"))  # 6
print(d)  # {'a': {'b': 6}}
print(d.deepget("a", "c"))  # KeyError: 'c'
If you want to deepset only for existing paths, you could reuse deepget inside deepset.
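One way this could look, as a sketch: the method name `deepset_existing` is hypothetical, and the sketch assumes "existing path" means every parent key must already be present, while the final key may be new.

```python
from functools import reduce


def _get_if_exists(tree, k):
    # Membership tests do not trigger __missing__, so nothing is auto-created.
    if k in tree:
        return tree[k]
    raise KeyError(k)


class nesteddict(dict):
    def __missing__(self, key):
        return self.setdefault(key, nesteddict())

    def deepget(self, *keys):
        return reduce(_get_if_exists, keys, self)

    def deepset_existing(self, *keys, value):
        """Hypothetical variant: set only if all parent keys already exist."""
        *path, key = keys
        # deepget raises KeyError on the first missing parent instead of creating it.
        self.deepget(*path)[key] = value


d = nesteddict()
d["a"]["b"] = 5
d.deepset_existing("a", "b", value=6)
print(d)  # {'a': {'b': 6}}
```

Calling `d.deepset_existing("x", "y", value=1)` on this object raises `KeyError: 'x'` and leaves `d` unchanged.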
What I have to say is: this kind of thing is complicated. It has edge cases. Tons of them. And it is hard to use - as people might have different ideas of what should be the best approach.
As far as I know, there is no preferred library or idiom for this.
Using my code, you can do:
>>> from extradict import NestedData
>>> tree = NestedData({"customer.id.name": "James"})
>>> tree
{'customer': {'id': {'name': <str>}}}
>>> tree.data
{'customer': {'id': {'name': 'James'}}}
The “tree.data” attribute is a regular dictionary - NestedData does some dynamic wrapping if it is used directly.
Feel free to experiment with it and see what could be improved in its ergonomics.
So, for the time being, we can build a reasonable consensus on what could be a better approach to this feature (and we may find that the better thing is to keep this kind of functionality in 3rd-party libs).
That is certainly not easy to type or to read. It is a workaround, but certainly not an argument for saying the feature the O.P. described wouldn’t be welcome.
See flatten_mapping and unflatten_mapping, which flatten a nested dictionary into {('customer', 'id', 'name'): 'James'} and back. This allows access patterns similar to those requested.
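To illustrate the idea (not the actual implementation of those two functions), a tuple-keyed flatten and its inverse could be sketched roughly as follows. The names `flatten_sketch` and `unflatten_sketch` are made up, and the sketch assumes only dicts are treated as branches, with empty dicts kept as leaves so the round trip is lossless.

```python
from typing import Any


def flatten_sketch(tree: dict, prefix: tuple = ()) -> dict[tuple, Any]:
    """Map each leaf to the tuple of keys leading to it."""
    flat: dict[tuple, Any] = {}
    for key, value in tree.items():
        path = prefix + (key,)
        if isinstance(value, dict) and value:
            # Non-empty dicts are branches: recurse with the extended path.
            flat.update(flatten_sketch(value, path))
        else:
            flat[path] = value
    return flat


def unflatten_sketch(flat: dict[tuple, Any]) -> dict:
    """Rebuild the nested dict from tuple paths."""
    tree: dict = {}
    for path, value in flat.items():
        *parents, leaf = path
        node = tree
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return tree


nested = {"customer": {"id": {"name": "James"}}}
flat = flatten_sketch(nested)
print(flat)  # {('customer', 'id', 'name'): 'James'}
print(unflatten_sketch(flat) == nested)  # True
```

With the flat form, `flat[("customer", "id", "name")] = "Jane"` followed by an unflatten gives much the same effect as the deep-set API discussed in this thread.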