Python lacks a built-in bidirectional dictionary implementation.
While traditional dictionaries provide O(1) lookup by key, many use cases require efficient lookup by both key and value.
Currently, developers either maintain two synchronized dictionaries.
The collections module, which already provides specialized container datatypes, is the natural place to add this functionality.
This implementation follows the same concept as boost::bimap in C++ and BiMap in Scala.
While there are third-party Python packages providing similar functionality (like bidict), having this as part of the standard collections module would provide a standard, efficient solution for this use case.
Proposed Solution:
from collections import BiMap
# Create a bidirectional mapping for users
users = BiMap({
1: "john_doe",
2: "alice_smith",
3: "bob_wilson"
})
# Key lookup (O(1))
users[1] # "john_doe"
# Value lookup (O(1))
users.get_by_value("john_doe") # 1
# Adding a new user - automatically handles both directions
users[4] = "emma_brown"
# Check if username exists
"john_smith" in users.values # True
# Check if ID exists
1 in users # True
# Get all usernames or IDs
users.values() # View of all usernames
users.keys() # View of all IDs
Did you use the existing third party package in those projects?
Most projects will use some third party packages. The fact that third party packages can easily be installed means that there should be a strong argument for why something should be included in the stdlib. I doubt that there is a strong enough argument for BiMap which many people will never have used and which is also not hard to implement on top of dict anyway. For many purposes this is enough:
reverse_dict = {v: k for k, v in some_dict.items()}
I personally use bidirectional dicts quite frequently, but have always simply worked around it by creating reverse mappings as @oscarbenjamin already pointed out.
A readily available bidirectional dict is something I would definitely take advantage of if it were in the standard library but is something I would not bother to add as an external dependency. Including it in the stdlib would be an easy code clutter reducer with a minimal effort IMHO.
A quick search of "v: k for k, v" in GitHub yields 102k results, showing that this is indeed a common pattern.
And here’s an example of where it can be used in CPython:
If the keys are all strings and the dict is not going to be altered after definition, you can currently make it an Enum instead (your example code does alter the dict after definition though so I know this won’t fully work for you):
from enum import IntEnum
class Users(IntEnum):
john_doe = 1
alice_smith = 2
bob_wilson = 3
Users(1).name # 'john_doe'
Users['john_doe'] # 1
'john_doe' in Users.__members__ # True
1 in Users # True
Right, I’ve done/seen that. But I think never with having to update them in sync afterwards, i.e., what the proposal would offer.
Is it? If I don’t misread it, the original and the reverse ones are not kept in sync afterwards, see the different updates in the “Non-mutual mappings” section.
Right, it is easy to create a reversed_map using {v: k for k, v in original_map.items()},
but maintaining it is less convenient. You have to update both maps with every change.
I think {v: k for k, v in original_map.items()} is very useful because, in cases where a reversed map is needed temporarily, developers often prefer to use this one-liner instead of adding a third-party dependency.
If this functionality were included in the standard library, it would be more convenient.
It also feels intuitive to include it in the collections module, which already contains many specialized dictionary types
Actually, Scala’s BiMap is not a native type but a 3rd-party type (from Guava or Apache probably). AFAIK, other languages such as Rust, TypeScript or Perl do not support bidirectional mappings natively. So, I’m really not in favor of adding this pattern to the standard library.
Note that in order to have a bidirectional map, we need to restrict to hashable values (if you want O(1) lookup) and handle duplicated values as well. I suspect we’ll likely have just two internal mappings that do the trick and I don’t think it’s more efficient than having two independent dicts that you’ll update simultaneously (you can wrap that in a Mapping-interface with 2 internal dict attributes for instance).
It also feels intuitive to include it in the collections module, which already contains many specialized dictionary types
While they contain specialized dictionary types, they are here for efficiency and are mostly legacy content. So I doubt we’ll want to have more data structures like this.
If you want to add such type, I think a PEP is likely needed. Or we can have some recipe page for that (we have recipes for itertools, but having a recipe for collections could perhaps be a good start).
If it belonged in the standard library at all, it would be included in the collections module. The argument for inclusion in the standard library at all has become harder to make since the Internet, and particularly PyPI, has become more widely available since the early days of Python.
class BidirectionalDict(UserDict):
def __init__(self, *args, **kwargs):
self.inverse_data = {}
super().__init__(*args, **kwargs)
def __setitem__(self, key, value):
if key in self.data:
if value == (old_value := self.data[key]):
return
del self.inverse_data[old_value]
if value in self.inverse_data:
raise ValueError("Duplicate value")
self.data[key] = value
self.inverse_data[value] = key
def __delitem__(self, key):
value = self.data[key]
del self.data[key]
del self.inverse_data[value]
@property
def inverse(self):
inverse_bidict = BidirectionalDict()
inverse_bidict.data = self.inverse_data
inverse_bidict.inverse_data = self.data
return inverse_bidict
Essentially. Possibly with a covariant return for inverse but that’s the idea. We already include a lot of one-liners for itertools and deliberately don’t want to add them to itertools itself so we can do the same for collections.
I’d welcome anyone who’s started writing their own code that implements a bidirectional mapping to read my code, try it out, and let me know how well it works for you.
-Josh
[1] Please feel free to @-mention me in the future! Wouldn’t want to miss any threads here that mention bidict.
I’d personally love to see a bidict available in builtins, never mind just the standard library - I’d consider it as important and as universally-relevant as dict, list and so on. I use the {v: k for k, v in d.items()} construct in numerous places. Usually my use cases are for read-only data but there have been places where I’ve defined a class with some basic functionality for a pair of dicts that can be mutated and kept in sync, so I’d think it’s worth having mutating functionality in there and perhaps leaving the concept of a frozenbidict to be filled in by a third-party package like is currently the case for frozendict. (Strictly speaking, I’d actually argue that frozendict and frozenbidict ought to be in the stdlib too, but having bidict in stdlib is far more important IMO.)
There isn’t any way around this if you want efficient lookup in either direction.
If there were a better solution, then printed foreign language dictionaries wouldn’t have both an English-to-French section and a French-to-English section in the same book.