Add BiMap - a bidirectional dictionary to collections module

Good question.

It can be done with 1 hash table:

class BiMap(dict):
    _r = None

    def __init__(self, inv=False):
        self.inv = inv

    @property
    def r(self):
        if self._r is None:
            self._r = BiMap(not self.inv)
            self._r.update(self)
            self._r._r = self
        return self._r

    def __setitem__(self, key, value):
        full_key = (self.inv, key)
        full_value = (not self.inv, value)
        super().__setitem__(full_key, value)
        super().__setitem__(full_value, key)

    def __getitem__(self, key):
        full_key = (self.inv, key)
        return super().__getitem__(full_key)


if __name__ == '__main__':
    d = BiMap()
    d['a'] = 1
    print(d['a'])
    print(d.r[1])
    print(d.r.r['a'])
    print(d.r.r.r[1])

I am sure one can come with other ways, custom hashtables, etc


But given standard practice is to use 2, I don’t think any of possibilities offer much beyond additional complexity. If you find anything interesting, please share. :slight_smile:

3 Likes

Ideally. If keys + values are distinct then you could store the reverse mapping in the same directory, but I tend not to:

vals = 'ABCDEFGHIJ'
keys = list(range(len(vals)))
# I do this
kvdict = {k: v for k, v in zip(keys, vals)}
vkdict = {v: k for k, v in zip(keys, vals)}
# I suppose one could use this:
bidict = {}
for k, v in zip(keys, vals):
    bidict[k] = v
    bidict[v] = k

See Introduction - bidict for some reasons you might not want to do it this way.

2 Likes

Thanks for the replies. I’m not thinking about a Python implementation. In Python, I think you need two dicts necessarily. But in CPython? Can one hashtable be used safely to store both keys and values?

You could implement the version that @dg-pb described in some other language, but I don’t think this gets you anything valuable.

1 Like

You can think of Python code as pseudocode. It largely applies to everything that can be done in CPython.

In CPython dict usage is pretty much the same.

There are some paths to go deeper, but that would be a rabbit hole and large amount of work with most likely little value. (Have a look at dict implementation to have a sense of work it would need to implement something custom without re-using existing dict).

I created an issue to add my recipe from above to the documentation: Add BidirectionalDict recipe to `collections` · Issue #128001 · python/cpython · GitHub

We’ll see how that goes. It’s simple, maintains insertion order and checks for duplicates.

Q: What is the rationale for not allowing duplicate values in bijective mappings? Usually a new item overwrites the previous one. (I understand that the key previously associated with the value must be removed)

What is the rationale for not allowing duplicate values in bijective mappings

If you have an implementation based on two dicts, you don’t want have the key->val dict containing key1->VAL and key2->VAL while the other contains VAL->key1 or (exclusively) VAL->key2.

Now, OOP-wise, I think a bidict should expose a “read” view in terms of “value to key” and a “read-write” interface for “key to value”. In other words, you should not be able to insert a new entry using its value. Sorry for being unclear when I said “no duplicated values”.

IMO BiMap is a completely symmetric data structure, I see no reason why the inverse mapping needs to be read-only.

One reason is: what would x.keys() be after a call x[k] = v? is v the key or is it k? For instance if we consider the loggingLevels example, using loggingLevels[name] = value and then loggingLevels.keys() should yield the level names, but not the values. If we were to allow values to be mapped to keys, then this would be via a separate method.

I wouldn’t mind this separate method but for a first iteration, I’d prefer sticking with something simple. Unfortunately, we rejected adding a recipe so I think users will need to use 3rd-party libs.

My expectation is:

fwd = bidict(a=1, b=2, c=3)
rev = fwd.reverse()
assert type(fwd) is type(rev) is bidict
assert rev.reverse() is fwd
rev[4] = 'd'
assert fwd.keys() == rev.values() == {'a', 'b', 'c', 'd'}
assert fwd.values() == rev.keys() == {1, 2, 3, 4}

And I’d like to add an example related to my question about duplicates.
What should be the outcome and why?

bi = bidict(a=1, b=2, c=3)
bi['a'] = 3

# option 1:
assert bi == {'a': 3, 'b':2}  # 'c':3 is auto-deleted

# option 2: Some sort of  <Duplicate_value_exception> raised
1 Like

See: Basic Usage - bidict

This could result in surprises or problems down the line.
Errors should never pass silently. Unless explicitly silenced.

If you want to overwrite the key, you can do this instead:

bi.reverse[3] = "a"

Thanks for the link. Well, a duplicate value error is what most people here want and expect. The implication is that updating requires special attention. Imagine bi = bidict(a=1, b=2, ...) that should be modified to bidict(a=2, b=1, ...).

bi['a'] = 2 # error (clash with 'b':2)
bi.reverse[2] = 'a' # error (clash with 1:'a')

The linked article shows a forced update, etc. but the added complexity and again several options how to deal with it persuaded me that the proper place for a bi-dictionary implementation is a separate module in PyPi.

A frozen bidict - which does not have any problems with duplicities and updates - is still quite a helpful data structure. But we have that as stated immediately at the discussion’s beginning:

{v: k for k, v in mydict.items()}

@jab, would a recipe be useful to add in the documentation of bidict?

This is how I solved this one for myself.

I have 2 objects:

  1. OneToOne
  2. ManyToMany

OneToOne is strictly Bijection - Wikipedia. Duplicate keys/values raise an error.

ManyToMany doesn’t have this restriction. It is a graph-like in a sense that one key can lead to many values. And its operations are edge-like.

Both above have forward and inverse mappings read & write.

1 Like

I’d guess delete operations in the many to many case would need some thought?

In the wikipedia example of the many-to-many, bidirectional mapping between books and authors, you can add books with many authors and authors with many books. On deleting books, you could update Authors to eventually have no books. You could then delete the author, but you would also loose the information that they are an Author. I guess if you are allowed to add books with no authors and Authors without books then you need to be able to keep Authors without books on deletion of their last book, and vice-versa. You would presumably be able to add Authors to existing books and books to existing Authors, incrementally too?

  • I don’t need it now, so I’ve no code. But it is interesting
 :slight_smile:

I think more simple way to think about it is:

  1. You have 2 sets of vertices: A & B
  2. And you can have edges from a in A to b in B

Bottom right:


The rest is features and implementation detail.

1 Like