Add to random.shuffle, allowing shuffle of dict key/value pairs

xxxxme · September 8, 2024, 6:25pm

random.shuffle currently does not work on dictionaries (unless the keys happen to be integers in a range starting at 0):

>>> import random
>>> a = {0: 'dead parrot', 1: 'cat named Eric', 2: 'inverted piglet'}
>>> random.shuffle(a)
>>> print(a)
{0: 'cat named Eric', 1: 'dead parrot', 2: 'inverted piglet'}
>>> b = {'parrot': 'dead', 'cat': 'named Eric', 'piglet': 'inverted'}
>>> random.shuffle(b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.12/random.py", line 357, in shuffle
    x[i], x[j] = x[j], x[i]
                 ~^^^
KeyError: 2

The idea is to modify random.shuffle to shuffle dictionary’s key/value pairs even if the keys don’t happen to be a range of integers.
An example program taking advantage of this:

#!/usr/bin/env python

# Read a letter-substitution cipher using your secret code!

alpha = {'A': 'A', 'B': 'B', 'C': 'C', 'D': 'D', 'E': 'E', 'F': 'F', 'G': 'G', 'H': 'H', 'I': 'I', 'J': 'J', 'K': 'K', 'L': 'L', 'M': 'M', 'N': 'N', 'O': 'O', 'P': 'P', 'Q': 'Q', 'R': 'R', 'S': 'S', 'T': 'T', 'U': 'U', 'V': 'V', 'W': 'W', 'X': 'X', 'Y': 'Y', 'Z': 'Z'}

enciphered = input('Enter the secret message to decode: ')
seed = input('Enter your secret code: ')
print('Secret decoder ring activated!')

random.Random(seed).shuffle(alpha)
plain = ''
for letter in enciphered:
   plain += alpha[letter]
print('It says:', plain)

Edit: Fixed a typo in the code.

jamestwebber · September 8, 2024, 6:54pm

random.shuffle is documented to work on sequences. I think the fact that it works on dictionaries at all is just a quirk of the implementation^[1], not intended behavior.

specifically: it’s swapping pairs of ~~keys~~ indices based on the length of the input, so a dictionary with sequential integer keys happens to work ↩︎

barry-scott · September 8, 2024, 6:55pm

The purpose of random.shuffle is to change the order of a list, in place.
What would that mean for a dict?

jamestwebber · September 8, 2024, 6:58pm

Also, the desired behavior (swapping around the keys and values of a dictionary) can be implemented with random.sample

shuffled_dict = dict(zip(random.sample(d.keys(), k=len(d)), d.values()))

tjreedy · September 8, 2024, 7:44pm

The ' at the end should be removed.

I was initially surprised that dict a worked, but LOL, it simulates a sequence of length 3 by having keys in range(3). The ability of dicts to simulate a sequence is used in practice to implement sparse arrays, where only a few indexes in a large range have non-default values.

xxxxme · September 8, 2024, 10:21pm

In my mind, it should treat a dict as a list that has much more freedom as to what can be an index and keep the keys in place (or not, it doesn’t matter) but shuffle the values, which is what happens with a list.

chepner · September 8, 2024, 10:44pm

I would sooner expect random.shuffle to just change the iteration order of a dict than actually break the existing connections between the keys and their values. It might be worth having a dedicated random.shuffle_mapping function that does what you want, but I don’t think overloading random.shuffle is the right thing to do.

Rosuav · September 8, 2024, 11:26pm

You can get the keys and values (or items) of the dictionary, shuffle, and then construct a dictionary from that.

bschubert · September 9, 2024, 1:35am

Not Available:

But there are problems here as in it depends on how python deals with order. I did a test and the dictionary it produces is not in the order I gave it:
>>> import pprint
>>> 
>>> mydict = {
...   "first" : 1, 
...   "second" : 11, 
...   "third" : 111,
...   "forth": 1111,
...   "I take the fifth": 11111}
It seems to be alphabetical order with upper case first!

That’s because pprint sorts dictionary keys by default. You can pass sort_dicts=False or use pprint.pp instead of pprint.pprint to preserve the default order.

Paddy3118 · September 9, 2024, 10:28am

Just thinking of what shuffle should do to dicts, I think that if the dict is ordered then it should shuffle the key order, but keep the mapping. If the dict is not ordered then it should give an error.

Rosuav · September 9, 2024, 12:00pm

Regular dictionaries retain their insertion order. collections.OrderedDict have an order to them that can be manipulated (notably, you can move any key to the end). You could, for example, Fisher-Yates an OrderedDict; but I don’t really see the point, when you could simply take the .items(), shuffle that list, and then reconstruct the dictionary.

Rosuav · September 9, 2024, 12:00pm

Irrelevant if you’re shuffling a brand new object.

avi.gross · September 9, 2024, 3:45pm

@Rosuav may want to clarify the above.

In languages using immutable objects, I believe that there is an exception at object creation time in that initializing built-in objects often is done in a compact way without multiple versions being copied as you go.

But are you suggesting random.shuffle falls into this category by being called after an object has been created? I am not clear on how the function operates by changing the object, such as a list, internally. But like you, I am not convinced that an internal change is always the best way to go.

Just for a moment, consider a need to sort the contents of something like a list or dictionary. There are literally hundreds of sort algorithms out there and some, like a merge-sort work by taking subsets recursively. Many of these algorithms are made quite efficient if operating on a particular data structure.

But if you want to sort the contents of some other structure and the only hooks are to extract an arbitrary member or add a member to the end, you may be able to do it but might well do it much better as you described by extracting the contents, sorting them the usual external way, and then rebuilding it.

With an immutable object, that may be the main or only way to do it. With mutable objects, there may be a choice.

In playing around with the code, I have a dumb question. Why does random.shuffle normally return None when returning the (changed) object itself might be useful?

pf_moore · September 9, 2024, 4:16pm

That is a very common question which has been answered multiple times. Basically, though it’s simply because that is the convention that has been used in Python for many years (the reasons it is this way are valid, but at this point “because that’s the convention” is probably at least as important if not more).

There’s probably a bunch of background research it’s worth doing to get a better feel for the design principles here, if you’re not aware of this convention. (Sorry, I don’t have time right now to find some useful links for you, but a search of the various Python forums and mailing lists should get some).

brass75 · September 9, 2024, 6:14pm

I think this will get you what you want:

>>> alpha = {'A': 'A', 'B': 'B', 'C': 'C', 'D': 'D', 'E': 'E', 'F': 'F', 'G': 'G', 'H': 'H', 'I': 'I', 'J': 'J', 'K': 'K', 'L': 'L', 'M': 'M', 'N': 'N', 'O': 'O', 'P': 'P', 'Q': 'Q', 'R': 'R', 'S': 'S', 'T': 'T', 'U': 'U', 'V': 'V', 'W': 'W', 'X': 'X', 'Y': 'Y', 'Z': 'Z'}
>>> {k: v for k in random.sample(list(alpha.keys()), len(alpha)) for v in alpha.values()}
{'G': 'Z', 'F': 'Z', 'L': 'Z', 'J': 'Z', 'T': 'Z', 'N': 'Z', 'H': 'Z', 'P': 'Z', 'I': 'Z', 'S': 'Z', 'M': 'Z', 'K': 'Z', 'U': 'Z', 'E': 'Z', 'B': 'Z', 'X': 'Z', 'O': 'Z', 'V': 'Z', 'C': 'Z', 'W': 'Z', 'Z': 'Z', 'Y': 'Z', 'R': 'Z', 'A': 'Z', 'D': 'Z', 'Q': 'Z'}

Since this can be done in a 1 line comprehension I don’t think there’s really any need to change the functionality in random (either by adding a new function or modifying the existing one) for a use case that is probably not that common - breaking the link between key and value in a dictionary.

tjreedy · September 9, 2024, 6:17pm

Prevention of bugs from users not learning (yet) or forgetting that the function is a procedure, a function with only side-effects, that creates no new objects. As it is, beginners often have trouble with aliases that they create themselves. Common question given someting like the following:

a = [1,2]
b = a
b[0] = 3

Why has a changed? Or even “I discovered a bug!”

jamestwebber · September 9, 2024, 6:24pm

Your answer returns 'Z' for every single value, which presumably isn’t desired (very difficult cipher to crack, though).

But there are already multiple solutions in this thread for the OP.

brass75 · September 9, 2024, 6:39pm

Yeah… I didn’t notice that. Oops.

oscarbenjamin · September 9, 2024, 6:57pm

It could have always been possible though to call the function shuffled though and have it take an iterable, return a new list, and not modify the original. I think that is almost always the function that I would have preferred to be able to use rather than shuffling a list in place.

If you had that then the answer here for the OP would be:

d2 = dict(zip(d.keys(), shuffled(d.values())))

jamestwebber · September 9, 2024, 7:01pm

See for example this thread about that idea.

It’s basically an alias for random.sample(input, k=len(input)) and it seems like the previous response was that an alias isn’t that useful for something that simple.