Update secrets examples to use 64 bytes (512 bits) and not 16 bytes (128 bits)

In the secrets documentation, the examples of creating a secret all use 16 bytes, and only after the API documentation did I see "How many bytes should tokens use? 32+". This did bite me in the butt because I had created the key for my ASP.NET production environment in August using Python and this token_hex example. I just upgraded my NuGet packages from 6.X to 7.X and got an error that key sizes greater than 256 bits are required. I’m glad that the app I am building is still in the testing phase, but if this app had millions of users, a key change like this would force all of them to log in again.

I propose that these examples be updated to use 48 bytes or more so that people like me, who are just interested in having a secret key for their backend’s auth system, are not going to be using outdated examples. I want to trust that the examples given are not going to cause problems down the road. I am glad that Microsoft decided to start enforcing 256 bits, as that is what security is about, but I doubt Flask’s session or any other token creation code in Python has such checks. This is why it is imperative to update these examples.

Alternative suggestion: Keep the existing examples but add others that don’t pass any size whatsoever. Part of the point of the secrets module is that the defaults are intended to be secure, and may very well change. At the moment, for example:

>>> secrets.token_hex()
'153243500f30d40a07fe4225e4dbe34b0d6981ea1d6366a8e347f173aeb291a4'
>>> secrets.token_hex(32)
'101debad4ad6b28d2aa9bf57f109615fc8f9f12054e9a6184b13f0cb7b225dde'

but in the future, token_hex() might return an even longer token. (See secrets — Generate secure random numbers for managing secrets — Python 3.12.0 documentation)

IMO it’s good to have examples showing explicit entropy selection, but in general, people should be encouraged to let it use the default unless there’s a good reason not to. Even if you need to maintain internal compatibility, I would still recommend checking what the default is.
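One way to check the current default without relying on the undocumented secrets.DEFAULT_ENTROPY is to call a token function with no argument and measure the result. This is only a sketch; the variable names are made up for illustration, and the number it prints depends on the Python version you run it on.

```python
import secrets

# Measure the current default without touching the undocumented
# secrets.DEFAULT_ENTROPY attribute: call with no argument and count.
default_nbytes = len(secrets.token_bytes())
print(f"default: {default_nbytes} bytes ({default_nbytes * 8} bits)")

# token_hex() encodes each byte as two hex digits, so the default
# hex token is twice as long as the default byte count.
default_hex_len = len(secrets.token_hex())
print(f"default hex token length: {default_hex_len} characters")
```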

Side note: Is secrets.DEFAULT_ENTROPY meant to be private or public? It doesn’t have an underscore prefix, but it isn’t documented.

Did you read the section “How many bytes should tokens use?” in secrets — Generate secure random numbers for managing secrets — Python 3.12.0 documentation, which talks about this issue?

That seems to be a suitable warning to not assume defaults are necessarily fit for any specific purpose.

On the contrary, the intention is that the default should be suitable for most purposes, but is unstable. Specifying the exact length is an option if you need stability or compatibility, but if you use it to give an increase in security, it’s on you to ensure that you keep on lengthening it.
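To illustrate, if some consumer of the key imposes a minimum size, one option is to treat that minimum as a floor and still take the module default whenever it is larger. A minimal sketch, assuming a hypothetical 48-byte floor; MIN_KEY_BYTES and make_key are invented names, not part of the secrets module.

```python
import secrets

# Hypothetical floor imposed by whatever consumes the key.
MIN_KEY_BYTES = 48


def make_key(min_bytes=MIN_KEY_BYTES):
    """Return a hex key of at least min_bytes, or the default if that is larger."""
    # Measure the current default instead of hard-coding it, so the key
    # grows automatically if a future Python raises the default.
    default_bytes = len(secrets.token_bytes())
    return secrets.token_hex(max(min_bytes, default_bytes))


key = make_key()
print(len(key) // 2, "bytes")
```

That way the explicit size only ever raises the length, and you don't have to remember to bump it if the default later overtakes it.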

Indeed, and the module’s PEP (506) also covered this:

One difficult question is “How many bytes should my token be?”. We can help with this question by providing a default amount of entropy for the “token_*” functions. If the nbytes argument is None or not given, the default entropy will be used. This default value should be large enough to be expected to be secure for medium-security uses, but is expected to change in the future, possibly even in a maintenance release [14].

The docs say:

As of 2015, it is believed that 32 bytes (256 bits) of randomness is sufficient for the typical use-case expected for the secrets module.

The next question: as of 2023, is a higher value sufficient? The module was created in 2015 and the value hasn’t yet changed.

My understanding: if it’s not documented and not in __all__, then consider it private (regardless of underscore prefix).

This is neither, so best to play safe and consider it private.

The default is currently 32 in the version of Python I’m running. That part is fine. What’s an issue, though, is that the examples ONLY show with a parameter being given, and that parameter is 16. Given the module’s purpose, I would be inclined to either remove the parameters or at least add further examples without them, perhaps with a code comment alongside them saying that the result will be at least this long.

I’m not sure of the policy on linking to external websites in the Python docs, but would it be worth linking to something like Keylength - Compare all Methods?

From @rhettinger two weeks ago:

For a session token, the current recommendation will likely always be sufficient. Even bitcoin uses SHA-256.

The repr of a 16-byte random value is about 50 characters long on average. For 32 bytes it is about 95 characters, and for 64 bytes about 187 characters.

>>> (len(repr(bytes(range(256))))-3)/256*16+3
48.9375
>>> (len(repr(bytes(range(256))))-3)/256*32+3
94.875
>>> (len(repr(bytes(range(256))))-3)/256*64+3
186.75

Even 95 characters is too much to display (think how it would look in PDF), let alone 187. This is the reason for using 16 bytes in the examples.
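For comparison, the text encodings are more compact than a bytes repr, and their lengths are fixed for a given byte count. A quick sketch; the loop and the lengths dict are just for illustration.

```python
import secrets

# Length of each text encoding for the byte counts discussed above.
lengths = {}
for nbytes in (16, 32, 64):
    # Two hex digits per byte.
    hex_len = len(secrets.token_hex(nbytes))
    # Base64 with the padding stripped, roughly 1.3 characters per byte.
    urlsafe_len = len(secrets.token_urlsafe(nbytes))
    lengths[nbytes] = (hex_len, urlsafe_len)
    print(nbytes, hex_len, urlsafe_len)
```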

Perhaps the examples should use an ellipsis, e.g.

>>> token_bytes()
b'\xebr\x17D*t\xae\xd4\xe3S\xb6\xe2\xebP1\x8b...'

I like that idea. Do you think anyone would be confused by it?

# doctest: +ELLIPSIS is pretty standard in docstrings, and I’ve never had complaints about docs that use it.
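Here is roughly what that looks like in a docstring. This is only a sketch: new_token is a made-up wrapper, and it uses token_hex() rather than token_bytes() because a bytes repr can switch to double quotes when the value contains a single quote, which would break a b'...' pattern.

```python
import doctest
import secrets


def new_token():
    """Return a random hex token using the module's default entropy.

    The value differs on every call, so the expected output is
    abbreviated with an ellipsis:

    >>> new_token()  # doctest: +ELLIPSIS
    '...'
    """
    return secrets.token_hex()


# Run only this function's docstring examples and collect the counts.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(new_token, "new_token", module=False,
                        globs={"new_token": new_token}):
    runner.run(test)
results = runner.summarize()
print(results)
```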