Expose `hashlib` and `hmac` as command-line utilities [withdrawn proposal]

I decided to withdraw this proposal with reasons explained in Expose `hashlib` and `hmac` as command-line utilities [withdrawn proposal] - #6 by picnixz.


The modules with a CLI are listed on Modules command-line interface (CLI) — Python 3.14.2 documentation. Among them we have base64 and encodings.rot_13 (which I wasn’t aware of until today). base64 is also available as a Unix command so we’re happy, and rot_13 is likely unused by most people. Now, offering base64 as a CLI tool is actually good on non-Unix systems such as Windows as they usually lack such built-in features.

We recently added a CLI for random and I would like to do the same for hashlib and hmac. The tool would expose the following features:

  • Indicate which algorithms are available for the given interpreter (something users may not know), and possibly which implementation is being used (HACL* or OpenSSL), though this may require a bit more work as I don’t think hashlib objects retain their original implementation (except via their base class).
  • Compute the digest/MAC of a given string, file, or stream (thus avoiding the need for echo text | <command> or <command> <<< string in general, and making it work on all platforms).
  • Compare digests/MAC of two strings, files or streams.
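
As a rough illustration of the first bullet, the names an interpreter supports can already be read from hashlib itself. This is only a sketch: `list_algorithms` is a hypothetical helper name, and it does not report the backing implementation (HACL* or OpenSSL), which is the part that would need more work.

```python
import hashlib


def list_algorithms() -> list[tuple[str, bool]]:
    """Return (name, guaranteed) pairs for every algorithm this interpreter offers.

    Hypothetical helper for illustration; not part of hashlib.
    """
    guaranteed = hashlib.algorithms_guaranteed
    return [(name, name in guaranteed) for name in sorted(hashlib.algorithms_available)]


for name, is_guaranteed in list_algorithms():
    # "guaranteed" means available on every platform; the rest depend on the build.
    marker = "guaranteed" if is_guaranteed else "build-dependent"
    print(f"{name:<20} {marker}")
```

The set in `hashlib.algorithms_available` can contain the same algorithm under several names, so the listing may show duplicates.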

One could say that the openssl command line already provides this, but OpenSSL commands are annoying to remember from one invocation to another (and the output may need sanitization as it contains characters other than the pure digest). On the other hand, OpenSSL’s BLAKE2s/2b implementation provides less configuration than what Python is able to offer.
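
For context, here is the kind of BLAKE2 configuration Python exposes through `hashlib.blake2b`. It is a minimal sketch with illustrative values (the key and personalization strings are made up), and it makes no claim about which of these options the openssl command supports.

```python
import hashlib

data = b"example payload"

# Truncated 16-byte digest instead of BLAKE2b's default 64 bytes.
short = hashlib.blake2b(data, digest_size=16).hexdigest()

# Keyed hashing; the key here is illustrative only.
keyed = hashlib.blake2b(data, key=b"demo-key", digest_size=16).hexdigest()

# Personalization: same input, different digest per application label.
app_a = hashlib.blake2b(data, person=b"app-a", digest_size=16).hexdigest()
app_b = hashlib.blake2b(data, person=b"app-b", digest_size=16).hexdigest()

print(short, keyed, app_a, app_b, sep="\n")
```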

Likewise, it’s possible to compare checksums via sha256 -c file1 file2 or filecmp in Python (which does not compute checksums by the way).

My main motivation is therefore to expose a unified entrypoint for common cryptographic operations.


As a maintainer of hashlib, I could have pushed for this functionality without opening a thread here, but I wanted to hear from the community first as the intended audience is different.


I like the idea in general. It’s a useful and handy utility.

Two details to think about:

  • bytes/strings duality… let’s say that hashlib.sha256 is exposed… this will receive bytes: cat foo.mp3 | python -m hashlib.sha256, while the following will receive a string: python -m hashlib.sha256 foob123 … how do you plan to deal with this?

  • if your idea is to offer “extended functionality” (like comparing the hash of two files) beyond simple hashing (as in my example before), you should detail all the proposed functionalities, otherwise it’s hard to evaluate… if it’s not trivial maybe a PEP would be the best format

Thanks!!


At least some feedback (and thank you for this!). I was expecting people to argue about this proposal a bit more!

hashlib.sha256 foob123 would be equivalent to hashing a file named foob123. In order to pass a string, I think adding a -c flag to indicate that this is a string would be better (like we do for python -c content vs python file.py). It’s also better than having a pipe redirection (I still want to stress that I’m not reinventing a CLI; all of this can already be achieved on Unix via various commands and post-processing). Now, I don’t know if people are more likely to hash a string or a file, so we can decide on whether hashlib.sha256 input treats input as a file or a string and whether we have a -c option to indicate a string or an -f option to indicate a file.
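
To make the file-versus-string question concrete, here is one way the -c idea could look with argparse. Everything about it is an assumption for illustration: the program name `hashlib-sketch`, the restricted algorithm list, the UTF-8 encoding of the string, and the choice to treat the bare argument as a file path. No such CLI exists in hashlib.

```python
import argparse
import hashlib


def parse(argv: list[str]) -> argparse.Namespace:
    # Hypothetical interface: positional argument = file path, -c STRING = literal text.
    parser = argparse.ArgumentParser(prog="hashlib-sketch")
    parser.add_argument("algorithm", choices=["md5", "sha1", "sha256", "sha512"])
    parser.add_argument("-c", dest="text", metavar="STRING", help="hash this string")
    parser.add_argument("path", nargs="?", help="hash the contents of this file")
    args = parser.parse_args(argv)
    # Exactly one of the two input kinds must be given.
    if (args.text is None) == (args.path is None):
        parser.error("give either a file path or -c STRING")
    return args


def run(argv: list[str]) -> str:
    args = parse(argv)
    hasher = hashlib.new(args.algorithm)
    if args.text is not None:
        # Assumption: strings are encoded as UTF-8 before hashing.
        hasher.update(args.text.encode("utf-8"))
    else:
        with open(args.path, "rb") as stream:
            # Read in chunks so large files are not loaded into memory at once.
            for chunk in iter(lambda: stream.read(65536), b""):
                hasher.update(chunk)
    return hasher.hexdigest()


print(run(["sha256", "-c", "abc"]))
```

Flipping the default (bare argument is a string, -f marks a file) would only change which of the two arguments is positional.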

if your idea is to offer “extended functionality” (like comparing the hash of two files) beyond simple hashing (as in my example before), you should detail all the proposed functionalities, otherwise it’s hard to evaluate… if it’s not trivial maybe a PEP would be the best format

Computing the hash of a file is already exposed by hashlib as hashlib.file_digest and comparing the digests is exposed by hmac via hmac.compare_digest, so I was just planning to compose those two functions together. Apart from listing the available algorithms and their implementations, as well as computing digests or MACs (that is, exposing hashlib.<algorithm>(data).hexdigest() and hmac.digest(key, data, algorithm).hex() as a CLI, with the key read from a file or possibly given directly on the command line, which is not recommended but can be useful for quick debugging), I don’t think I will expose more (we don’t even provide a generic signature algorithm and I don’t think I want to expose more for now).

For computing digests, I can offer multiple output formats:

  • bytes so that it can be redirected to a file; typically useful when you want to create some test file (though I don’t think it’s really useful by default)
  • hexadecimal format with possible grouping options
  • maybe coreutils compatible format? (maybe not as a first iteration)
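
A quick sketch of what those formats could look like, using only existing `bytes` methods. The group size of 4 bytes and the file name `example.txt` are arbitrary choices for illustration, and the last line assumes the text-mode layout printed by sha256sum (digest, two spaces, file name).

```python
import hashlib

digest = hashlib.sha256(b"abc").digest()

# Raw bytes: suitable for sys.stdout.buffer.write(raw) and redirection to a file.
raw = digest

# Plain hexadecimal.
plain_hex = digest.hex()

# Hexadecimal grouped every 4 bytes, separated by spaces.
grouped = digest.hex(" ", 4)

# Assumed coreutils-style line: "<hex digest>  <file name>".
coreutils_line = f"{plain_hex}  example.txt"

print(plain_hex)
print(grouped)
print(coreutils_line)
```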

If we’re going to keep pushing on with these module CLIs then I’d really like to get an answer to the question of where will they stop?

These two justifications are applicable to the majority of UNIX commands – almost every well known UNIX CLI is unavailable on vanilla Windows and there will inevitably always be a previously added module CLI to justify adding the next one. And there will also always be at least one person who thinks it could be useful so limiting this proliferation to what’s “useful” isn’t going to be much of a filter.

All such proposals also have to address the same counterarguments of:

  • Why not use a cross platform shell if you want a cross platform shell
  • Why not use python -c
  • Why not use python -m fire module
  • When is the python command really that much more universally available than these standard UNIX commands anyway[1]

With that in mind, any “Let’s add a CLI for x” proposal is synonymous with “Let’s re-implement our own Python version of coreutils + anything else that’s reasonably standard and put it in the standard library for people who want a cross platform shell but won’t use a cross platform shell”. If that’s what we really want then let’s make that the proposal[2]. If that’s not where we want to end up then that should be an automatic no to python -m hashlib or python -m shutil or any of the other previous proposals along these lines.


  1. It’s not CI/CD since any usable Windows CI runner will have git for Windows preinstalled which provides bash and coreutils. It’s not an end user’s machine since they likely don’t have Python at all. And it’s not a developer’s machine since the allegedly portable python command has a pretty good chance of really being python3 or python3.x or something involving uv run or conda exec. ↩︎

  2. and design these CLIs as if they were one unified idea, addressing problems like how people discover these easter egg CLIs. Maybe even consider moving them into one python -m shell module so that they’re not wasting initialisation time ↩︎

I think people often don’t realize that it’s actually more reliably available on Windows than assuming a Unix-like system has the command, and that the command existing means it has the semantics you expect.

Powershell example below

[Text.Encoding]::Utf8.GetString([Convert]::FromBase64String('UG93ZXJzaGVsbA=='))

Is it expected that anyone would use the further configuration possible from a CLI? Some of the things blake2 is capable of don’t seem like the kind of things I’d expect a cli hashing tool to expose to users, I’d expect an application needing those features to wrap the one correct application-specific way to use it.

Part of the reason PGP is so error prone is the amount of functionality it exposes that users shouldn’t be responsible for configuring. Part of the reason people can’t remember the specific invocations of openssl that might do what they want is that it exposes a complex library as a CLI. This seems no different to me.

After thinking about this proposal a bit more, and by gathering your feedback (thank you), I think I will actually reject my own proposal. One major reason is that, unless I’m doing CTFs or other “quick” tasks, I indeed don’t see a real need for a sanitized output without filenames when I can already do it with existing tools. Sure there are times when it’s annoying but I would likely spend the same time typing python -m ... or echo text | ....

There is definitely value in having a CLI for Windows for, e.g., computing file digests without having to use unreadable calls but as I personally won’t be the one affected, it becomes less convincing to myself.

Finally, it was historically proposed (I actually forgot to look at the issue tracker this time) in hashlib command line interface · Issue #70675 · python/cpython · GitHub but, despite many core devs supporting this addition, it was eventually decided to reject it. Quoting Guido:

I prefer not to go down this road. The modules that do this where I use it
are typically Python specific, e.g. pdb or timeit. In the past we sometimes
had little main() functions in modules for testing but I think we have
better ways to test modules these days.

After reading that thread I also became less convinced. So thank you all for the valuable feedback and sorry for having taken your time for a dead proposal (I also don’t think I would support future CLI proposals for those specific modules).


Even though I rejected my own proposal, I will reply to some of your questions:

  • Why not use a cross platform shell if you want a cross platform shell

My main concern (and my annoyance with existing tools) is that the output sometimes needs sanitization before I can directly use it. So using a cross platform shell won’t help if the tools themselves don’t produce the best output.

  • Why not use python -c

Computing file digests with python -c is honestly painful (you can’t just pass filepaths, you’ll need to pass a file object). And comparing file digests requires combining two functions (hashlib.file_digest and hmac.compare_digest).

So, I wouldn’t recommend doing it and would rather suggest creating a real file script which is also annoying. My proposal was about skipping that second annoying thing.

For HMAC, it should work because we have hmac.digest which computes the digest (though only as bytes…) but not for the other functions, as you get either an HMAC or a HASH object and not a readable hexadecimal string. Unless you can call a method on the object you constructed via fire (I don’t know if it’s possible), it won’t work (that is, unless you can do hashlib.new(algorithm, data).hexdigest()).
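
The difference in return types is easy to see side by side. The key and message below are placeholder values; the snippet only shows what each existing call returns, not how fire would handle them.

```python
import hashlib
import hmac

key = b"placeholder-key"
message = b"placeholder message"

# One-shot function: returns the MAC directly, but as raw bytes.
mac_bytes = hmac.digest(key, message, "sha256")

# Constructors: return objects, so a further method call is needed for readable output.
mac_obj = hmac.new(key, message, "sha256")
hash_obj = hashlib.new("sha256", message)

print(type(mac_bytes).__name__, mac_bytes.hex())
print(type(mac_obj).__name__, mac_obj.hexdigest())
print(type(hash_obj).__name__, hash_obj.hexdigest())
```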

any “Let’s add a CLI for x” proposal is synonymous with “Let’s re-implement our own Python version of coreutils + anything else that’s reasonably standard and put it in the standard library for people who want a cross platform shell but won’t use a cross platform shell”.

I disagree here but I don’t think I will be able to convince you here. I think it’s also fine to offer similar but centralized tools whose implementation solely depends on Python itself. There are also some modules that do need a CLI IMO (e.g., ast or dis) because they are very Python specific as Guido mentioned.

Powershell example below

If this is the alternative to python -m base64, then I wouldn’t want it and I am happy that python -m base64 exists.

Is it expected that anyone would use the further configuration possible from a CLI?

Honestly, no. Which is one of the reasons that made me hesitate in the end.
