Nicer interface for str.translate

kknechtel · July 28, 2023, 10:33am

The .translate method gets around on GitHub. Certainly some of the search hits are false positives, but the first results I saw were dominated by what appear to be real uses of str.translate. It’s certainly not as popular as .title, but it’s way more common than, say, .removeprefix or .swapcase, and possibly more popular than e.g. .isalpha. I’d call that pretty significant.

In my own experience, I commonly get halfway through suggesting str.translate to beginners for their toy cipher projects and such before remembering how much I have to explain after that point. It also comes up organically when discussing the general topic of sequence-processing algorithms: I might discuss one-to-one mapping and filtering with a list comprehension, then mention one-to-many mapping and how to get a flat result, then how comprehensions don’t care about the input type but dictate an output type… and then str has special machinery, right there, but it’s not so pleasant to explain.
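To make that progression concrete, here is a minimal sketch of the steps as I would show them to a beginner. The sample text and the mappings (`swap`, `expand`, `table`) are made up for illustration; only `str.join`, `str.translate` and `str.maketrans` are real library calls.

```python
text = "hello world"

# One-to-one mapping plus filtering with a list comprehension.
# The comprehension yields a list, so the str has to be rebuilt with join.
swap = {"l": "L", "o": "0"}
mapped = "".join([swap.get(c, c) for c in text if c != " "])

# One-to-many mapping: each character may expand to several,
# and join flattens the pieces into a single str.
expand = {"l": "LL", "o": "0"}
expanded = "".join(expand.get(c, c) for c in text)

# str.translate handles mapping, expansion and removal in one call,
# but the table is keyed by code points (ints); None means "remove".
table = {ord("l"): "LL", ord("o"): "0", ord(" "): None}
translated = text.translate(table)

# str.maketrans with a single dict argument converts
# one-character str keys to code points.
same_table = str.maketrans({"l": "LL", "o": "0", " ": None})

print(mapped)      # heLL0w0rLd
print(expanded)    # heLLLL0 w0rLLd
print(translated)  # heLLLL0w0rLLd
```

The explaining starts at the `ord` calls: a dict keyed by one-character strings looks natural, but passed directly to str.translate it silently does nothing, because lookups are done by integer code point.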

Interestingly enough, there is also a bytes.translate and bytes.maketrans, which have equally annoying interfaces that are also obnoxiously inconsistent with the str ones. The mapping for bytes specifically needs to be a bytes object of length 256 (or a bytearray, memoryview etc.; but the built-in help doesn’t say that! Or None to specify only removals, but it doesn’t say that either!), and there’s a separate argument for values to remove. A dict with integer keys (it makes sense this time!) isn’t acceptable, and neither is a list or tuple. Input bytes can be mapped one-to-one or removed, but not mapped one-to-many. bytes.maketrans takes exactly two arguments, which together give the one-to-one mapping.
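For comparison, a short sketch of the bytes side, assuming a recent CPython 3 where `delete` can be passed by keyword. The sample data and variable names are only for illustration.

```python
data = b"hello world"

# bytes.maketrans takes exactly two equal-length arguments and
# returns a 256-byte table: one-to-one mapping only.
table = bytes.maketrans(b"lo", b"L0")
table_length = len(table)

# Map only.
mapped = data.translate(table)

# Remove only: pass None as the table and the bytes to drop separately.
removed = data.translate(None, b" ")

# Map and remove in one call.
both = data.translate(table, delete=b" ")

# A dict with integer keys is not accepted as a table.
try:
    data.translate({ord("l"): ord("L")})
except TypeError:
    dict_rejected = True
else:
    dict_rejected = False

print(mapped)   # b'heLL0 w0rLd'
print(removed)  # b'helloworld'
print(both)     # b'heLL0w0rLd'
```

Note that there is no way to express the "l" to "LL" expansion from the str example here; the table can only hold one output byte per input byte.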

I don’t really know how to “advertise” my work, but I’d certainly be interested in publishing such a utility. (I might want to support corresponding algorithms for other sequences, too.)

I absolutely agree that the existing implementations can’t be removed.