Summary: I would like to propose an extension to the .replace() method to allow multiple substring replacements in a string using a dictionary. Currently, .replace() accepts only two arguments (the value to be replaced and the replacement value), which results in the need for multiple calls to replace different characters or words. With this new functionality, it would be possible to perform all replacements in a single call, making the code more concise.
Examples
Current Situation: To replace multiple substrings in a string, we need to make multiple calls to .replace():
a = "hello"
a = a.replace("l", "X").replace("h", "X")
print(a) # Output: XeXXo
Proposed New Usage: I would like to suggest a new way to use .replace() that allows passing a dictionary, where the keys are the values to be replaced, and the values are the new values:
a = "hello"
a = a.replace({"l": "X", "h": "X"})
print(a) # Output: XeXXo
Alternative Suggestion
If modifying the .replace() method is not feasible, it could be considered to add a new method, such as .replace_multiple(), which would work specifically for this new feature of multiple replacements using a dictionary. This would avoid any conflict with the current usage of the .replace() method.
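As a sketch of the semantics being proposed (the name replace_multiple and the apply-in-dict-order behaviour are my assumptions, not a settled design):

```python
def replace_multiple(s: str, replacements: dict[str, str]) -> str:
    """Hypothetical helper: apply each replacement in dict insertion order."""
    for old, new in replacements.items():
        s = s.replace(old, new)
    return s

print(replace_multiple("hello", {"l": "X", "h": "X"}))  # XeXXo
```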
Conclusion
I believe this feature could enhance the developer experience when working with strings in Python, simplifying code and making it more efficient. I’m open to feedback and suggestions to refine this idea, especially regarding performance drawbacks, replacement conflicts, and other concerns!
If this were merely about making the code more concise, there’s not all that much benefit, but there’s another aspect of this that’s worth noting: enforced single replacement. Consider a “swap” operation, such as exchanging "ab" and "de" in "abcdefg" (Pike’s replace() accepts a mapping for exactly this).
Implemented with two separate replace() calls, the second call also rewrites the output of the first, so both patterns end up the same (either "decdefg" or "abcabfg", depending on the order of the replacements) instead of the intended "decabfg". Having a single dictionary to define the changes will allow this sort of thing to be done without reaching for a regular expression, with all the consequences that using regex entails.
Note that the status quo is to use either re.sub with an ad-hoc replacement function that maps the match to a replacement string, which is clunky to use and requires escaping special characters in the input, or str.translate, which supports only single-character translations.
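To make the contrast concrete, here is a sketch of the swap problem and the re.sub status-quo workaround; the strings and the swap mapping are purely illustrative:

```python
import re

s = "abcdefg"

# Two chained calls: the second replacement also rewrites the output
# of the first, so both patterns collapse into one.
chained = s.replace("ab", "de").replace("de", "ab")

# Status-quo workaround: re.sub with a replacement function performs
# all replacements in a single pass, after escaping each key.
mapping = {"ab": "de", "de": "ab"}
pattern = "|".join(map(re.escape, mapping))
swapped = re.sub(pattern, lambda m: mapping[m.group(0)], s)

print(chained)  # abcabfg
print(swapped)  # decabfg
```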
In theory I don’t love the impact on the method signature. The current signature is
str.replace(old: str, new: str, count: int = -1)
And this would make it something like
str.replace(
    old_or_dict: str | dict[str, str],
    new: str | None = None,  # or some other sentinel value
    count: int = -1,
)
Instead, there could be a new keyword-only argument for the translation dictionary, but I suppose this would still require defaults for old and new and some mutual-exclusion logic in case somebody passed in everything.
Another option is that this is a lot more like str.translate than str.replace, although translate has an idiosyncratic table input[1] and only allows single characters. Adding a new version with this interface might be nicer.
That’s true, but honestly it’s not writing the signature that is the issue. I think that signatures like this are harder to learn and keep track of, in general.
It depends on the nature of the function. If different signatures do completely different things, as with type and iter, then I agree with you that it is better to make them separate functions. But in cases like this one, and in functions like max, where the different signatures do practically the same thing and merely accept different types of input for convenience, a combined signature actually makes the function more intuitive to use and easier to keep track of.
By analogy with .format and .format_map, this could be called .replace_map.
Historically, this couldn’t be done because the order of application in the presence of overlapping patterns (or patterns that included later patterns in their output) would have been unpredictable when passing a built-in dict.
These days, dicts are insertion ordered, so the method can safely be defined as equivalent to:
modified = original
for k, v in replacements.items():
    modified = modified.replace(k, v)
I personally would’ve slightly preferred .format to take a mapping via a second signature, but a separate function/method is just fine at the end of the day.
Thanks for the historical insight. The order does matter when one key in the input is a substring of another. For example, with {'a': 'b', 'ab': 'c'}, 'abc' would become 'bbc', but with {'ab': 'c', 'a': 'b'} it would become 'cc'.
This code isn’t quite equivalent to what is proposed because it wouldn’t support swapping as @Rosuav mentioned.
With the iteration based definition, swapping is a three step operation:
1. target pattern → placeholder
2. source pattern → target pattern
3. placeholder → source pattern
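The three-step workaround might look like this (the placeholder choice is an assumption, and it must not occur in the input):

```python
s = "abcdefg"
placeholder = "\x00"  # assumed not to appear in the input

s = s.replace("de", placeholder)  # 1. target pattern → placeholder
s = s.replace("ab", "de")         # 2. source pattern → target pattern
s = s.replace(placeholder, "ab")  # 3. placeholder → source pattern

print(s)  # decabfg
```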
The alternative would be to define .replace_map in terms of .format:
pattern = original.replace("{", "{{").replace("}", "}}")
for idx, key in enumerate(replacements):
    escaped = key.replace("{", "{{").replace("}", "}}")
    pattern = pattern.replace(escaped, f"{{{idx}}}")
result = pattern.format(*replacements.values())
Either option would be a useful addition, but I agree the version that inherently supports swapping is more interesting.
There are different options to specify the behavior here, I don’t know which one is obviously best but it would have to be spelled out.
One is the “ordering of the mapping”, as you describe; another option is to order the keys in some way to pick the best match: maybe you want to sort the keys, or pick the longest match. Depending on the chosen behavior, "aab".replace({"a": "A", "ab": "B", "aa": "C"}) could yield "AAb", "AB", or "Cb".
The other alternative here is to allow choosing a match priority of leftmost or leftmost-longest in the target, rather than basing the order on the replacement map. Either of these options opens up more efficient implementations for the underlying search and replace, while still allowing ordered replacement (by calling the method multiple times when that is actually needed), making the general case predictable and faster.
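A leftmost-longest strategy could be sketched with re by trying longer keys first; the function name and details here are assumptions, not a proposed API:

```python
import re

def replace_longest(s: str, mapping: dict[str, str]) -> str:
    # Try longer keys first, so the longest match wins at each position
    # (regex alternation tries alternatives left to right).
    keys = sorted(mapping, key=len, reverse=True)
    pattern = "|".join(map(re.escape, keys))
    return re.sub(pattern, lambda m: mapping[m.group(0)], s)

print(replace_longest("aab", {"a": "A", "ab": "B", "aa": "C"}))  # Cb
```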
I like the idea of having a new method called replace_map(...) to implement this feature, as @ncoghlan proposed. In my humble opinion, using typing.overload would be nice if the replace(...) method were as simple as max(...); however, it is not. I don’t think people would think of passing a map as one of the arguments of replace.
replace_map would solve this clarity issue because we are explicitly declaring the type in the method signature, just like format_map does.
The advantage of using the iteration order is that it lets the caller explicitly control the priority order without having to come up with names for the different possibilities.
The downside is that the implementation might end up being slower, either because it always uses a fully general pattern or because it checks whether the replacement patterns happen to be ordered by length (ascending or descending).
A regex-based implementation could:

1. combine the escaped patterns into a regex “or” pattern
2. assemble a string list consisting of the string segments between matches, and the target strings for the matched patterns
3. join the results
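Those steps could be sketched as follows (sub_map is a hypothetical name for this illustration):

```python
import re

def sub_map(s: str, mapping: dict[str, str]) -> str:
    # 1. combine the escaped patterns into a regex "or" pattern
    pattern = re.compile("|".join(map(re.escape, mapping)))
    # 2. assemble the segments between matches plus the target strings
    parts = []
    pos = 0
    for m in pattern.finditer(s):
        parts.append(s[pos:m.start()])
        parts.append(mapping[m.group(0)])
        pos = m.end()
    parts.append(s[pos:])
    # 3. join the results
    return "".join(parts)

print(sub_map("abcdefg", {"ab": "de", "de": "ab"}))  # decabfg
```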
From re — Regular expression operations — Python 3.13.0 documentation, that would prioritise the patterns in iteration order due to the way | is defined. (Tangent: if we wanted to initially implement it that way, re.sub_map could be a decent spelling, and then the str.replace_map idea could be proposed later as a way of doing the same thing without the generalised re engine overhead)
Ordering a dict by key length is a bit messy, though:
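For illustration, one way the reordering might look (this is my assumption about the intended approach, using the earlier example mapping):

```python
replacements = {"a": "A", "ab": "B", "aa": "C"}

# Rebuild the dict sorted by descending key length; ties keep their
# original insertion order because sorted() is stable.
by_length = dict(
    sorted(replacements.items(), key=lambda kv: len(kv[0]), reverse=True)
)

print(list(by_length))  # ['ab', 'aa', 'a']
```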