Pre-PEP: str.ensureprefix str.ensuresuffix

PEP: Add str.ensureprefix and str.ensuresuffix Methods


Abstract

This PEP proposes adding two new methods to the str class: ensureprefix and ensuresuffix. These methods will ensure that a string starts or ends with a specified prefix or suffix, respectively. If the string already has the prefix or suffix, it will remain unchanged; otherwise, the prefix or suffix will be added.

Motivation

String manipulation is a common task in Python programming. While Python provides methods like str.startswith and str.endswith to check for prefixes and suffixes, there is no built-in way to ensure that a string has a specific prefix or suffix. Developers often write custom utility functions for this purpose, which can lead to repetitive and less readable code.

For example, ensuring a string starts with a specific tag or ends with a specific delimiter requires writing code like this:

tag = "data"
if not tag.startswith("_"):
    tag = "_" + tag

text = "Hello"
if not text.endswith("!"):
    text += "!"

With the proposed methods, this can be simplified to:

tag = "data".ensureprefix("_")
text = "Hello".ensuresuffix("!")

This improves code readability and reduces the likelihood of errors.

Complementing removeprefix and removesuffix

The existing removeprefix and removesuffix methods (introduced in PEP 616) allow developers to remove a prefix or suffix from a string if it exists. For example:

tag = "_data".removeprefix("_")  # "data"
text = "Hello!".removesuffix("!")  # "Hello"

The proposed ensureprefix and ensuresuffix methods provide the inverse functionality, ensuring that a string has a specific prefix or suffix. Together, these methods form a complete and symmetric set of tools for prefix and suffix manipulation:

  • removeprefix: Removes a prefix if it exists.
  • ensureprefix: Adds a prefix if it does not exist.
  • removesuffix: Removes a suffix if it exists.
  • ensuresuffix: Adds a suffix if it does not exist.

This symmetry makes the API more intuitive and consistent.

Rationale

The addition of str.ensureprefix and str.ensuresuffix aligns with Python’s philosophy of providing clear, concise, and expressive syntax for common tasks. These methods complement the existing str.removeprefix and str.removesuffix methods, completing the set of tools for prefix and suffix manipulation.

Use Cases

  1. String Formatting: Ensuring strings meet specific formatting requirements (e.g., adding a dollar sign to prices).
  2. Data Normalization: Standardizing strings in datasets (e.g., ensuring all entries start or end with a specific pattern).
  3. Tagging and Labeling: Ensuring tags or labels have a consistent format (e.g., adding a prefix to identifiers).
  4. Delimited Data: Ensuring strings end with a specific delimiter for concatenation or parsing.

Existing Workarounds

Currently, developers use utility functions or manual checks to achieve this functionality. For example:

def ensureprefix(s: str, prefix: str) -> str:
    return s if s.startswith(prefix) else prefix + s

def ensuresuffix(s: str, suffix: str) -> str:
    return s if s.endswith(suffix) else s + suffix

While these workarounds are functional, they are less convenient and less discoverable than built-in methods.

Specification

New Methods

Two new methods will be added to the str class:

  1. str.ensureprefix(prefix: str) -> str
     • If the string starts with prefix, return the string unchanged.
     • Otherwise, return prefix + self.
  2. str.ensuresuffix(suffix: str) -> str
     • If the string ends with suffix, return the string unchanged.
     • Otherwise, return self + suffix.

Examples

# Ensure prefix
assert "data".ensureprefix("_") == "_data"
assert "_data".ensureprefix("_") == "_data"

# Ensure suffix
assert "Hello".ensuresuffix("!") == "Hello!"
assert "Hello!".ensuresuffix("!") == "Hello!"

Edge Cases

  • If prefix or suffix is an empty string, the original string is returned unchanged.
  • If prefix or suffix is longer than the string, it cannot already be present, so the result is prefix + self or self + suffix respectively.

assert "hello".ensureprefix("") == "hello"
assert "hello".ensuresuffix("") == "hello"
assert "a".ensureprefix("longprefix") == "longprefixa"
assert "a".ensuresuffix("longsuffix") == "alongsuffix"

Backward Compatibility

This proposal introduces new methods to the str class and does not modify any existing behavior. As such, it is fully backward-compatible.

Rejected Alternatives

1. Using Utility Functions

  • While utility functions work, they are less discoverable and require additional imports or definitions.

2. Monkey Patching

  • Monkey patching the str class is discouraged due to potential conflicts and maintenance issues.

3. Subclassing str

  • Subclassing str to add these methods is possible but less convenient than having them directly available on all strings.
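A minimal sketch of why subclassing is inconvenient, assuming a hypothetical subclass named PrefixStr (the name and its single method are illustrative, not part of the proposal). Values must be wrapped explicitly, inherited methods hand back plain str, and string literals never gain the method:

```python
class PrefixStr(str):
    """Hypothetical str subclass used only for illustration."""

    def ensureprefix(self, prefix: str) -> "PrefixStr":
        # Same rule as the Specification: add the prefix only if it is missing.
        return self if self.startswith(prefix) else PrefixStr(prefix + self)


# Every value has to be wrapped before the method is available.
tag = PrefixStr("data").ensureprefix("_")

# Inherited str methods return plain str, so the method is lost again.
upper = tag.upper()

# Ordinary literals are plain str and do not have the method at all.
literal_has_method = hasattr("data", "ensureprefix")
```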

3 Likes

This seems unnecessary. I don’t think I’ve ever encountered a situation where simply checking and adding the prefix/suffix if necessary was insufficient. It’s only a couple of lines of code, after all. I’m not even sure I’d bother factoring it out into a utility function in most cases.

10 Likes

Been needing this occasionally.

Most common case is when I need to add "/" to signify that it is a directory.

1 Like

I occasionally want to ensure that a string ends with exactly one newline but then this proposal wouldn’t help in that case since it wouldn’t normalise foo\n\n to foo\n.
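To make the difference concrete, here is a rough sketch. ensuresuffix below is the utility function from the Existing Workarounds section standing in for the proposed method, and ensure_single_newline is a hypothetical helper name for the "exactly one newline" behaviour:

```python
def ensuresuffix(s: str, suffix: str) -> str:
    # Stand-in for the proposed method, following the Specification.
    return s if s.endswith(suffix) else s + suffix


def ensure_single_newline(text: str) -> str:
    # Hypothetical helper: strip every trailing "\n", then add exactly one back.
    return text.rstrip("\n") + "\n"


unchanged = ensuresuffix("foo\n\n", "\n")        # already ends with "\n", so nothing happens
normalised = ensure_single_newline("foo\n\n")    # collapses the run to one newline
```

Note that rstrip("\n") removes only newline characters here, so other trailing whitespace is kept.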

I think the number of random searching/splitting/merging/iterating/counting/normalising things you sometimes do with strings that you could justify adding a method for with almost exactly the same proposal as this one is too easily underestimated. If we took on every such proposal, the bloat to libpython3.so and the noisiness of the str.(...) namespace would be horrible.

7 Likes

I’ve actually needed this a couple times in a real world project. Note that this can already be spelled in a single line today with:

>>> "_" + "_data".removeprefix("_")
'_data'
>>> "_" + "data".removeprefix("_")
'_data'
>>> "data_".removesuffix("_") + "_"
'data_'
>>> "data".removesuffix("_") + "_"
'data_'

Although I don’t think this is important enough for stdlib.
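As a quick sanity check, the remove-then-add spelling appears to agree with the check-then-add utility functions from the pre-PEP on the edge cases it lists (empty and longer-than-string affixes). This is only a sketch over a handful of sample inputs, not a proof, and it needs Python 3.9+ for removeprefix and removesuffix. The *_via_remove names are made up for this example:

```python
def ensureprefix(s: str, prefix: str) -> str:
    return s if s.startswith(prefix) else prefix + s


def ensuresuffix(s: str, suffix: str) -> str:
    return s if s.endswith(suffix) else s + suffix


def ensureprefix_via_remove(s: str, prefix: str) -> str:
    # Drop the prefix if present, then always add it back.
    return prefix + s.removeprefix(prefix)


def ensuresuffix_via_remove(s: str, suffix: str) -> str:
    # Drop the suffix if present, then always add it back.
    return s.removesuffix(suffix) + suffix


samples = ["", "a", "data", "_data", "data_", "__data__"]
affixes = ["", "_", "__", "longaffix"]

# True only if both spellings agree for every sample/affix pair.
all_agree = all(
    ensureprefix(s, a) == ensureprefix_via_remove(s, a)
    and ensuresuffix(s, a) == ensuresuffix_via_remove(s, a)
    for s in samples
    for a in affixes
)
```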

4 Likes

Well, what if the path ends with "\" on Windows? What if it is a drive letter? “c:/” is not equivalent to “c:”.
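For the directory case specifically, one existing idiom that may sidestep these questions is joining with an empty final component, which the os.path.join documentation describes as ending the result in a separator. The sketch below uses posixpath and ntpath directly so that both rule sets can be tried on any platform; the example paths are made up:

```python
import ntpath     # Windows path rules, importable on any platform
import posixpath  # POSIX path rules, importable on any platform

# POSIX: a separator is added only when it is missing.
posix_dir = posixpath.join("projects/demo", "")         # "projects/demo/"
posix_dir_again = posixpath.join("projects/demo/", "")  # "projects/demo/"

# Windows: an existing trailing backslash is respected too.
windows_dir = ntpath.join("c:\\projects\\demo", "")
windows_dir_again = ntpath.join("c:\\projects\\demo\\", "")

# A bare drive letter is left alone, so "c:" does not silently become "c:\".
windows_drive = ntpath.join("c:", "")
```

In real code os.path.join picks the right rule set for the running platform.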

I only use unix, so have no idea about windows.

I do this in 2 cases:

  1. I often append / to a path string when it is a folder. E.g. I have pwd re-defined to do it as well.
  2. In configs I often use / at the end of dict key to signify that it is a sub-section.

But I have no strong opinion about this idea, just shared my use cases.

Github and stdlib use case analysis could be useful to see.

1 Like

I wonder if this could work such that

"3M".ensuresuffix("MiB") == "3MiB"

1 Like

This is not very generalizable since I don’t think you’d want to validate "3Mi" as "3MiB". Best to use a simple regex substitution for this use case:

import re

assert re.sub(r"(?<=\dM)$", "iB", "3M") == "3MiB"
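For comparison, a small sketch of what the two readings would give. ensuresuffix is the utility function from the pre-PEP standing in for the proposed method; merge_suffix is a hypothetical overlap-aware variant written only to illustrate the question above, and it shows the "3Mi" problem directly:

```python
def ensuresuffix(s: str, suffix: str) -> str:
    # Stand-in for the proposed method, following the Specification.
    return s if s.endswith(suffix) else s + suffix


def merge_suffix(s: str, suffix: str) -> str:
    # Hypothetical variant: reuse the longest tail of s that is also a
    # head of suffix, then append only the remainder of suffix.
    for k in range(min(len(s), len(suffix)), 0, -1):
        if s.endswith(suffix[:k]):
            return s + suffix[k:]
    return s + suffix


as_specified = ensuresuffix("3M", "MiB")   # "3M" does not end with "MiB", so it is appended whole
overlapping = merge_suffix("3M", "MiB")    # the shared "M" is reused
questionable = merge_suffix("3Mi", "MiB")  # "3Mi" is also completed to the same result
```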

A whole group of str enhancements which are useful only in some projects could be solved if it were allowed to somehow extend the str type with new methods. It is impossible or almost impossible. When the .removeprefix was introduced I wanted to add a compatible implementation to str if an older version of Python was detected. In programs supposed to run on all maintained versions I could not use .removeprefix for years. I would appreciate some way to hook into str.__getattr__.
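Since attributes cannot be assigned on the built-in str type, the usual fallback is a module-level function rather than a method. A minimal sketch, assuming a free function named removeprefix is an acceptable spelling in the calling code (the function and its placement are illustrative):

```python
import sys

if sys.version_info >= (3, 9):
    def removeprefix(s: str, prefix: str) -> str:
        # Delegate to the built-in method where it exists.
        return s.removeprefix(prefix)
else:
    def removeprefix(s: str, prefix: str) -> str:
        # Fallback with the same behaviour: slice the prefix off if present.
        return s[len(prefix):] if s.startswith(prefix) else s


stripped = removeprefix("_data", "_")
untouched = removeprefix("data", "_")
```

The call site becomes removeprefix(s, "_") instead of s.removeprefix("_"), which is exactly the inconvenience described above.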

1 Like