Deprecate str % format operator

Woah. The definition of “replace” is “to put something new in the place of”. Nice find :slight_smile:

Given:

There should be one-- and preferably only one --obvious way to do it.

I’m starting to believe there is moral high ground supporting some version of this proposal.

PEPs are not documentation though; they are historical documents, capturing the ideas and goals of their author(s) at the time of publication. So what you have shown is that, back in 2006, it was at least ONE person’s intention to have this replace percent formatting. We’re two decades on from that now; if percent formatting were supposed to be on the way out, there would be references to that in the actual documentation.

Without disagreeing, I think it’s fair to point out the official tutorial has been calling it “old string formatting” since at least Python 2.7:

Python isn’t about the “moral high ground”. There’s a reason “practicality beats purity” is part of the Zen. You still haven’t addressed the fact that this will break significant amounts of existing code.

If you met a developer faced with going through a huge legacy codebase removing all uses of the % string operator, how would you explain to them why they needed to do this?

Yes, but “old” doesn’t mean “deprecated”. It’s merely to distinguish them.

To add some context to all this: When the (then) new .format() templating was added, we deliberately did not deprecate the %-formatting.

Today, it’s better to point people to the new f-strings for many cases.

Even today, %-formatting and .format() templating still have their use cases, especially when it comes to splitting the definition of the template from its actual application. f-strings don’t support this kind of use case. Perhaps PEP 750 (Template Strings) will change this some day.
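To make the "define here, apply there" point concrete, here is a minimal sketch. The template names, the `render` helper, and the event data are invented for illustration; only the `%` operator, `str.format()`, and f-strings themselves are real.

```python
# Templates defined once, far away from the data that will fill them.
PERCENT_TEMPLATE = "%(user)s logged in from %(host)s"
FORMAT_TEMPLATE = "{user} logged in from {host}"


def render(event):
    """Apply both stored templates to one event dictionary."""
    return PERCENT_TEMPLATE % event, FORMAT_TEMPLATE.format(**event)


event = {"user": "ada", "host": "10.0.0.7"}
old_style, new_style = render(event)
print(old_style)
print(new_style)

# An f-string is evaluated where it is written, so the names must
# already be in scope at that exact point in the code.
user, host = event["user"], event["host"]
inline = f"{user} logged in from {host}"
print(inline)
```

The first two templates could just as well live in a config file or a translation catalogue; the f-string cannot, because it is an expression rather than a stored value.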

But might “old” mean that it’s OK to slightly reduce the performance of the old technique in order to enable new things?

I do see that removing the str % operator probably has a poor cost-reward ratio, but at the same time I can sympathize that, if you can already allow your class to act with y = my_class("y"); "x" / y and "x" + y it’d be frustrating to be unable to use "x" % y.
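For anyone who hasn't hit this, a small sketch of the asymmetry. `Signal` is a made-up class standing in for `my_class`; it only defines reflected operators.

```python
class Signal:
    """Hypothetical operand type that only implements reflected operators."""

    def __init__(self, name):
        self.name = name

    def __radd__(self, left):
        return ("add", left, self.name)

    def __rtruediv__(self, left):
        return ("div", left, self.name)

    def __rmod__(self, left):
        return ("mod", left, self.name)


y = Signal("y")

# str cannot add or divide with a Signal, so Python falls back to the
# reflected methods on the right-hand operand.
added = "x" + y
divided = "x" / y

# str.__mod__ treats any right operand as formatting arguments and raises
# instead of returning NotImplemented, so __rmod__ is never consulted.
try:
    modded = "x" % y
except TypeError as exc:
    modded = None
    mod_error = str(exc)

print(added, divided, mod_error)
```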

Suppose someone did go ahead and make a PR that solves the OP’s problem but makes str % str x% slower; what are the odds it gets accepted?

Or would you argue that the logic

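def str_mod(self, other):  # sketch of a changed str.__mod__; not real API
  try:
    return self.oldformat(other)  # oldformat: stand-in for today's %-formatting
  except TypeError as exc:
    # an except clause cannot match on the message, so check it by hand
    if "not all arguments converted during string formatting" not in str(exc):
      raise
    return other.__rmod__(self)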

is so inherently fragile that it shouldn’t be enabled?

I can see some nasty footguns there: "x%s" % y could have rather unexpected behaviour, and y=(); "x" % y could be an annoying silent bug.
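Those footguns are easy to reproduce today with a free function that emulates the fallback. This is only a sketch: `mod_with_fallback` and `Signal` are invented names, and matching on the exception message is exactly the fragile part being questioned.

```python
class Signal:
    """Hypothetical right-hand operand that wants to handle `str % Signal`."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        return f"<Signal {self.name}>"

    def __rmod__(self, left):
        return ("mod", left, self.name)


def mod_with_fallback(template, other):
    """Emulate the proposed logic: format first, then try other.__rmod__."""
    try:
        return template % other
    except TypeError as exc:
        # Fragile by design: the only signal available is the message text.
        if "not all arguments converted" not in str(exc):
            raise
        rmod = getattr(type(other), "__rmod__", None)
        if rmod is None:
            raise
        return rmod(other, template)


y = Signal("y")

dispatched = mod_with_fallback("x", y)    # no specifier: falls through to __rmod__
formatted = mod_with_fallback("x%s", y)   # a specifier is present: formatting wins
empty = mod_with_fallback("x", ())        # "x" % () succeeds, nothing is dispatched

print(dispatched, formatted, empty)
```

So whether `__rmod__` runs would depend on the contents of the left-hand string, which is hard to see at the call site.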

I don’t believe this PEP would have been accepted if it had proposed deprecating the % string formatting operator.

The PEP was discussed, reviewed, and approved without any rewording of the Abstract section. It was hardly just one person’s opinion; rather, it was a roadmap based on community consensus (at that time).

Perhaps we can now say that this attempt to replace old-style formatting failed. But it’s not so clear why…

There are, to some degree: “The formatting operations described here exhibit a variety of quirks that lead to a number of common errors (such as failing to display tuples and dictionaries correctly). Using the newer formatted string literals, the str.format() interface, or template strings may help avoid these errors.”
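The tuple quirk that the quoted passage alludes to looks like this in practice (a minimal sketch; the variable names are just for illustration):

```python
point = (3, 4)

# A bare tuple on the right is unpacked as the argument list, so one %s
# sees two arguments and formatting fails.
try:
    broken = "point=%s" % point
except TypeError as exc:
    broken = None
    error = str(exc)

# Wrapping the value in a one-element tuple is the usual workaround.
wrapped = "point=%s" % (point,)

# The newer interfaces have no such ambiguity.
literal = f"point={point}"
method = "point={}".format(point)

print(error)
print(wrapped, literal, method)
```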

Maybe it’s time we figured out how to get used to it and just accept that it’s probably here to stay:

I’d say no. It’s acceptable to reduce the performance of something in order to add useful new functionality, but that isn’t because it’s old; it’s because the functionality is useful. But you have to pitch it with sufficient benefit in usability; what do we gain in return for the loss of performance?

The ability to have the second operand process modulo with a string is not, IMO, sufficient to justify this. But that’s just my opinion. In general, you definitely can pitch something as “this becomes slower, that becomes better”, as long as the pitch carries enough weight.

I think I was channeling PEP 461, which added %-formatting to byte containers to ease python2/python3 support:

This area of programming is characterized by a mixture of binary data and ASCII compatible segments of text (aka ASCII-encoded text). Bringing back a restricted %-interpolation for bytes and bytearray will aid both in writing new wire format code, and in porting Python 2 wire format code.

E.g. a protocol might be formatted in terms of ASCII identifiers, where it’s most natural to think about it in terms of a string. Often you need to have lots of formatting codes to get the spacing correct and IMO it works better with %-style formatting to keep the formatting separate from the code. I don’t like visually parsing very complicated f-strings that combine code and formatting.
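As a rough illustration of what I mean, here is a made-up wire record (the field layout and the `encode_record` helper are invented, not any real protocol) using the bytes %-interpolation that PEP 461 added:

```python
# Layout: 8-byte left-justified command, 4-digit decimal length,
# 8-digit hex checksum, CRLF terminator. The spacing lives in one place.
RECORD = b"%-8s %04d %08x\r\n"


def encode_record(command, length, checksum):
    """Format one record; command must already be bytes."""
    return RECORD % (command, length, checksum)


line = encode_record(b"PUT", 12, 0xBEEF)
print(line)
```

The template reads like the protocol spec, and the code that supplies the values stays separate from it.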

Instead of explaining why the developer needs to make this change, I would instead start by offering him a superior option. As others on this thread have suggested, introduce a str.sprintf(*args, **kwargs) method. Since that method is more explicit, it doesn’t need to do as much type checking on the inputs as the str-% operator. There will be a small, but measurable performance improvement across all Python implementations. We might also be able to show some improvement for another domain such as static analysis. For "%s %s" % a, a lint tool needs to do more work figuring out whether a is an int, tuple of str, etc. Release a tool along with this new method to help users auto-convert their existing str-% usages to str.sprintf. It’s probably possible for a tool to auto-update at least 90% of these.
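To show what such a method might feel like, here is a sketch as a free function. `sprintf` does not exist on `str`; the name, the signature, and the rule about not mixing positional and keyword values are all assumptions on my part. It also just delegates to `%`, so it says nothing about the performance claim.

```python
def sprintf(template, *args, **kwargs):
    """Hypothetical explicit spelling of %-formatting.

    Positional values are always passed as a tuple and keyword values
    as a mapping, so the operand never has to be guessed.
    """
    if args and kwargs:
        raise TypeError("pass positional or keyword values, not both")
    if kwargs:
        return template % kwargs
    return template % args


pair = sprintf("%s %s", "a", "b")
single_tuple = sprintf("%s", (1, 2))   # the tuple is one value, not an argument list
named = sprintf("%(count)d items", count=3)
plain = sprintf("100%%")

print(pair, single_tuple, named, plain)
```

Because the arguments are always spelled out at the call site, a linter would be able to count them against the specifiers without inferring the type of a single right-hand operand.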

The next steps, including whether to deprecate the operator, would depend on additional research. Maybe we can find a path that avoids deprecation. Maybe not. Be prepared for any changes to take years, which is perfectly fine and good. I was using the term “moral” jokingly. Large systems are governed by economics.

I realize that my original problem statement (enabling "foo" % x usage) isn’t exactly compelling people to storm the proverbial gates. In defense of that use case, I think Python is an excellent language for creating metalanguages. The larger project I’m working on is a hardware description language (HDL) embedded in Python: a meta-HDL. I have found using strings as literals to be extremely convenient. For example, here is one module from a RISCV simulation: seqlogic/tests/riscv/core/data_mem_if.py at main · cjdrake/seqlogic · GitHub.

Here’s an example of using literals on the left side of the expression:

self.expr(
    bus_wr_be,
    Mux(
        GetItem(data_format, slice(0, 2)),  # might change this to data_format[0:2]
        x0=("4b0001" << byte_addr),
        x1=("4b0011" << byte_addr),
        x2=("4b1111" << byte_addr),
        x3="4b0000",
    ),
)

If you’ve written Verilog before, this will look familiar and easy to adopt.

Apologies for advertising my yak shaving exercise, but I want to point out that some novelties can lead to interesting outcomes if you’re willing to clear a path.

Is this interesting enough to ask the community to expend a couple million man-hours of energy? Probably not. But it’s fun to talk about, and you never know until you ask.

My takeaway from this thread is that it’s worth pursuing a str.sprintf method because the %-format syntax still has its uses and the method is a better API than the % operator.

Whether the %-operator should be (soft) deprecated is a separate topic, but having str.sprintf available as an alternative makes that route much more viable.

What’s the process for proposing a new str method?

I’d start a new thread here in “Ideas” to present the argument cleanly and disentangle the scope from “deprecate str % format”.

Probably a PEP, like PEP 616 (String methods to remove prefixes and suffixes), which would require a core dev to sponsor it.

Start a new thread proposing that Python add some way for you to tell it that your __rmod__ should take precedence over the built-in __mod__.

Perhaps a decorator?

class MyNumericType(...):
  ...
  ...
  @takes_precedence
  def __rmod__(self, other):
    ...

I think that’d stand a good chance of actually being implemented and solving your problem :slight_smile:

I’d actually like to see something like this myself, too, so I’d be in support of this proposal. (I’d post it myself if it wasn’t for feeling that I’d be stealing your thunder!)

(I am also one of those people who likes using the “str % format operator”, despite being aware of the alternatives, so I’m glad to see the broad consensus that getting rid of it simply isn’t going to happen! I’ve always personally found it the “least worst” way to format strings. I might change my view once t-strings are available and all Python versions that don’t have them are EOL… but that’s many years away.)

Not sure what you would expect the decorator to do, but if you want something to have a chance of being accepted, you’ll need to define the steps the interpreter takes when evaluating x % y. Given X, Y = type(x), type(y), lay out what will happen in sequence.
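As a starting point for that write-up, this is roughly what happens today, expressed as a simplified pure-Python emulation. `emulate_mod`, `Rhs`, and `MyStr` are invented names, and the real C-level slot logic has more detail than this; treat it as a sketch of the documented data-model rules, not a specification.

```python
def emulate_mod(x, y):
    """Simplified emulation of how `x % y` is dispatched today."""
    X, Y = type(x), type(y)
    forward = getattr(X, "__mod__", None)
    reflected = getattr(Y, "__rmod__", None) if Y is not X else None

    # Step 1: a proper subclass on the right that overrides __rmod__ goes first.
    if (reflected is not None and issubclass(Y, X)
            and reflected is not getattr(X, "__rmod__", None)):
        result = reflected(y, x)
        if result is not NotImplemented:
            return result
        reflected = None

    # Step 2: the left operand's __mod__. str.__mod__ raises TypeError here
    # instead of returning NotImplemented, which ends the process early.
    if forward is not None:
        result = forward(x, y)
        if result is not NotImplemented:
            return result

    # Step 3: the right operand's __rmod__, if step 2 declined.
    if reflected is not None:
        result = reflected(y, x)
        if result is not NotImplemented:
            return result

    raise TypeError(f"unsupported operand type(s) for %: {X.__name__!r} and {Y.__name__!r}")


class Rhs:
    def __rmod__(self, left):
        return ("rmod", left)


class MyStr(str):
    def __rmod__(self, left):
        return "subclass went first"


int_case = emulate_mod(5, Rhs())           # int.__mod__ declines, Rhs.__rmod__ runs
sub_case = emulate_mod("x%s", MyStr("y"))  # step 1 applies
try:
    str_case = emulate_mod("x", Rhs())     # step 2 raises, step 3 is never reached
except TypeError:
    str_case = "TypeError"

print(int_case, sub_case, str_case)
```

A proposal for something like `@takes_precedence` would need to say which of these steps changes, and what extra check every `%` evaluation pays for it.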

I think it’s pretty clear: “Add a complicated, but unlikely, jump condition to every occurrence of the % operator.” Clear -1 from me, no explanation needed.
