I think there are two separate topics here: 1) f-strings vs. explicit formatting requests, and 2) {}
-based vs. custom (%
-based and others) format specifications.
f-strings vs. explicit formatting
That’s called a string. You just put {}
placeholders in it.
No, it cannot do arbitrary calculations. F-strings support that to save some typing. When you are doing i18n in the real world, you do not want to do that kind of calculation on the fly. You want to look up some locale-specific strings and values substitute them into a template. (Sometimes the template will also have to be locale-specific.) If you don’t already have the values, then you just write the code to calculate them first, and store them with descriptive names, then do the actual interpolation. (Or if it’s simple enough, you stuff them into the arguments for the call to the formatter.)
In case you were not using Python before 3.6: this is done using the .format
method of the built-in string type.
>>> test = 1
>>> f'{test}'
'1'
>>> '{}'.format(test)
'1'
Yes; the format will be determined at runtime.
As a trivial example: some cultures prefer to write today’s date (at the time of day I’m writing this, it should be the same day in nearly all the populated places in the world) in the order 4-14-2023 and others as 14-4-2023. Even supposing that we don’t use the datetime
library at all, and just have three variables with those numbers in them, with the .format
method of strings we can write
date_format = '{month}-{day}-{year}' if month_first_locale() else '{day}-{month}-{year}'
And then later in the program do e.g.
today = date_format.format(day=14, month=4, year=2023)
And the date_format
string can be passed around the program like any other string, read from a file, etc. These things are impossible with f-strings. You cannot store the f-string in a file in a meaningful way, because there is nowhere to put the f
. If you used an f-string in the code that writes the file, then the current values are hard-coded into your saved data - you don’t have a reusable template.
So, f-strings don’t belong in the same category as all the other options. The common syntax shared by f-strings and the .format
method is in the syntax category, that is shared by the %
-based syntax used by the %
operator (and by the standard library logging
module, some SQL bindings, datetime.strptime
/.strftime
, etc. etc.) as well as other custom syntaxes.
The actual format-specification syntax
There is already some support for this. Please see the documentation, specifically the type
field:
>>> f'{100:f}'
'100.000000'
However, it’s often much less useful than one might like, and of course only a few types can be “privileged” in this way to an extent that would matter for the compiler. It wouldn’t work for library types like datetime.datetime
to try to “claim” a type specifier, because the parser wouldn’t be aware of them, much less the compiler.
Instead, the grammar allows the part after :
to have an arbitrary format, the “standard” in the doc notwithstanding, and this is eventually passed to the __format__
method of whatever is getting interpolated.
Now, my own thoughts, which are entirely about the second topic.
The {}
-based syntax is really nice for working with an overall string that needs to have multiple pieces of data formatted in. Many library authors seem to think of their types (like datetime.datetime
, whatever object represents an SQL query, etc.) as “single” pieces of data that might be formatted either separately or in a larger context. (The logging
module is the way it is only for historical reasons, I’m sure. After all, the data type there is just str
.)
So, they’ve invented (or emulated from an older source: e.g., an existing C library which either inspires the Python feature, or is implementing it under the hood while Python provides minimalist bindings) a variety of custom format specifications of their own. I’m generally not a big fan of these, like OP: they tend to be confusing (%m
vs %M
in datetime.datetime
formats is hard to remember, and recently I learned it is the other way around for Numpy!) and ugly ({}
is symmetrical and makes it clear what the bounds of the “placeholder” are; most %
syntaxes expect a single character, although you get weird compromises like Python’s old %(varname)s
) and redundant (in the syntaxes where %s
is a placeholder for a string, it will typically accept non-strings anyway, and the more-type-specific formatters might not do noticeably different things).
{}
syntaxes address these problems elegantly: it’s clear where the beginning and end are, it looks nice, and you only have to specify type conversions etc. when necessary (stuff like !r
to make Python use repr
instead of str
, or :f
to treat integers as float - a !s
is never required, so the common case is an empty string, rather than s
). Finally, it allows you to embed the custom syntaxes, as shown up-thread by @chepner.
But more importantly for contexts like datetime.datetime
, .format
already offers some limited destructuring:
>>> '{0[hello]}{0[hello]}'.format({'hello': 'world'})
'worldworld'
Similarly with attributes rather than dict keys.
I personally would like to see the standard library move more in this direction. While it is possible to write
>>> import datetime
>>> '{:%Y-%m-%d}'.format(datetime.datetime.now())
'2023-04-14'
I would on aesthetic grounds much rather follow the second example already:
>>> '{x.year:04}-{x.month:02}-{x.day:02}'.format(x=datetime.datetime.now())
'2023-04-14'
and I would like to be able to have a shorter way to do it. For example, if the str
class supported something like
>>> class Example(str):
... def format_attrs(self, obj):
... class _: # throwaway class to provide a method that uses `obj` from the closure
... def __getitem__(self, name):
... return getattr(obj, name)
... return self.format_map(_())
Then we could do (without the need to wrap in the subclass):
>>> Example('{year:4}-{month:02d}-{day:02}').format_attrs(datetime.datetime.now())
'2023-04-14'
(Similarly, it would be nice to have some mechanism to restrict, or supply separately, the environment from which f-strings draw names.)
I think this is a lot clearer (of course, it could also have shorter aliases for the property names, but definitely not the ambiguous m
). I don’t want to have to think about the type’s own formatting API, because the instance already has attributes which I already understand how to work with in the standard way. I don’t have to mentally correspond b
to months (for the name of a month - what on Earth??) And I don’t have to remember what the code is for, say, a numeric month value that isn’t zero-padded (trick question: there isn’t one).
If custom per-type formatters have any use here IMO, it’s for things like specifying a 4-digit vs 2-digit year, or a short vs. full month name, because that actually involves type-specific processing. Putting day/month/year in a specific order and putting literal hyphens or slashes between them, are boring, generic tasks and I don’t need or want the class’ help with them. I do want its help to know that e.g. MR rather than MA is the 2-letter abbreviation for March, or that Thursday as a single letter is R rather than T, at least, in the contexts where that’s true.
Of course, implementing that involves the datetime
module coming up with its own type to represent months, or days-of-the-week, which can stringize or format in various ways. Probably making use of enums, now that those exist. All of these seem like improvements to me. Just imagine (though of course the semantics could be defined differently; this is just what makes the most sense to me off the top of my head):
>>> april = datetime.datetime.now().month # now some enum type that implements `__format__`
>>> f'{april}' # raises a ValueError - ambiguous
>>> f'{april:s}' # April
>>> f'{april:3s}' # Apr
>>> f'{april:3S}' # APR
>>> f'{april:2S}' # AP ; but May would be MY, not MA
>>> f'{april:1S}' # raises a ValueError - no such valid abbreviation
>>> f'{april:d} # 4
>>> f'{april:02d}' # 04
(And this, of course, is why I specified the superfluous :4
for the year
in the previous example; my thinking is that a similar year
type could interpret :2
to take the last 2 digits - something that int
does not do.)