Support t-strings in gettext

ThiefMaster · August 3, 2025, 9:37pm

Wouldn’t you like to write your i18n code using gettext like this?

msg = _(t'Hello {name}')
msg = ngettext(t'{n} snake', t'{n} snakes', n)
flash(_(t'Category "{cat.title}" moved to "{target_cat.title}"'))

instead of this?

msg = _('Hello {name}').format(name=name)
msg = ngettext('{n} snake', '{n} snakes', n).format(n=n)
flash(_('Category "{}" moved to "{}"').format(cat.title, target_cat.title))

Because the latter isn’t super pretty to begin with and is prone to errors such as calling .format() inside the gettext call instead of outside, or using multiple positional placeholders (which some languages may need in a different order).
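To make the first pitfall concrete, here is a minimal sketch. It uses a made-up one-entry dict in place of a real message catalog, and the `_` function is only a stand-in that, like gettext, falls back to the msgid when no translation is found.

```python
# Hypothetical one-entry catalog standing in for a compiled .mo file.
CATALOG = {"Hello {name}": "Hallo {name}"}

def _(msgid):
    # Like gettext, fall back to the msgid itself when nothing matches.
    return CATALOG.get(msgid, msgid)

name = "Ana"

# .format() inside the call: the lookup key is already "Hello Ana",
# which is not in the catalog, so the string stays untranslated.
wrong = _("Hello {name}".format(name=name))

# .format() outside the call: the lookup uses the template, then fills it in.
right = _("Hello {name}").format(name=name)

print(wrong)  # Hello Ana
print(right)  # Hallo Ana
```

The same failure happens with an f-string, since it is also formatted before the lookup.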

Now to the tricky part: How to convert the expressions from the t-string interpolation to names that are suitable in format strings and translations.

My proposed way to solve this is to derive a format string field name from the expression, and reject anything that’s too complex or too dynamic. My current implementation covers simple cases such as plain variables, accessing attributes and items, and calling a method w/o arguments:
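As a rough illustration of that idea, the sketch below derives a placeholder name from an expression string with the `ast` module. The function name `field_name` and the underscore-joining rule are assumptions made for this example and are not taken from the linked PR, whose actual naming rules may differ.

```python
import ast

def field_name(expression):
    """Derive a placeholder name from a simple expression (hypothetical rules).

    Accepts plain names, attribute access, constant str/int subscripts and
    method calls without arguments; everything else is rejected.
    """
    node = ast.parse(expression, mode="eval").body
    parts = []
    while True:
        if (isinstance(node, ast.Call) and not node.args and not node.keywords
                and isinstance(node.func, ast.Attribute)):
            node = node.func  # method call without arguments: keep only the name
        elif isinstance(node, ast.Attribute):
            parts.append(node.attr)
            node = node.value
        elif (isinstance(node, ast.Subscript)
                and isinstance(node.slice, ast.Constant)
                and isinstance(node.slice.value, (str, int))):
            parts.append(str(node.slice.value))  # Python 3.9+ AST layout
            node = node.value
        elif isinstance(node, ast.Name):
            parts.append(node.id)
            break
        else:
            raise ValueError(f"expression too complex: {expression!r}")
    name = "_".join(reversed(parts))
    if not name.isidentifier():
        raise ValueError(f"cannot derive a field name from {expression!r}")
    return name

print(field_name("name"))             # name
print(field_name("cat.title"))        # cat_title
print(field_name("user['email']"))    # user_email
print(field_name("user.get_name()"))  # user_get_name
```

With a real t-string, the input would be the `expression` attribute of each interpolation. Something like `a + b` or `f(x)` raises `ValueError` here, which corresponds to rejecting anything too complex or too dynamic.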

github.com/python/cpython

Support t-strings in gettext

opened 07:03PM - 03 Aug 25 UTC

ThiefMaster

type-feature stdlib

# Feature or enhancement

### Proposal:

Currently i18n in Python usually looks …like this:

```py
msg = _('Hello {name}').format(name=name)
msg = ngettext('{n} snake', '{n} snakes', n).format(n=n)
```

This isn't super pretty and prone to errors such as calling `.format()` inside the gettext call instead of outside. It also leads to people being lazy and creating strings that may not work properly depending on the target language, because multiple positional placeholders are used:

```py
flash(_('Category "{}" moved to "{}"').format(cat.title, target_cat.title))
```

f-strings obviously do not work, because it'd be just like calling `.format()` inside gettext. But now that we have t-strings, this can be solved:

```py
msg = _(t'Hello {name}')
msg = ngettext(t'{n} snake', t'{n} snakes', n)
flash(_(t'Category "{cat.title}" moved to "{target_cat.title}"'))
```

(Maybe an interesting fact: A less powerful version of this exists in Jinja2 templates for a very long time - it lets you use plain variables inside `{% trans %}` and uses the variable name during extraction, but the value during translation.)

Using t-strings for i18n also provides a bunch of advantages, such as avoiding positional arguments and shorter line lengths.

The tricky part is of course how to convert the expressions from the t-string interpolation to names that are suitable in format strings and translations. Obviously nobody wants "real" code inside a gettext .po file.

My proposed way to solve this is to derive a format string field name from the expression, and reject anything that's too complex or too dynamic. My current implementation (I will open a PR shortly after opening this issue) covers simple cases such as plain variables, accessing attributes and items, and calling a method w/o arguments.

### Has this already been discussed elsewhere?

There is no discussion (yet) because I actually started hacking on this feature during EuroPython to see if it's feasible at all, and I only read the issue template here when I was about to open a PR. If you believe there should be a discussion about this in the forum, I'll be happy to open a thread.

### Linked PRs

* gh-137354

PS: Maybe an interesting fact: A less powerful version of this has existed in Jinja2 templates for a very long time - it lets you use plain variables inside {% trans %} and uses the variable name during extraction, but the value during translation.

barry-scott · August 3, 2025, 10:10pm

This example cannot be translated because the two fields need to be swapped in some language translations.

This uses the %(name)s style to allow for this and is supported by xgettext(?).

flash(_('Category "%(cat)s" moved to "%(title)s"') % {'cat': cat.title, 'title': target_cat.title})

ThiefMaster · August 3, 2025, 10:21pm

Yes, it was intended as a bad example that is avoided altogether when using t-strings: “or using positional multiple placeholders (and some languages may require a different order).”

The easiest way to do this correctly (w/o t-strings) would be to simply use named placeholders (or explicit positional placeholders ({0} and {1}), but those would easily confuse translators)…
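A small sketch of why named placeholders matter for word order. The German-looking translation below is made up for illustration, and the names `cat` and `target` are arbitrary.

```python
# Made-up translation that mentions the target before the category.
translated = 'Nach "{target}" verschoben: Kategorie "{cat}"'

# Named fields: the translator can reorder freely, values still land correctly.
named = translated.format(cat="News", target="Archive")

# Bare {} fields: values are filled in code order, so a reordered
# sentence silently swaps the two titles.
positional = 'Nach "{}" verschoben: Kategorie "{}"'.format("News", "Archive")

print(named)       # Nach "Archive" verschoben: Kategorie "News"
print(positional)  # Nach "News" verschoben: Kategorie "Archive"
```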

barry · August 3, 2025, 10:40pm

We explored the use of t-strings for i18n back when PEP 750 was under development. We determined it was not a good fit and was outside the scope of t-strings, both of which are totally fine. The right tool for the job, IMHO, is my library flufl.i18n, which was originally developed for GNU Mailman, is still used there, and is useful in any other context. It’s built on top of stdlib Template strings, which use $-syntax that is much more friendly to translators than %-strings [1]. GNU gettext supports Python directly, so that toolset is well designed for use in i18n’d Python code.

[1] and was developed exactly because translators understandably got %-strings wrong in translations
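For readers unfamiliar with the difference, here is a minimal sketch using only the stdlib `string.Template` class (not the flufl.i18n API). The translated strings are invented, and the %-string mistake shown (a dropped trailing `s`) is one plausible example of what a translator might get wrong.

```python
from string import Template

values = {"name": "Ana"}

# $-syntax: the placeholder is just "$name", with no type suffix to lose.
ok = Template("Hallo $name").substitute(values)
print(ok)  # Hallo Ana

# %-syntax: dropping the trailing "s" of "%(name)s" breaks at runtime.
try:
    "Hallo %(name)" % values
    broken = None
except ValueError as exc:
    broken = str(exc)
print(broken)
```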

ThiefMaster · August 6, 2025, 9:07am

> We determined it was not a good fit

What made it not a good fit? Simply the fact that more complex expressions would not work well in there?

> and was outside the scope of t-strings

Keeping it outside the scope of PEP 750 made perfect sense IMHO. Doesn’t sound like a good reason against adding it independently from this…

> The right tool for the job, IMHO is my library flufl.i18n

At least in the Python webapp world I have yet to see any major webapps that use anything but Babel… So realistically, pretty much everyone uses something else.

A quick search on GitHub also confirms this: 32 matches for flufl.i18n vs over 125k matches for babel in common Python dependency files

https://github.com/search?q=flufl.i18n+path%3Arequirements.txt&type=Code
https://github.com/search?q=flufl.i18n+path%3Asetup.py&type=Code
https://github.com/search?q=flufl.i18n+path%3Apyproject.toml&type=Code
https://github.com/search?q=babel+path%3Arequirements.txt&type=Code
https://github.com/search?q=babel+path%3Asetup.py&type=Code
https://github.com/search?q=babel+path%3Apyproject.toml&type=Code

> and was developed exactly because translators understandably got %-strings wrong in translations

Yep, they do. But regardless of the syntax, the main mistake I see is people translating placeholders…
In any case, applications like Transifex (and probably Weblate as well?) tend to highlight placeholders separately to make this less likely to happen.

And in fact, my proposed solution using t-strings actually avoids all the problems that come with more complex format placeholders, because only the name is part of the extracted strings - any format/conversion specs live only in the code, and thus translators never see those. So in terms of simplicity it’s literally $foo (w/ flufl.i18n / string.Template) vs {foo} (w/ t-strings). Both look equally easy to get right (or wrong) for translators… At least Babel also adds python-brace-format to the pot file metadata for these strings, so tools used by translators know what kind of placeholders to expect.
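A sketch of how that separation could work. `Field` is a hypothetical stand-in that mirrors the `value`, `expression`, `conversion` and `format_spec` attributes of `string.templatelib.Interpolation`, so the example runs without Python 3.14. The helper names `msgid` and `render` are assumptions, not part of the proposed implementation, and only plain variable names are handled.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Field:
    # Stand-in for string.templatelib.Interpolation (same attribute names).
    value: object
    expression: str
    conversion: Optional[str] = None
    format_spec: str = ""

def msgid(parts):
    # Translators only ever see "{name}"; specs and conversions are dropped.
    out = []
    for part in parts:
        if isinstance(part, str):
            out.append(part.replace("{", "{{").replace("}", "}}"))
        else:
            out.append("{" + part.expression + "}")
    return "".join(out)

def render(translated, parts):
    # Apply conversion and format spec in code, then fill the translation.
    converters = {"r": repr, "s": str, "a": ascii}
    values = {}
    for part in parts:
        if isinstance(part, Field):
            value = part.value
            if part.conversion:
                value = converters[part.conversion](value)
            values[part.expression] = format(value, part.format_spec)
    return translated.format_map(values)

# Roughly what t'Total for {name}: {amount:.2f}' would be split into.
parts = ["Total for ", Field("Ana", "name"), ": ", Field(3.14159, "amount", None, ".2f")]

print(msgid(parts))                                  # Total for {name}: {amount}
print(render("{amount} insgesamt für {name}", parts))  # 3.14 insgesamt für Ana
```

The `.2f` spec never reaches the extracted string, so a translator can reorder `{name}` and `{amount}` but cannot break the number formatting.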

mikeshardmind · August 6, 2025, 9:16am

You may want to compare what is possible with t-strings against what Fluent does to handle languages that don’t map 1:1 with English, and how it separates what developers need from what localizers need better than extracting strings from source code in any format. I would not recommend t-strings here, nor would I recommend that any large project go with a gettext-based solution for new localization (yes, don’t throw away the existing work if you have already managed to get your app localized).

ThiefMaster · August 6, 2025, 9:26am

Yeah, gettext is not as powerful as you sometimes need for perfect translations since you can’t take into account gender, etc. But realistically it’s still what nearly everyone uses… I don’t think it’s really in scope here?

FWIW I completely agree that t-strings make no sense for this type of translation, where you usually just use some identifier for a message, and have the message w/ all the details, conditions, etc. outside the Python code.

mikeshardmind · August 6, 2025, 9:38am

Okay, more directly: Even if there weren’t other issues [1] with t-strings for i18n that make it worse than gettext in its current form, I wouldn’t think it’s worth changing gettext to support this, partly because it would be ideal if we weren’t pushing users towards gettext-based solutions.

[1] Here’s a direct link to where it was explained in more detail from the discussion during PEP 750’s development about it not being a good fit: PEP 750: Tag Strings For Writing Domain-Specific Languages - #85 by barry

encukou · August 6, 2025, 11:29am

To me, this is what makes this proposal better for a PyPI package than stdlib. There is no single obvious answer, and in the stdlib we only have one chance to do it right. A package on PyPI, on the other hand, can be much bolder in picking something that works, change it later (with backwards-compatible improvements that don’t require a new Python version, and incompatible changes that won’t hold people back from upgrading all of Python), or even offer customization.