Transifex translations are currently broken

It appears that a large amount of foreign-language text (likely Ukrainian) has been mixed into the Japanese translations. For example, the ftplib translation contains strings that are neither Japanese nor English, such as the following.

Якщо параметр timeout дорівнює нулю, це викличе :class:ValueError, щоб запобігти створенню неблокуючого сокета. Параметр encoding було додано, а значення за замовчуванням змінено з Latin-1 на UTF-8 на :rfc:2640.

According to Transifex’s revision history, this change seems to have been submitted by python-doc-bot on August 4. Would it be possible to revert it?

See “Undo bad translation propagation across languages” (Issue #155, python-docs-translations/transifex-automations on GitHub). cc @rffontenelle

Hello,

We are aware of the issue. Note that this affects all TX translations at the moment. There is ongoing work to fix this.

Please do not translate in the meantime: the entire project has to be reset, so any new work will be lost in the revert.

@AA-Turner @Stanfromireland
Thank you for the info. Translation work is paused for now.

Thank you for your contributions.

Curious bystander here. What caused the situation? That issue doesn’t seem to say.

I don’t know all the details. My understanding is that the difficulties began at some point between 11 July and 16 July (about four weeks ago).

Around two-thirds of the documentation translations use the Transifex translation platform. To make synchronising the translations between minor versions (e.g. 3.14 → 3.13, 3.12, etc.) easier, a suite of tools has been developed. In general this makes things easier because source strings rarely change, so all the active branches can be updated at once.

The current understanding seems to be that one of the translation/pomerge caches wasn’t properly cleared at one point when the script was run, so the ‘translation memory’ for language A was updated with language B, C, or D’s strings.
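
To illustrate the failure mode described above, here is a minimal sketch. It is not pomerge’s or the automation scripts’ actual code: the function names and the sample strings are made up for illustration, and the real tools may store their cache quite differently. It only shows how a single cache that is never cleared between language runs lets one language overwrite another, and how keying the cache by language avoids that.

```python
from collections import defaultdict


def merge_with_shared_cache(runs):
    """Failure mode: one flat cache reused across language runs, never cleared."""
    cache = {}
    for _lang, entries in runs:
        cache.update(entries)  # later languages overwrite earlier ones
    return cache


def merge_with_per_language_cache(runs):
    """Safer variant: memory is keyed by language, so runs cannot mix."""
    caches = defaultdict(dict)
    for lang, entries in runs:
        caches[lang].update(entries)
    return dict(caches)


# Made-up sample data, not taken from the real projects.
runs = [
    ("ja", {"Hello": "こんにちは"}),
    ("uk", {"Hello": "Привіт"}),
]

shared = merge_with_shared_cache(runs)
separate = merge_with_per_language_cache(runs)

print(shared["Hello"])          # the Ukrainian string, whichever language asks
print(separate["ja"]["Hello"])  # the Japanese string
```

Clearing the cache before each language run would have the same effect as the per-language keying here; the second variant just makes the separation harder to forget.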

The problem we have is with the Transifex platform – I believe the scripts & automation tools have been fixed, but the ‘Open Source’ plan that we are on doesn’t allow us to manage this translation memory feature which is causing the problems. To try and resolve this, the team have recreated the projects from scratch, but this hasn’t (to my knowledge) solved the root cause.

There are alternative translation platforms, but they don’t yet meet our needs – importantly having multiple git repositories, with one for the source strings (python/cpython) and others for the languages (e.g. python/python-docs-fr).

I’ve probably missed bits, but I think the above overview is mostly correct.

A

Thanks, I was curious why there was a description of what to do but no (link to a) bug report. Maybe your community is small enough that that part was on Discord?

Anyway, what’s the Pro version cost that would address the issue? Maybe Python’s too big for a “free” plan (in scare quotes because you are paying the price).

“the scripts & automation tools have been fixed” makes me think though that maybe in a small part the scripts were to blame?

I am an international student with an email alert set up for translation-related keywords on this forum (I study translation for low-resource languages, especially ancient Chinese and East African languages, and I am learning about Transifex). As far as I remember, I once read a short discussion about the pomerge cache, though I can’t recall the exact source, in which the translation memory of one language was used by another language. I may be wrong, but I thought it might be relevant information.

I have a small question (if I may ask it here): can the pomerge cache (as opposed to the translation memory on the platform itself) be flushed automatically by a script before each run, or is it handled automatically, with default options that cannot be changed from the platform itself?

I am asking as a student who is still learning, so please bear with me if the question is inappropriate. Also, where could I experiment with this issue on my own machines? I would like to look into it to learn more about this kind of automated, multilingual translation support for code (apparently Transifex itself can only be used with business or open-source accounts, hence the question).

PS. I am a big fan of this forum, although mostly I just read the threads!

Me too. The above is all information I’ve pieced together; I don’t think there is/was a central tracker for the problems, sadly.

Low-mid five figures USD per annum, from the public ‘price slider’. I hope to speak to the Transifex staff soon though to get a better idea for us specifically. Anything on that sort of price point is out of budget, though (or rather I’d be shocked if it was in budget!).

I would agree; at the very least, the scripts were not resilient to unexpected environment/cache state.

A

Hello!

pomerge is a tool by Julien Palard; it is available on PyPI, and the source can be found somewhere on AFPy’s git.

Also, where could I experiment with this issue on my own machines? I would like to look into it to learn more about this kind of automated, multilingual translation support for code (apparently Transifex itself can only be used with business or open-source accounts, hence the question).

I don’t know, maybe you can try setting up a Transifex open source org. There is also Weblate, which may be easier to try.

Yes, that is one of several issues. We would also exceed the Weblate free/opensource plan limit by ~50 times or so if I remember correctly.

For context for others, we have ~80,000 ‘strings’ (1.6m words) per minor version per translation, for around 25 million strings in total[1] (there are 63 active languages on Transifex). Weblate’s highest pricing tier is 10 million, let alone their free tier!

A


  1. 500 million words
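
For anyone wanting to check the arithmetic, here is a rough sketch. The per-version figures, the language count and the 10 million tier come from the post above; the number of maintained minor versions (5) is not stated there and is only an assumption, chosen because it reproduces the ~25 million and ~500 million totals. The comparison also assumes Weblate’s tier counts the same kind of unit as the 25 million figure.

```python
STRINGS_PER_VERSION = 80_000     # from the post: ~80,000 strings
WORDS_PER_VERSION = 1_600_000    # from the post: ~1.6m words
ACTIVE_LANGUAGES = 63            # from the post
ASSUMED_VERSIONS = 5             # assumption: not stated in the post
WEBLATE_TOP_TIER = 10_000_000    # from the post

# Totals across every language and every assumed minor version.
total_strings = STRINGS_PER_VERSION * ACTIVE_LANGUAGES * ASSUMED_VERSIONS
total_words = WORDS_PER_VERSION * ACTIVE_LANGUAGES * ASSUMED_VERSIONS

# How many times over the top tier the project would be.
times_over_top_tier = total_strings / WEBLATE_TOP_TIER

print(f"{total_strings:,} strings")
print(f"{total_words:,} words")
print(f"{times_over_top_tier:.1f}x the top tier")
```

With these inputs the totals come to 25,200,000 strings and 504,000,000 words, which matches the rounded figures quoted above.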

For further information about Weblate, you can see this thread.

Thank you so much! I have actually been looking into the repo and watching the issue “Undo bad translation propagation across languages” (Issue #155, python-docs-translations/transifex-automations on GitHub) for the last few weeks. I am a big fan of these multilingual translation efforts. Again, thank you! (PS: I think I should try Weblate first. As I read through the thread you kindly shared, I find I share some of the feelings in that post, regarding accessibility and so on. Thank you very much for the link!)
