input() with wide characters (Japanese or Chinese) does not work properly in the Mac/Ubuntu terminal

This is similar to the Stack Overflow question “macos - python input() with wide characters (jp or cn) is not working properly on Mac terminal”.

I use this simple script to reproduce the problem. Save it in a .py file and run it; the problem does not occur in interactive mode:

a = input()
print(a)

Copy “一二三四五六七八九十”, paste it into the terminal, and then use backspace to delete it. You will see a mismatch between what the terminal shows and the actual input.

I tested it on my MacBook, an Ubuntu machine, and an AWS EC2 instance. On both macOS and Ubuntu, backspace stops deleting at “一二三四五”. On EC2 I can delete everything, but I need to press backspace twice to delete one character.

Both my terminal and Python use UTF-8.
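For anyone who wants to confirm the Python side of that claim, here is a minimal sketch that reports the encodings Python itself sees. The helper name report_encodings is made up for illustration; it only reads standard attributes of sys and locale and says nothing about how the terminal emulator is configured.

```python
import locale
import sys


def report_encodings() -> dict:
    """Collect the encodings Python uses for stdin, stdout and the locale."""
    return {
        # getattr guards against streams that have been replaced or closed
        "stdin": getattr(sys.stdin, "encoding", None),
        "stdout": getattr(sys.stdout, "encoding", None),
        # False means: do not call setlocale(), just report the current setting
        "locale": locale.getpreferredencoding(False),
    }


if __name__ == "__main__":
    for name, encoding in report_encodings().items():
        print(f"{name}: {encoding}")
```

If all three report UTF-8 and the problem persists, the mismatch is more likely in how the erase is drawn on screen than in how the bytes are decoded.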


I could not reproduce this on my Ubuntu-based Mint desktop.

Did you use interactive mode, or did you save the code in a .py file? I just updated the description.

Ah. Yes, I see it happening: the backspace simply doesn’t go back far enough. If I backspace ten times, 一二三四五 still shows on the echoed display, but the value of a is an empty string as expected. Reproduced both by copy-and-paste and by using my Japanese IME (fcitx+mozc).
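The arithmetic behind that observation can be sketched in Python. This is only a model, under the assumption that each backspace erases exactly one screen column while each of these characters occupies two; the function names are hypothetical, and the width rule is a rough approximation based on unicodedata.east_asian_width.

```python
import unicodedata


def display_width(text: str) -> int:
    """Approximate terminal columns: 2 for East Asian Wide/Fullwidth, else 1."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


def visible_after_backspaces(text: str, presses: int) -> str:
    """Model a display that erases one column per backspace (an assumption)."""
    remaining_cols = max(display_width(text) - presses, 0)
    shown = []
    used = 0
    for ch in text:
        width = display_width(ch)
        if used + width > remaining_cols:
            break
        shown.append(ch)
        used += width
    return "".join(shown)


text = "一二三四五六七八九十"
print(len(text))                           # 10 characters
print(display_width(text))                 # 20 columns
print(visible_after_backspaces(text, 10))  # 一二三四五
```

Ten backspaces empty the ten-character input buffer, but under this model they clear only ten of the twenty columns, which leaves the first five characters on screen.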

It is not a Python error but a terminal error.

English characters need one byte, but Chinese characters need two in some encodings.

Paste the output of ‘locale’ here.
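To make the byte counts concrete, here is a small sketch that encodes one ASCII letter and one Chinese character with two codecs. GBK is picked only as an example of a two-byte encoding for Chinese; the thread does not say which encoding was meant, and the helper name byte_length is made up.

```python
def byte_length(text: str, encoding: str) -> int:
    """Return how many bytes the text occupies in the given encoding."""
    return len(text.encode(encoding))


for ch in ("a", "一"):
    for encoding in ("utf-8", "gbk"):
        print(repr(ch), encoding, byte_length(ch, encoding))
```

In UTF-8 the Chinese character takes three bytes rather than two, so the byte count alone does not tell you how many columns the terminal draws.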

➜  ~ locale
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL=

Thank you!
This is my Mac:

locale
LANG=""
LC_COLLATE="C"
LC_CTYPE="UTF-8"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=

This is my ubuntu machine:

LANG=C.UTF-8
LANGUAGE=
LC_CTYPE=C.UTF-8
LC_NUMERIC="C.UTF-8"
LC_TIME="C.UTF-8"
LC_COLLATE="C.UTF-8"
LC_MONETARY="C.UTF-8"
LC_MESSAGES="C.UTF-8"
LC_PAPER="C.UTF-8"
LC_NAME="C.UTF-8"
LC_ADDRESS="C.UTF-8"
LC_TELEPHONE="C.UTF-8"
LC_MEASUREMENT="C.UTF-8"
LC_IDENTIFICATION="C.UTF-8"
LC_ALL=

Okay, I think that’s your problem. Switch to an actual locale instead of using the “C” locale, and then you’ll have real character support in your terminal. For example, I use en_AU.UTF-8 as my default locale, and I have a couple dozen other locales installed as well (mainly for testing - I don’t speak Arabic, but I have an Arabic locale installed so I can flip something into RTL mode to make sure it works). The terminal respects this and will handle characters much more correctly.

Here is my new locale:

locale
LANG="zh_CN.UTF-8"
LC_COLLATE="zh_CN.UTF-8"
LC_CTYPE="zh_CN.UTF-8"
LC_MESSAGES="zh_CN.UTF-8"
LC_MONETARY="zh_CN.UTF-8"
LC_NUMERIC="zh_CN.UTF-8"
LC_TIME="zh_CN.UTF-8"
LC_ALL="zh_CN.UTF-8"

But it still doesn’t work.

It is not that easy; maybe you should ask an ops person near you.
Your problem is an environment configuration problem.

If you are using a terminal on the Mac and connecting to Ubuntu via ssh, there are three configurations involved: the terminal config, the ssh config, and the Ubuntu server config. If any one of them is wrong, it can cause the problem you describe.