I’m working on a Python wrapper around a third-party command-line tool and need to exchange data with it via stdin/stdout. So I use subprocess.Popen() to start the process and then write()/readline() to send data and retrieve the result. Here is the simplified code:
import subprocess
command = ['/path/to/executable', 'arg1', 'arg2', 'arg3']
instance = subprocess.Popen(command,
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.DEVNULL,
                            universal_newlines=True)
# pass data and command to the tool
instance.stdin.write('/path/to/data\n-command\n')
instance.stdin.flush()
# get results
instance.stdout.readlines()
Unfortunately, this does not work on Windows (Python 3.7.0 and Python 3.8, cmd.exe) if the file path contains non-ASCII characters. The error is:
UnicodeEncodeError: 'charmap' codec can't encode characters in position 53-59: character maps to <undefined>
As I understand it, this is because I’m using universal_newlines=True, so Python encodes the text written to the process's stdin (and decodes its output) using locale.getpreferredencoding(False), which is not UTF-8 on Windows. When the file path contains non-ASCII characters that the current system encoding cannot represent (e.g. Cyrillic on a German system), the error is raised.
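To illustrate the failure in isolation, here is a minimal sketch that reproduces the same kind of error without starting a process. The path is hypothetical, and cp1252 is assumed as the ANSI code page of a Western European (e.g. German) Windows system:

```python
import locale

# Hypothetical path containing Cyrillic characters, for illustration only.
path = 'C:/data/файл.txt'

# Assumption: cp1252 stands in for the system encoding; it has no Cyrillic letters.
try:
    path.encode('cp1252')
    encodable = True
except UnicodeEncodeError as exc:
    encodable = False
    print(exc)

# UTF-8 can represent every Unicode code point, so this encode succeeds.
utf8_bytes = path.encode('utf-8')

# The encoding Python picks for text-mode pipes when none is given explicitly.
print(locale.getpreferredencoding(False))
```

The same encode step happens inside instance.stdin.write() when the pipe is opened in text mode, which would explain why the exception is a UnicodeEncodeError rather than a decode error.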
So I tried removing universal_newlines=True and encoding/decoding the process input and output manually, but I cannot find a cross-platform way to determine which encoding to use. I would be grateful if you could point me in the right direction or provide a small example.
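For reference, the manual approach I have in mind looks roughly like the sketch below. It assumes the tool reads and writes UTF-8, which is exactly the part I cannot verify; TOOL_ENCODING is a name I made up, and a child Python process that echoes stdin stands in for the real executable:

```python
import subprocess
import sys

# Assumption: the tool expects UTF-8 on stdin and emits UTF-8 on stdout.
TOOL_ENCODING = 'utf-8'

# Stand-in for the real executable: copies stdin to stdout as raw bytes.
command = [sys.executable, '-c',
           'import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())']

# No universal_newlines here, so the pipes carry bytes and Python does no implicit encoding.
instance = subprocess.Popen(command,
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.DEVNULL)

# Encode the input explicitly, send it, close stdin, and read all output.
payload = '/path/to/данные\n-command\n'
out_bytes, _ = instance.communicate(payload.encode(TOOL_ENCODING))

# Decode the output with the same explicitly chosen encoding.
result = out_bytes.decode(TOOL_ENCODING)
```

Popen also accepts encoding and errors arguments (Python 3.6+), so passing encoding='utf-8' would keep the pipes in text mode with a fixed codec. Either way, the open question remains which encoding the tool itself uses on each platform.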