Why doesn't sys.stdout.flush() call os.fsync?

I’ve been gradually trying to improve Pyodide’s IO and I’ve been confused for a long time about this point. I generally would expect that sys.stdout.flush() calls os.fsync() but it does not. Pyodide implements a buffered output device that does not write out any data until fsync(1) is called or a newline is written. This output device is closely based on an output device in Emscripten.

I have a few questions:

  1. Is there a good reason for the decision that sys.stdout.flush() does not call os.fsync()? It seems to have been this way for a long time. Is there a place where I can read about the rationale for this? Or if not, can someone explain it to me?
  2. Is an output device that buffers output until fsync is called weird, badly behaved, or unusual?
  3. Is it a common pattern for Python code to call sys.stdout.flush() and then os.fsync(sys.stdout)? Do people often mean to use the first but not the second?

Obviously for backwards compatibility reasons it’s way too late to change the Python design here. I’m just trying to understand whether Python is being weird here or if my output device is messed up.

See: Batched stdout handler doesn't get called when stdout is flushed · Issue #4139 · pyodide/pyodide · GitHub
cc @pitrou

The purpose of flush is to force the data out of Python’s buffers into the OS.

The purpose of fsync is to force the OS buffers out to the storage medium (the disk).

Calling fsync has a high performance cost, which would be a big problem if flush invoked it.

It is rare that fsync is required by an application.
It would be wrong to force its use when flush is called.
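
Here is a minimal sketch of the two layers, assuming CPython's default buffering on a regular file. The helper name sizes_around_flush is made up for illustration. It writes a few bytes, then checks the file size the OS reports before flush, after flush, and after fsync.

```python
import os
import tempfile


def sizes_around_flush(data="hello"):
    """Return the file size the OS reports before flush, after flush, and after fsync.

    Hypothetical helper for illustration; assumes default buffering on a
    regular file and data small enough to fit in Python's buffer.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w") as f:
            f.write(data)
            before = os.path.getsize(path)       # data is still in Python's buffer
            f.flush()                            # Python buffer -> OS
            after_flush = os.path.getsize(path)  # the OS now has the data
            os.fsync(f.fileno())                 # OS cache -> storage device
            after_fsync = os.path.getsize(path)  # unchanged; only durability differs
        return before, after_flush, after_fsync
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(sizes_around_flush())
```

After flush, other processes can already read the data through the OS. fsync only adds the guarantee that the data has reached the device, which matters for surviving a crash or power loss rather than for visibility.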


Okay, thanks for the explanation!