Bug with multi-threaded print/write causing dropped output, output of uninitialized memory, assertion failures

Hi, I’d like to request a maintainer please take a look at this github issue:

There are race conditions in Modules/_io/textio.c (_textiowrapper_writeflush and _io_TextIOWrapper_write_impl specifically) that causes some output to be dropped and replaced with uninitialized memory, and then trigger an assertion failure on debug builds.

I’ve provided a test case that reliably provokes the issue for me, and a patch that mitigates the issue somewhat (enough that the triggering use case we saw in-house is now stable), but does not fully fix the issue. Having someone who’s more familiar with cpython internals devise a complete solution would be appreciated.

Thanks!

Is it relating to this?

That seems to be addressing a similar issue (same lines of code are being edited), but provoked by a more deterministic approach. I don’t think that it addresses the race condition.

I’ll respond in more detail on that PR. Thanks!

Perhaps it’s time to introduce a mutex to protect the TextIOWrapper object’s internal structures? I don’t think trying to make the algorithm race-free otherwise is a robust approach.

Also, I’ll note that “output of uninitialized memory” is a potential security issue (“uninitialized” memory could be previously-allocated memory that contained secrets).

cc @gpshead for visibility.

3 Likes

It sounds right to me that a mutex is the way to fully solve the problem, but don’t trust my C skills to do it properly.

I have a feeling that changing the internal buffer implementation to some sort of persistent FIFO queue or ring buffer might be a better choice than repeatedly creating and then deallocating a list in terms of minimizing how much of the operation needs to be covered by a mutex. Not sure.