Multithreading abuse for a deeper understanding | Self Solved

Hi,
So I tried to create a race condition in several ways, to better understand how race conditions behave in Python.
To my understanding, the GIL prevents Python code from running in parallel, but it does not make concurrent access to the same resource safe.

I tried to create a program that yields a different value on each run using ThreadPoolExecutor, but it yields the same value every time…

The code is a very basic example: having a global variable that multiple threads change.
Here: p46 - Pastebin.com

Am I missing something here?
Thanks !

Solved by using a large number of threads and increments.
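
A minimal sketch of that approach, for anyone who wants to try it. The original code is only on Pastebin, so the names here (`bump`, `run`) and the numbers (8 workers, 100,000 increments each) are assumptions chosen for illustration, not the original program. Whether and how often updates are lost depends on the interpreter version and on timing, so the printed value may or may not match the expected total on a given run.

```python
from concurrent.futures import ThreadPoolExecutor

counter = 0  # shared global that every worker modifies


def bump(times):
    """Increment the shared counter `times` times without any locking."""
    global counter
    for _ in range(times):
        counter += 1  # unsynchronised read-modify-write


def run(workers=8, times=100_000):
    """Run `workers` tasks and return (actual, expected) counter values."""
    global counter
    counter = 0
    # Leaving the with-block waits for all submitted tasks to finish.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(workers):
            pool.submit(bump, times)
    return counter, workers * times


actual, expected = run()
print(actual, "of", expected)
```

A lost update can only make the final value smaller than the expected total, never larger.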

So I tried to create a race condition in several ways, to better
understand how race conditions behave in Python.
To my understanding, the GIL prevents Python code from running in parallel, but it does not make concurrent access to the same resource safe.

Correct. It prevents Python bytecode instructions from running in
parallel, meaning that the internals can do lock-free access to the C
level stuff, knowing that nobody else is doing that.
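
To see why the GIL alone does not protect a shared counter, it can help to look at the bytecode. This is a small sketch using the standard `dis` module; the function name `increment` is made up for the demo. The exact opcodes differ between CPython versions, but the increment is compiled to a separate load and store of the global, and another thread may be scheduled between them.

```python
import dis

counter = 0


def increment():
    global counter
    counter = counter + 1


# Collect the opcode names that make up the function body.
opnames = [ins.opname for ins in dis.get_instructions(increment)]
print(opnames)
```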

I tried to create a program that yields a different value on each run using ThreadPoolExecutor, but it yields the same value every time…
The code is a very basic example: having a global variable that
multiple threads change.
Here: p46 - Pastebin.com

Please paste a minimal version inline in your message; I’m on email
and had to go off to pastebin to look at things - plenty of people will
be unwilling to visit an unknown URL, and here in satellite internet
land visiting pastebin is Very Slow.

If you stripped out all the logging (just use print() for a demo like
this) and dropped most of the whitespace it would fit nicely.

I would guess that your:

counter = counter + 1

line is such a tiny fraction of the overall activity (logging etc etc)
that its likelihood of overlapping with another thread is very low, low
enough that you do not see it occur with the small number of threads
you’re using.

You could do better like this:

counter0 = counter
... do some stuff here, _including_ the logging ...
counter = counter0 + 1

This separates the read of counter from the update by more things,
allowing a much greater window for two threads to overlap this critical
action. Once you have observed the effects of the race condition (wrong
final counter value), you can observe a fix:

from threading import Lock
lock = Lock()
.... in the thread body ...
    with lock:
        counter0 = counter
        ... do some stuff here, _including_ the logging ...
        counter = counter0 + 1

This may run slower (assuming the logging I/O gets to parallelise, which
it may not), but would demonstrate the purpose of locking around a
critical piece of code.
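
Here is a runnable sketch of both variants above, side by side. The `time.sleep(0.01)` call is an assumption standing in for "some stuff, including the logging", and the helper names (`racy_worker`, `locked_worker`, `run`) are made up for the demo.

```python
import threading
import time

counter = 0
lock = threading.Lock()


def racy_worker():
    """Read, do slow work, then write: a wide window for lost updates."""
    global counter
    counter0 = counter
    time.sleep(0.01)  # stand-in for logging and other work
    counter = counter0 + 1


def locked_worker():
    """Same steps, but the whole read-work-write sequence holds the lock."""
    global counter
    with lock:
        counter0 = counter
        time.sleep(0.01)  # stand-in for logging and other work
        counter = counter0 + 1


def run(worker, n_threads=8):
    """Reset the counter, run n_threads copies of worker, return the result."""
    global counter
    counter = 0
    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


print("without lock:", run(racy_worker))
print("with lock:   ", run(locked_worker))
```

With the lock the result should always equal the number of threads. Without it the result is usually much smaller, because most threads read the counter before any of them has written it back.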

In real applications one tries to minimise the time such a lock is held
to maximise the available parallelism, eg reshuffle:

    with lock:
        counter0 = counter
        counter = counter0 + 1
    ... do some stuff here, _including_ the logging ...

because the “some stuff” is not critical racy code.
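
A runnable sketch of that reshuffled form, again with `time.sleep(0.01)` as an assumed stand-in for the non-critical work and with made-up helper names. Only the counter update holds the lock, so the slow part of each thread can overlap with the others.

```python
import threading
import time

counter = 0
lock = threading.Lock()


def worker():
    global counter
    with lock:
        # Critical section: only the read-modify-write is protected.
        counter0 = counter
        counter = counter0 + 1
    time.sleep(0.01)  # non-critical work, done after releasing the lock


def run(n_threads=8):
    """Run n_threads workers; return (final counter, elapsed seconds)."""
    global counter
    counter = 0
    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter, time.perf_counter() - start


total, elapsed = run()
print("counter:", total, "elapsed: %.3fs" % elapsed)
```

Comparing the elapsed time with a version that sleeps inside the `with lock:` block gives a rough feel for how much parallelism the longer critical section costs.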

Cheers,
Cameron Simpson cs@cskk.id.au

It is very easy to demonstrate threading non-determinism in Python 2.
Concurrency bugs such as deadlocks are trickier to demonstrate. (If it
was easy to set up a deadlock, it would not be as hard to find and debug
them.) But we can get some interesting non-deterministic results using
this demonstration script in Python 2:

# Python 2
import threading
from time import sleep, time, ctime
from random import random

def do_work(c):
    print 'starting loop', c, 'at', ctime(time())
    sleep(random())  # pretend to do some real work

def main():
    threads = []
    # Create some threads.
    for c in 'ABCDEFGH':
        t = threading.Thread(target=do_work, args=(c,))
        threads.append(t)
    # Start the threads.
    for t in threads:
        t.start()
    # Wait for all the threads to be done.
    for t in threads:
        t.join()

main()

When I run that, I get output like this:

[steve ~]$ python2.7 thread_print.py 
starting loop A at Sun Nov 14 13:03:18 2021
starting loop B at Sun Nov 14 13:03:18 2021
starting loop C at Sun Nov 14 13:03:18 2021
starting loop starting loop E atD  atSun Nov 14 13:03:18 2021 Sun Nov 14 
13:03:18 2021
starting loop 
F at Sun Nov 14 13:03:18 2021
 starting loop G starting loop Hat Sun Nov 14 13:03:18 2021
 at Sun Nov 14 13:03:18 2021

But this is because print in Python 2 is not thread safe. If I run it
under Python 3 (after fixing the calls) no matter how many times I try,
the print output is never interleaved. Each print output occurs on its
own line.

This is not the same as a deadlock, because we don’t have any threads
waiting on the result of another thread. But it does show how threading
is non-deterministic, and the output of one thread can get intermixed
with the output of another thread.
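
For reference, a sketch of what the script might look like after "fixing the calls" for Python 3. The `max_delay` parameter and the return value of `main` are additions for convenience and are not in the original script. Output that is not interleaved is what was observed above, not something I know to be a documented guarantee of `print`.

```python
# Python 3
import threading
from random import random
from time import ctime, sleep, time


def do_work(c, max_delay=1.0):
    print('starting loop', c, 'at', ctime(time()))
    sleep(random() * max_delay)  # pretend to do some real work


def main(labels='ABCDEFGH', max_delay=1.0):
    # Create some threads.
    threads = [threading.Thread(target=do_work, args=(c, max_delay))
               for c in labels]
    # Start the threads.
    for t in threads:
        t.start()
    # Wait for all the threads to be done.
    for t in threads:
        t.join()
    return len(threads)


if __name__ == '__main__':
    main()
```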