So I tried to create a race condition in multiple ways, to better
understand how they behave in Python.
To my understanding, the GIL prevents concurrent Python code from running in parallel, but the GIL does not secure concurrent access to the same resource.
Correct. It prevents Python bytecode instructions from running in
parallel, meaning that the internals can do lock-free access to the C
level stuff knowing that nobody else is doing that.
I tried to create a program which yields a different value on each run using ThreadPoolExecutor, but it yields the same value each time…
The code is a very basic example: having a global variable that
multiple threads change.
Here: p46 - Pastebin.com
Please paste a minimal version inline in your message; I’m on email
and had to go off to pastebin to look at things - plenty of people will
be unwilling to visit an unknown URL, and here in satellite internet
land visiting pastebin is Very Slow.
If you stripped out all the logging (just use print() for a demo like
this) and dropped most of the whitespace it would fit nicely.
I would guess that your:
    count = counter + 1
line is such a tiny fraction of the overall activity (logging etc etc)
that its likelihood of overlapping with another thread is very low, low
enough that you do not see it occur with the small number of threads
you’re using.
You could do better like this:
    counter0 = counter
    ... do some stuff here, _including_ the logging ...
    counter = counter0 + 1
This separates the read of counter from the update by more things,
allowing a much greater window for two threads to overlap this critical
action. Once you have observed the effects of the race condition (wrong
final counter value), you can observe a fix:
    from threading import Lock
    lock = Lock()
    .... in the thread body ...
    with lock:
        counter0 = counter
        ... do some stuff here, _including_ the logging ...
        counter = counter0 + 1
This may run slower (assuming the logging I/O gets to parallelise, which
it may not), but would demonstrate the purpose of locking around a
critical piece of code.
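As a rough sketch of both variants above (not your actual program): the names run_threads, run_racy and run_locked are made up for illustration, and the counter is a closure variable rather than a global just to keep the example self-contained. In the racy version I have replaced "some stuff" with a threading.Barrier, which is an assumption chosen purely to force every thread to read before any thread writes, so the lost updates show up on every run instead of only occasionally. In the locked version a short sleep stands in for the logging.

```python
import threading
import time

NUM_THREADS = 8


def run_threads(worker):
    """Start NUM_THREADS threads running worker() and wait for them all."""
    threads = [threading.Thread(target=worker) for _ in range(NUM_THREADS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def run_racy():
    """Unsynchronised read-modify-write with a forced overlap."""
    counter = 0
    # Stand-in for "do some stuff here": nobody gets past the barrier
    # until every thread has already read the counter.
    barrier = threading.Barrier(NUM_THREADS)

    def worker():
        nonlocal counter
        counter0 = counter          # read (every thread sees 0)
        barrier.wait(timeout=10)    # widen the window between read and write
        counter = counter0 + 1      # write back stale value + 1

    run_threads(worker)
    return counter


def run_locked():
    """Same read-modify-write, but the whole sequence holds a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        with lock:
            counter0 = counter
            time.sleep(0.001)       # stand-in for the logging etc
            counter = counter0 + 1

    run_threads(worker)
    return counter


print("racy:  ", run_racy())    # 1: seven of the eight updates were lost
print("locked:", run_locked())  # 8: every update survived
```

The barrier is only a demonstration device. With real work between the read and the write, the racy version would give a wrong answer on some runs and the right one on others, which is what makes these bugs hard to spot.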
In real applications one tries to minimise the time such a lock is held
to maximise the available parallelism, eg reshuffle:
    with lock:
        counter0 = counter
        counter = counter0 + 1
    ... do some stuff here, _including_ the logging ...
because the “some stuff” is not critical racy code.
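A minimal sketch of that reshuffle, again with invented names (run_short_lock, work_seconds) and a sleep standing in for the non-critical work. Only the read and the update sit inside the lock; the sleep happens outside it, so the threads can overlap there.

```python
import threading
import time


def run_short_lock(num_threads=8, work_seconds=0.05):
    """Increment under a briefly held lock, then do the slow work unlocked."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        with lock:
            # Critical section: just the read and the update.
            counter0 = counter
            counter = counter0 + 1
        # Not racy, so no lock needed; stand-in for the logging etc.
        time.sleep(work_seconds)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return counter, elapsed


final, elapsed = run_short_lock()
print("final counter:", final)
print("elapsed seconds:", round(elapsed, 3))
```

The counter still comes out correct, and because the sleeps are not serialised by the lock I would expect the elapsed time to be close to one work_seconds rather than num_threads times that, though the exact figure depends on the machine.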
Cheers,
Cameron Simpson cs@cskk.id.au