I recently read something related to asyncio, where the term "thread safe" comes up everywhere, and that
led me to wonder what atomic and thread safe really mean in Python.
From my shallow understanding, an operation is atomic if it takes a single instruction to complete. Here the interpreter is what executes instructions in Python, and the instructions are opcodes like LOAD_NAME, CALL_FUNCTION, etc.
Atomic operations are one way to achieve thread safety.
Thread safety is about exclusive control: while I am accessing or modifying something, I have a guarantee that nobody else can meddle with it; they have to wait.
The GIL provides thread safety to some extent because only one thread can run at a time, but thread switches can still happen between opcodes.
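To see where a thread switch could land, it helps to list the opcodes a statement compiles to. This is a minimal sketch, assuming CPython; `opnames` is a hypothetical helper name, and the exact opcode names depend on the interpreter version (for example, LOAD_METHOD, CALL_METHOD and BINARY_ADD appear in older releases and were renamed or merged in newer ones).

```python
import dis

def opnames(source):
    """Return the opcode names CPython compiles `source` into (hypothetical helper)."""
    return [ins.opname for ins in dis.get_instructions(source)]

# Opcode names differ between CPython versions, so no expected output is shown.
for stmt in ("L.append(x)", "x = x + 1", "x = L[i]"):
    print(stmt, "->", opnames(stmt))
```

Every boundary between two names in the printed list is a point where another thread could, in principle, be scheduled.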
I also found this link, and I felt kind of confused that x = L[i] is considered thread safe.
Let’s take a look at the opcodes for L.append(x):
In [72]: dis.dis("L.append(x)")
  1           0 LOAD_NAME                0 (L)
              2 LOAD_METHOD              1 (append)
              4 LOAD_NAME                2 (x)
              6 CALL_METHOD              1
              8 RETURN_VALUE
append takes one instruction to complete, so even when multiple threads call append, only one thread can hold the GIL and actually execute append, which is the CALL_METHOD instruction.
Or maybe one thread is calling L.append and another one is calling L.pop(); pop takes one instruction to complete as well.
In [73]: dis.dis("L.pop()")
  1           0 LOAD_NAME                0 (L)
              2 LOAD_METHOD              1 (pop)
              4 CALL_METHOD              0
              6 RETURN_VALUE
Even if the two threads switch around LOAD_METHOD, say thread 1 pauses at LOAD_METHOD, thread 2 runs up to LOAD_METHOD, and then we go back to thread 1, there’s still only one thread holding the GIL to execute the CALL_METHOD opcode.
So there’s always only one thread manipulating the list at a time, either appending data to the list or popping data from it, by executing the CALL_METHOD opcode.
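The append/pop reasoning above can be checked with a small experiment. This is only a sketch, assuming CPython; the thread count, operation count and pre-filled size are arbitrary choices for illustration. The list is pre-filled so that pop() can never hit an empty list, no matter how the threads interleave.

```python
import threading

L = list(range(1000))          # pre-filled so pop() never sees an empty list
N_THREADS, N_OPS = 4, 250      # assumed sizes, chosen only for illustration

def appender():
    for n in range(N_OPS):
        L.append(n)            # a single call into C code

def popper():
    for _ in range(N_OPS):
        L.pop()                # likewise a single call into C code

threads = [threading.Thread(target=appender) for _ in range(N_THREADS)]
threads += [threading.Thread(target=popper) for _ in range(N_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every append and every pop took effect exactly once:
# 1000 + 4 * 250 - 4 * 250 == 1000
print(len(L))
```

This works because list.append and list.pop are built-ins implemented in C that run without releasing the GIL. A CALL_METHOD that lands in a function written in Python would itself execute many opcodes, so "one opcode" alone does not guarantee atomicity.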
And here are the opcodes for x = x + 1:
In [74]: dis.dis("x = x + 1")
  1           0 LOAD_NAME                0 (x)
              2 LOAD_CONST               0 (1)
              4 BINARY_ADD
              6 STORE_NAME               0 (x)
              8 LOAD_CONST               1 (None)
             10 RETURN_VALUE
Suppose x was initialized to 0 and thread 1 is running. It has finished LOAD_NAME, so thread 1 knows that x is 0. Before it executes BINARY_ADD, thread 2 acquires the GIL, so thread 1 is paused and thread 2 becomes the running thread. Let’s say thread 2 executes LOAD_NAME and BINARY_ADD before we switch back to thread 1. Both threads have now read x as 0, so both end up storing 1, and one increment is lost: the data is corrupted.
So x = x + 1 is not thread safe.
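One common way to get the "others have to wait" guarantee for this kind of read-modify-write is an explicit lock. This is a minimal sketch, not the only possible fix; the `Counter` class and its attribute names are hypothetical, and `self.x = self.x + 1` stands in for the x = x + 1 statement above.

```python
import threading

class Counter:
    """Hypothetical wrapper that guards x = x + 1 with a lock."""

    def __init__(self):
        self.x = 0
        self._lock = threading.Lock()

    def increment(self):
        # Load, add and store now happen as one unit: any other thread
        # calling increment() blocks here until the lock is released.
        with self._lock:
            self.x = self.x + 1

def worker(counter, times):
    for _ in range(times):
        counter.increment()

counter = Counter()
threads = [threading.Thread(target=worker, args=(counter, 10_000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.x)  # 40000
```

A thread switch can still happen in the middle of the three opcodes, but the other thread cannot enter the locked region, so no increment is lost.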
And here are the opcodes for x = L[i]:
In [75]: dis.dis("x = L[i]")
  1           0 LOAD_NAME                0 (L)
              2 LOAD_NAME                1 (i)
              4 BINARY_SUBSCR
              6 STORE_NAME               2 (x)
              8 LOAD_CONST               0 (None)
             10 RETURN_VALUE
What about thread 1 pausing before BINARY_SUBSCR but after LOAD_NAME L?
Suppose i is 2 and there are 3 elements in the list L. Thread 2 might just pop the last element from L, which would shrink L to a size of 2.
Then thread 1 resumes and executes L[2], which is out of range.
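The interleaving described above can be replayed deterministically in a single thread by performing the steps by hand in the same order. This is only a sketch of that one schedule; `loaded_list` and `loaded_index` are hypothetical names standing in for the values thread 1 has already pushed with its two LOAD_NAME instructions.

```python
L = [10, 20, 30]
i = 2

# Thread 1: both LOAD_NAME steps have run, so it holds references to L and i.
loaded_list, loaded_index = L, i

# Thread 2 is scheduled here and shrinks the list to two elements.
L.pop()

# Thread 1 resumes and runs BINARY_SUBSCR on the values it already loaded.
try:
    x = loaded_list[loaded_index]
except IndexError as exc:
    x = None
    print("IndexError:", exc)  # IndexError: list index out of range
```

Note that the list itself stays intact in this schedule: the subscript either returns an element or raises IndexError. That may be the narrower sense in which x = L[i] is called thread safe, while the check-then-index logic around it would still need its own coordination.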