Before posting this topic, I searched Google for Python threading.
I learned that threading is not good for compute-intensive
cases (such as image processing). I could not believe that, so I am posting
this topic to confirm the comments/answers I found on Google.
Now that everyone knows the GIL makes Python threading unsuitable for some cases, why doesn't Python change the GIL to improve Python threading?
The GIL means compute-intensive pure-Python code does not benefit much from
threading, because only one piece of pure Python code can run at once -
the GIL is the single-big-lock pattern, used so that while it is held
you know that nobody else is doing any Python-specific stuff. You're
free to manipulate the interpreter's internal state without worrying
about races - allocate Python variables, whatever. (Not you the
programmer; I mean the Python interpreter.)
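As a minimal sketch of that effect (timings will vary by machine and interpreter, and a free-threaded build would behave differently): two CPU-bound pure-Python loops take about as long in two threads as they do run back to back, because only one thread can hold the GIL at a time.

```python
import threading
import time

def count_down(n):
    # pure-Python CPU-bound loop; it holds the GIL while running
    while n:
        n -= 1

N = 2_000_000

# run the work twice, sequentially
t0 = time.perf_counter()
count_down(N)
count_down(N)
sequential = time.perf_counter() - t0

# run the same work twice, in two threads
t0 = time.perf_counter()
threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - t0

# on a stock CPython build the threaded time is roughly the same as
# the sequential time, not half of it
print(f"sequential: {sequential:.3f}s  threaded: {threaded:.3f}s")
```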
However, truly compute-intensive tasks are poorly served by
interpreted languages, particularly dynamic interpreted languages - the
overhead of interpretation itself adds an order-of-magnitude cost
beyond the theoretical limits of the hardware. We use Python for its
superior expressiveness, memory safety and so forth.
People do do high performance computing using Python, for example
using NumPy and likely PyTorch. These libraries do the heavy work in
high-speed lower-level code (usually C or C++), which gets good machine
performance. All of them are arranged around pieces of extension code
which look like this:
# inside the Python interpreter
do some Python things to set up
release the GIL
... do high speed C stuff here, ideally as much as possible ...
reclaim the GIL
update Python interpreter state
I’ve only written one serious piece of C extension code, here:
https://hg.sr.ht/~cameron-simpson/css/browse/lib/python/cs/vt/_scan.c?rev=tip
You can see the pattern above in the scan_scanbuf() function; the chunk
in the middle:
Py_BEGIN_ALLOW_THREADS
unsigned long offset = 0;
unsigned char *cp = buf;
for (; buflen; cp++, buflen--, offset++) {
    unsigned char b = *cp;
    /* rolling hash: shift the low 21 bits left and mix in the new byte */
    hash_value = ( ( ( hash_value & 0x001fffff ) << 7 )
                 | ( ( b & 0x7f ) ^ ( ( b & 0x80 ) >> 7 ) )
                 );
    if (hash_value % 4093 == 4091) {
        offsets[noffsets++] = offset;
    }
}
Py_END_ALLOW_THREADS
It does some Python internal setup, releases the GIL at
Py_BEGIN_ALLOW_THREADS, does a pure C scan of a memory buffer,
in this case potentially quite large, then reacquires the GIL at
Py_END_ALLOW_THREADS and updates the Python state before returning to
the outer Python programme.
While the GIL is released, the pure C scan runs at full speed in that
thread, and other Python threads are free to execute at the same time.
If things are arranged well, this can produce good multithreaded
performance for compute intensive stuff. Likewise I/O bound stuff - the
interpreter releases the GIL while waiting for significant I/O, so that
other threads run freely while this thread is blocked.
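The I/O-bound case is easy to see from pure Python. In this sketch, time.sleep stands in for a blocking system call; because the interpreter releases the GIL while blocked, four concurrent waits overlap rather than queueing up:

```python
import threading
import time

def blocking_io():
    # stand-in for a blocking system call; the interpreter
    # releases the GIL while this thread is waiting
    time.sleep(0.5)

t0 = time.perf_counter()
threads = [threading.Thread(target=blocking_io) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - t0

# the four 0.5s waits overlap, so the total is about 0.5s, not 2s
print(f"elapsed: {elapsed:.2f}s")
```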
The GIL makes Python threading like a well-known Chinese saying:

Python threading is like chicken ribs: tasteless to eat, but a pity to throw away.
No, it just means you need to do the right things. Even for high
performance stuff, Python's great for orchestration, and threads are one
form of orchestration.
Cheers,
Cameron Simpson cs@cskk.id.au