In most other languages with threading APIs, there is a yield() function that you can call on the current thread. However, Python’s threading library does not offer this method.
There is a lot of confusion online about how to yield with Python’s threading library, as shown in the sources below. Some people think time.sleep(0) will do the trick; others think time.sleep(0.0001) is better and less buggy. This confusion is unnecessary. In addition, the accepted solution (time.sleep(0.0001)) isn’t always correct: what if there are zero threads? It also does not take into account the current timeout set for each thread.
It would be much better if the threading library just offered the yield() function. I already have implemented this and am prepared to make a PR to cpython, but I just want to get confirmation that others believe this is a good idea too.
- multithreading - How does a threading.Thread yield the rest of its quantum in Python? - Stack Overflow
- In multithreading in python, how to yield for result and return thread value? - Stack Overflow
What precisely should it do? Would it be any different from sleeping for 0 seconds, and if so, how?
There will never be zero threads, but I presume you mean zero other threads. In what way is it wrong to sleep if there are zero threads? And what do you mean by “current timeout”?
Are you asking for python to document a reliable way to get the thread to be rescheduled?
I guess the use of yield is intended to allow a set of compute bound threads to have a “fair” share of the CPU resources?
Do you mean rescheduled inside the python threading code or at the OS level?
I was thinking that the yield() function would:
- Check if there are any other threads; if there are none, just return instead of sleeping (and maybe raise a warning that no other threads exist). I thought that this might help make some Python applications more efficient by preventing them from sleeping when unnecessary.
If there are other threads:
- Get the currently set “switch interval”, i.e. the amount of time allotted to each thread when it runs (see here)
- Sleep for half the switch interval (in order to give some other thread time to get the GIL)
The implementation may not be too different from the currently accepted way, but I feel that providing this functionality could make it easier for developers who are used to having a yield() function in other thread APIs. Let me know what you think.
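The steps described in this post can be sketched roughly as follows. This is only an illustration of the proposal, not the poster's actual implementation: the name yield_thread is hypothetical, the optional warning is left out, and the only standard-library calls used are threading.active_count(), sys.getswitchinterval() and time.sleep().

```python
import sys
import threading
import time


def yield_thread():
    """Hypothetical helper following the proposal; not part of threading.

    Returns True if it slept, False if there was no other thread to yield to.
    """
    # Step 1: with no other live threads, sleeping would only waste time.
    if threading.active_count() <= 1:
        return False
    # Step 2: read the interpreter's current switch interval (seconds).
    interval = sys.getswitchinterval()
    # Step 3: sleep for half of it, releasing the GIL so that another
    # thread gets a chance to acquire it.
    time.sleep(interval / 2)
    return True
```

Whether half the switch interval is the right duration is an open question in this thread; the sketch only mirrors the numbers given in the proposal.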
That is how python and most if not all thread APIs work already.
What I personally think is: I’ve never used any threading yield APIs in any language or system (they’re more usually what I’d see in a purely cooperative system like Python’s asyncio or a UI event loop). It’s always just been sleep(0) to achieve that goal. So the question is: Why is time.sleep(0) not achieving the goal here?
It sounds to me like there’s a weird limitation - I hesitate to call it a bug since I don’t know enough about what’s going on, but it might well be - in the way that sleeping for zero seconds works. Another possibility to consider would be to have threads of different priorities (not something Python currently supports, and quite possibly available only on some platforms), which would allow a lower-priority CPU-bound thread to be interrupted by a higher-priority thread more easily, particularly if given a hint with
Many developers are used to having a yield() function in other thread apis such as the ones provided by Java and C++.
Python isn’t those languages. There are a multitude of ways that Python is not like other languages. It’s not typically a goal for one distinct language to match the syntax or API of any other language. You’ll be better off finding a different motivation for this feature.
It’s odd that sleep(0) seems obvious, but is not well defined in the docs, particularly on GNU/Linux systems… (time — Time access and conversions — Python 3.11.3 documentation)
Adding yield() would be excellent so we wouldn’t need to refer to the docs to check for defined behavior for the current paradigm of yielding.
You assume that people know the idea of yield.
I do not think you can assume that people will not need to read docs to know this.
The ideas of these words are embedded in the English language.
Sleep does not imply that the scheduler will get a chance to run. However, yield implies that this thread is temporarily stopping, which forces the programmer to think about what must come next.
Unfortunately, that simply isn’t true. Very, very few technical terms are completely obvious purely on the basis of their non-technical use. And quite a few have an “obvious” meaning that is deceptively close to, but distinct from, the actual technical meaning.
Clearly that’s YOUR intuition about the two words, and that’s great! But it isn’t everyone’s. My intuition, based on working with a wide variety of threading platforms and “thread-lite” systems (greenlets, cooperative task switching, etc), is that “sleep(0)” is a threading concept and “yield” is a non-threading concept (eg in purely cooperative systems). And in Python, “yield” has a very specific meaning - which doesn’t have anything to do with threading, but with generators; so people’s intuitions about yield will be coloured by that.
Ultimately, people WILL need to check the docs.
I’ve seen the term “yield” used variously in the context of a thread yielding the processor to another thread. For example, Python supports the POSIX function os.sched_yield(). Its function is to “force the running thread to relinquish the processor until it again becomes the head of its thread list”. It’s intended for use with deterministic, realtime scheduling policies such as SCHED_RR (round robin).
SwitchToThread() yields the processor for up to one time slice to another thread that’s ready to run. The Windows API implements it by calling NTAPI NtYieldExecution(), which is implemented in the kernel by KeYieldExecution(). Also, WinAPI Sleep[Ex]() is implemented by calling NTAPI NtDelayExecution(), which is implemented in the kernel by KeDelayExecutionThread(). Given that the delay is 0; it’s not an alertable wait; and the thread has no pending asynchronous procedure calls (APCs), then KeYieldExecution(). Thus Python’s time.sleep(0) is implemented by NTAPI
So, a multi-platform wrapper for these two seems to be what the OP is asking for.
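A minimal sketch of what such a wrapper might look like, assuming os.sched_yield() where the platform provides it and SwitchToThread() from kernel32 (reached through ctypes) on Windows. The names _pick_yield and yield_processor are made up for illustration, and the time.sleep(0) fallback for other platforms is an assumption rather than anything proposed in this thread.

```python
import os
import sys
import time


def _pick_yield():
    """Return a zero-argument callable that asks the OS to run another thread."""
    if hasattr(os, "sched_yield"):
        # POSIX: os.sched_yield() wraps the sched_yield() system call.
        return os.sched_yield
    if sys.platform == "win32":
        import ctypes

        # SwitchToThread() lives in kernel32.dll, takes no arguments and
        # returns a BOOL saying whether another thread was switched to.
        switch_to_thread = ctypes.WinDLL("kernel32").SwitchToThread
        switch_to_thread.argtypes = []
        switch_to_thread.restype = ctypes.c_int

        def _windows_yield():
            switch_to_thread()

        return _windows_yield
    # Fallback (assumption): a zero-length sleep as the closest portable hint.
    return lambda: time.sleep(0)


# Hypothetical public name; resolved once at import time.
yield_processor = _pick_yield()
```

Both underlying calls are only hints to the scheduler, so a wrapper like this cannot guarantee that a different thread actually runs next.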
Now, not wanting to bikeshed, but bikeshedding: let’s not call it yield please. (sched_yield, whatever are good). There is enough overloading of meanings on the term type already: this just is not good.
I have always used time.sleep(0) to give up a thread.
I am running a graphical application that does rendering, and time.sleep(0) does not yield in my case (Python 3.11 on OSX). I need to sleep for more than 5ms (in the render loop) for the application to yield.
When I don’t do rendering, time.sleep(0) works fine. I know that the rendering code can also call sleep from native code, so maybe that somehow causes the Python sleep to not yield.
So: an explicit yield() seems like a good idea to me…
I was checking my understanding of what a “yield” might be implemented as, and found that Python already has os.sched_yield() implemented.
Try using that function.
That has the same effect. Works outside the graphical context, does not work after the native render loop calls…
I should say that the test case is using firebase to subscribe to a collection, which happens on another thread.
macOS scheduler must think that the yielding thread is still the best one to run I’d assume.
Why do you need to yield?