Worker job slots in concurrent.futures.Executor

Working with concurrent.futures is generally a joy - I find it especially useful when writing shell-script replacements that need to process lots of data in parallel without pulling in a heavyweight library. However, I often hit a problem where inputs vary wildly in size, and processing the bigger items is memory/storage-intensive enough that letting it go brrr on all CPU threads at once is ill-advised.

What I usually do as a result is determine whether an item is big, put the small items into the concurrent executor, and process the big items serially afterwards. It would be nicer if I could feed them all into the executor, allocating a number of “worker slots” per job so that the bigger items do not starve other jobs of their resources.
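The workaround described above can be sketched roughly like this (a minimal illustration; `BIG_THRESHOLD`, `process`, and `run` are hypothetical names, and the items are plain integers standing in for real work units):

```python
from concurrent.futures import ThreadPoolExecutor

BIG_THRESHOLD = 10  # hypothetical cutoff; real code would compare sizes in bytes


def process(item):
    # Stand-in for the real, possibly memory-intensive work.
    return item * 2


def run(items):
    small = [i for i in items if i < BIG_THRESHOLD]
    big = [i for i in items if i >= BIG_THRESHOLD]
    # Small items go through the executor in parallel...
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(process, small))
    # ...while big items are processed serially after the pool drains.
    results.extend(process(i) for i in big)
    return results
```

The annoyance is the manual split: big items cannot overlap with the tail of the small ones, even when enough workers happen to be free.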

The API would look roughly like this:

  • `concurrent.futures.Executor` should expose a worker_count property that lets the user code scheduling jobs know how many workers are currently available and act accordingly.
    • One can always fetch this with private, backend-specific APIs or by saving max_workers somewhere else, but having it readily available on the executor makes things clear and easy, especially if you leave max_workers at its default value, which varies between Executor implementations.
    • For all executors, max_workers specifies the maximum number of workers, not a guaranteed count. Executor.worker_count would have to report the actual number.
  • Executor.submit should have an extra worker_slots kwarg, defaulting to 1.
  • If worker_slots is greater than 1, submission should create worker_slots - 1 “wait jobs” that wait for a signal from the primary job. The primary job is submitted last (to make sure enough resources are reserved for it to complete successfully), and has a done callback that signals the wait jobs to stop hogging workers.
    • Example: You have jobs A, B, C using 1 slot each, and job D using 3 slots. What’s scheduled under the hood is A, B, C, DW1, DW2, D where DW* are waiting for D completion.
    • What should happen when worker_slots > Executor.worker_count? An exception (ValueError), or quietly clamping the slot count to the worker count?
  • This requires that jobs start on workers in the same order they were submitted by the user. If you submit jobs A, B, C, they must never start executing on workers in any other order. If this is not the case, the wait jobs described above can hog workers indefinitely (with 3 workers and two 3-slot jobs A and B, the generated jobs are AW1, AW2, A, BW1, BW2, B; if the 3 workers pick up AW1, AW2, BW1, nothing ever completes). The “done callback” order is currently guaranteed by PEP 3148 and the docs, but actual job scheduling order is not.
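The wait-job mechanism can be prototyped in user code today. This is a minimal sketch assuming a ThreadPoolExecutor; `submit_with_slots` is a hypothetical helper name, and it relies on CPython's current FIFO work queue for the submission-order requirement discussed above, which the docs do not actually guarantee:

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def submit_with_slots(pool, fn, *args, worker_slots=1, **kwargs):
    """Sketch of the proposed Executor.submit(..., worker_slots=N) semantics.

    Submits worker_slots - 1 "wait jobs" that each block on an event,
    then submits the primary job last; a done callback on the primary
    job releases the wait jobs so they stop hogging workers.
    """
    done = threading.Event()
    for _ in range(worker_slots - 1):
        pool.submit(done.wait)  # wait job: occupies one worker until signalled
    future = pool.submit(fn, *args, **kwargs)
    future.add_done_callback(lambda _: done.set())
    return future
```

With 3 workers, `submit_with_slots(pool, big_job, worker_slots=3)` parks two workers on the event and runs the primary job on the third, so the big job effectively has the whole pool to itself while it runs.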

I tried to find if anyone has proposed something similar, but could not find anything. Apologies if this was already discussed before.