Share dict of unpicklable objects between processes

DamirHanov · May 2, 2023, 9:39pm

Hey everyone!
I have a project where I open a connection to servers in different processes.
Then I process data from them. To do it I have a dict with some complex objects.

I have a copy of the whole collection in each process. When I had a small number of objects I didn’t face any problem with it. But now it starts to bother me in terms of memory consumption.

So I tried to pass them something like this:

import multiprocessing as mp


class ServerInteractor:
    """
    A class to handle the connection and process data from a server
    """
    async def main_loop(self):
        pass


servers = get_servers_from_settings()  # list with info about servers
my_collection = init_objects()

# helper function to init ServerInteractor, catch unhandled errors 
# and restart ServerInteractor if needed

with mp.Pool(len(servers)) as pool:
    # one helper(server, my_collection) call per server
    pool.starmap(helper, [(server, my_collection) for server in servers])

But I get an error about an unpicklable field of type struct.Struct (a compiled struct). I tried to use mp.Manager() and its dict() method to share it, but got the same problem (maybe I used it wrong).

So, my question is: how can I share this dict between all of these processes?

Also, the objects are “read only” in these processes: they don’t add/remove anything from the dict, don’t modify the objects in it, and a method of one object is never called from two different processes at the same time, so I don’t need any sync mechanisms.

If you need a better understanding of the objects, they look similar to this:

import struct

class A:
    _struct: struct.Struct

class B:
    pass

class C:
    _as: list[A]

class D:
    _bs: list[B]
    _cs: dict[C]

# I need to pass dict[D]

Python version: 3.9
OS: Win 10

Rosuav · May 2, 2023, 9:52pm

It’s really REALLY hard to share dictionaries between processes. But let’s take a step back. Why is it that multiple processes are important, and can that be achieved in a different way?

Do the processes need to be independent, able to be individually killed, memory-limited, etc? Or is this simply a matter of “need multiple CPU cores and processes are the only way to do that”?

Because if the latter, you might be in luck. There are some recent changes that allow multiple subinterpreters to run within a process, and then be spread across CPU cores.

The first thing to do would be to try converting your code to use threads instead of processes, and see how well it behaves. In general, I would recommend using threads rather than processes most of the time, since they’re lighter weight and scale far better (asyncio tasks scale even better, but can be harder to debug - though that, too, may be changing); once you run into problems, it’s worth considering multiple potential solutions.
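As a rough illustration of the thread-based variant, here is a minimal sketch using ThreadPoolExecutor from the standard library. The `collection` dict, the `handle` function and `run_in_threads` are hypothetical stand-ins, not the OP's actual code; the point is only that threads see the same dict object, so nothing has to be pickled or copied.

```python
import struct
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the real collection: objects holding compiled structs.
collection = {
    "point": struct.Struct("<hh"),
    "flag": struct.Struct("<B"),
}


def handle(task):
    """Decode one payload using the shared, read-only collection."""
    key, payload = task
    return collection[key].unpack(payload)


def run_in_threads(tasks, workers=4):
    # Threads live in one process, so `collection` is shared by reference:
    # nothing is copied and nothing is pickled.
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(handle, tasks))
```

Whether this is fast enough depends on how much of the work is CPU-bound Python code, since the threads still share one GIL.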

DamirHanov · May 2, 2023, 10:17pm

Thank you for your answer. The main reason I chose processes is exactly

“need multiple CPU cores and processes are the only way to do that”

In this project performance really matters, so scaling horizontally with processes seemed to me a logical solution.

Tomorrow I’ll do some benchmarks with ThreadPoolExecutor as it seems to be the fastest replacement in this case. Seems that it can solve my problem.

P.S. I’m curious about subinterpreters. Can you provide links to some helpful resources? Quick googling gave only the draft of PEP 554 as a reliable source.

a-reich · May 2, 2023, 11:08pm

This isn’t super relevant as practical advice to the OP, but I was struck by this in your response - I assume you’re referring to PEP 684, “Per-Interpreter GIL”, which is expected to be implemented in Python 3.12. Even once 3.12 is out and there’s a decent concurrency interface to manage sub-interpreters with Python code, how would that solve the user’s problem since that PEP doesn’t cover anything about sharing or sending objects between interpreters? Indeed it says all objects with mutable state should not be shared. Even the draft text of PEP 554 doesn’t propose a way to send arbitrary user objects. So if these are dicts which can’t be pickled, how would they be sent?

Rosuav · May 2, 2023, 11:57pm

Okay, so the important point to note is that, starting with 3.12, this definitely won’t be the only way any more! Even today, there are some options, although not as good.

PEP 684 is also relevant here (and also landing in 3.12). Where PEP 554 makes it easier for pure-Python code to use multiple interpreters, PEP 684 makes it easier for multiple interpreters to use multiple CPU cores. However, this is largely future, unless (like some insane people ahem) you build CPython from source.

There are two things you can do right now, though. One is to start with threads, and then have those threads look at ways to release the GIL. That probably won’t be effective, but depending on the design of your code, it might be.

But the other is an option because you said that the objects are read-only. That implies that you should be able to fork after creating those dictionaries. That’s still using processes (with the overhead that that implies), but without the hassle of passing objects back and forth. You would have to take some care of other resource management, but it should work.
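A minimal sketch of the fork approach, assuming a POSIX system: the "fork" start method does not exist on Windows, which is the OS named in the question, so this would only apply after moving to Linux or similar. The names `COLLECTION`, `unpack_with` and `run_with_fork` are made up for illustration. The collection is built in the parent before the pool is created, and the workers read it as an inherited global, so only the small task arguments and results are pickled.

```python
import multiprocessing as mp
import struct

# Hypothetical read-only collection, filled in by the parent before forking.
COLLECTION = {
    "point": struct.Struct("<hh"),
    "flag": struct.Struct("<B"),
}


def unpack_with(task):
    """Runs in a worker; reads the global inherited from the parent."""
    key, payload = task
    return COLLECTION[key].unpack(payload)


def run_with_fork(tasks, workers=2):
    # get_context("fork") raises ValueError where fork is unavailable (e.g. Windows).
    ctx = mp.get_context("fork")
    with ctx.Pool(workers) as pool:
        # Only the (key, payload) tuples and the results cross the process boundary.
        return pool.map(unpack_with, tasks)
```

Forked children share the parent's memory pages copy-on-write, though CPython's reference counting writes to object headers, so some pages will still be copied over time.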

As of Python 3.12, your options will be WAY better though!

dstromberg · May 3, 2023, 2:45am

If you don’t have many cores or don’t need a lot more performance, you could just look into PyPy3, Shed Skin and/or Cython. These tend to be low-hanging fruit.

If you want to pickle a struct, you might be able to convert it to a bytes object and pickle that instead.
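One way this could look, as a sketch only: the class below is a hypothetical version of the `A` class from the question, and it stores the struct's format string (rather than bytes) when pickled, then recompiles the struct when unpickled. `__getstate__` and `__setstate__` are the standard pickle hooks; the constructor signature and `round_trip` helper are assumptions for the example.

```python
import pickle
import struct


class A:
    """Hypothetical version of the A class from the question."""

    def __init__(self, fmt):
        self._struct = struct.Struct(fmt)

    def __getstate__(self):
        # Swap the compiled struct for its format string, which pickles fine.
        state = self.__dict__.copy()
        state["_struct"] = self._struct.format
        return state

    def __setstate__(self, state):
        # Recompile the struct on the receiving side.
        self.__dict__.update(state)
        self._struct = struct.Struct(state["_struct"])


def round_trip(obj):
    """Pickle and unpickle an object, as multiprocessing would do when sending it."""
    return pickle.loads(pickle.dumps(obj))
```

Note that this only removes the pickling error. Each process still ends up with its own copy of the objects, so it does not address the memory concern by itself.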

Shared, mutable state is generally a bad thing for concurrency. But if you really do need it, you might be able to re-architect things to use shared memory arrays.

Sometimes it works well to pass a dict from process to process using a multiprocessing.Queue. That will give distinct copies of a dict. This almost certainly still requires pickle.

CPython (and PyPy I think) threading is decent if you’re only doing n I/O-bound tasks. However, if you have n I/O-bound tasks and add even just 1 CPU-bound task, all your threads develop performance problems.

I probably should add that micropython isn’t blazingly fast for uniprocessor workloads, but so far I’ve only seen it thread well. So if you just want scalable multithreading, you might give micropython a try.

HTH