Weakref'ed objects are discarded too enthusiastically - how can I change this?

Pootle · May 31, 2022, 11:27am

I’m doing analytics on multiple images in parallel (around 100, then another 100…).

If I use standard object references I fill my RAM and then start swapping, which is OK for a while… I do want to access some of the first 100 again. I set the images (numpy arrays) as weakrefs, and now Python enthusiastically chucks them away when less than 30% of my RAM is used (I have 16 GB of RAM).

Can I tell it to hang onto weakrefs until much more of my ram is used?

daniele · May 31, 2022, 7:25pm

Weak references are references to objects that do not count toward the reference count Python uses to decide when to free an object. Thus your images are garbage collected as soon as nothing references them other than the weak references. Memory pressure plays no part in the decision whether to garbage collect an object. If you want to keep the objects from being garbage collected, you need to hold onto them via a regular (non-weak) reference.
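A minimal sketch of that behaviour, in case it helps. The `Image` class is a hypothetical stand-in for a large numpy array (numpy arrays can be weakly referenced in the same way). In CPython the object is freed as soon as its last strong reference goes away; the `gc.collect()` call is only there for implementations without reference counting.

```python
import gc
import weakref


class Image:
    """Hypothetical stand-in for a large numpy array."""

    def __init__(self, name):
        self.name = name


img = Image("frame-001")       # strong reference keeps the object alive
ref = weakref.ref(img)         # weak reference does not

alive_before = ref() is img    # the weakref resolves while img exists

del img                        # drop the only strong reference
gc.collect()                   # not needed in CPython, harmless elsewhere

alive_after = ref() is not None  # the weakref is now dead; ref() gives None
```

How much RAM is free never comes into it: the object goes the moment the last strong reference does.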

stoneleaf · May 31, 2022, 7:53pm

Iain, welcome!

The Python Ideas area is for exploring possible future changes to Python itself. General questions should go in the Users category (I’ve moved it over).

Happy Pythoning!

Pootle · June 1, 2022, 6:58am

Ah! I misunderstood the point of weakref. Looks like I need an LRU cache, but one based on object size…
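Roughly what I have in mind, as a sketch only: an LRU cache that evicts by total size instead of entry count. The class name `SizeBoundedLRU` and its `max_bytes` and `sizeof` parameters are made up for illustration. For numpy arrays, `sizeof` could be `lambda a: a.nbytes`; the example uses `len` on byte strings so it runs without numpy.

```python
from collections import OrderedDict


class SizeBoundedLRU:
    """Hypothetical LRU cache bounded by total size, not entry count."""

    def __init__(self, max_bytes, sizeof=len):
        self.max_bytes = max_bytes
        self.sizeof = sizeof          # for numpy arrays: lambda a: a.nbytes
        self._items = OrderedDict()   # key -> (value, size), oldest first
        self.used = 0                 # total size of everything held

    def __contains__(self, key):
        return key in self._items

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key][0]

    def put(self, key, value):
        if key in self._items:        # replacing: release the old size first
            self.used -= self._items.pop(key)[1]
        size = self.sizeof(value)
        self._items[key] = (value, size)
        self.used += size
        # Evict least recently used entries until the budget is met,
        # but always keep the entry that was just inserted.
        while self.used > self.max_bytes and len(self._items) > 1:
            _, (_, old_size) = self._items.popitem(last=False)
            self.used -= old_size


cache = SizeBoundedLRU(max_bytes=10)
cache.put("a", b"1234")
cache.put("b", b"1234")
cache.get("a")               # "a" is now more recently used than "b"
cache.put("c", b"1234")      # 12 bytes > 10, so "b" is evicted
```

The cache holds strong references, so nothing disappears until the size budget is exceeded, and then the least recently used image goes first.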

jeanas · June 1, 2022, 8:17am

What would it then do if you happen to still need an image while it has already been collected?

The normal approach would be to keep a reference to an image just as long as you need it. Then you’ll always use just as much of the RAM as you need.

Pootle · June 1, 2022, 4:04pm

Well, I’m running interactively in JupyterLab, and I don’t always know where I’m going next.

I have found a good tool though - based on functools.lru_cache. https://gist.github.com/wmayner/0245b7d9c329e498d42b

I’ve found it holds the last 150 images in memory without needing to use swap space.