Feature or enhancement
Proposal:
Consider the case where you get some exception during unpickling. This could be anything, maybe in your custom object’s __setstate__ or wherever else. For example, we got this crash:
...
  File "/u/dorian.koch/setups/2024-10-11--denoising-lm/recipe/returnn/returnn/util/multi_proc_non_daemonic_spawn.py", line 156, in NonDaemonicSpawnProcess._reconstruct_with_pre_init_func
    line: reconstruct_func, reconstruct_args, reconstruct_state = pickle.load(buffer)
    locals:
      reconstruct_func = <not found>
      reconstruct_args = <not found>
      reconstruct_state = <not found>
      pickle = <global> <module 'pickle' from '/work/tools/users/zeyer/linuxbrew/opt/python@3.11/lib/python3.11/pickle.py'>
      pickle.load = <global> <built-in function load>
      buffer = <local> <_io.BytesIO object at 0x74bbaa61e610>
  File "/work/tools/users/zeyer/linuxbrew/opt/python@3.11/lib/python3.11/multiprocessing/synchronize.py", line 110, in SemLock.__setstate__
    line: self._semlock = _multiprocessing.SemLock._rebuild(*state)
    locals:
      self = <local> <Lock(owner=unknown)>
      self._semlock = <local> !AttributeError: 'Lock' object has no attribute '_semlock'
      _multiprocessing = <global> <module '_multiprocessing' from '/work/tools/users/zeyer/linuxbrew/opt/python@3.11/lib/python3.11/lib-dynload/_multiprocessing.cpython-311-x86_64-linux-gnu.so'>
      _multiprocessing.SemLock = <global> <class '_multiprocessing.SemLock'>
      _multiprocessing.SemLock._rebuild = <global> <built-in method _rebuild of type object at 0x74bbb60322c0>
      state = <local> (132092164476928, 1, 1, '/mp-2wkdacg_')
FileNotFoundError: [Errno 2] No such file or directory
(The exception traceback was printed using better_exchook, which gives an extended traceback that adds the involved local variables and some more. I thought the additional info might help to better understand where this exception comes from.)
So, SemLock.__setstate__ fails here for some reason. Maybe some race condition. But when I saw this crash, my first thought was: where do we actually have a SemLock inside the pickled object?
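To make the problem concrete, here is a minimal sketch that reproduces the situation. The class `Fragile` and the nested `root` structure are made up for illustration; `Fragile` merely stands in for something like SemLock whose __setstate__ can fail at load time.

```python
import pickle


class Fragile:
    """Hypothetical stand-in for an object whose __setstate__ can fail."""

    def __getstate__(self):
        return {"name": "/mp-example"}

    def __setstate__(self, state):
        # Simulate a resource that no longer exists at load time.
        raise FileNotFoundError(2, "No such file or directory")


# The failing object sits a few levels below the root object.
root = {"config": {"workers": [1, 2, Fragile()]}}
data = pickle.dumps(root)

try:
    pickle.loads(data)
except FileNotFoundError as exc:
    error = exc
    # The message names the error, but not where in `root` the object lived.
    print(repr(exc))
```

Nothing in the exception says that the object was reachable via "config" and then "workers", which is exactly the missing piece of information.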
So, this is what I would like: in case of an exception during unpickling, pickle could show me the object path during the construction that led to this object. (In case there are multiple references to the object, just show me the first.)
I’m not sure exactly how this should be done. It means some overhead: for every single object that pickle creates, we would need to store the creating parent object plus the name/index/whatever under which it is stored. So maybe this feature should be optional. It would be fine for me to run unpickling first without it, and if I get some exception, to run unpickling again with this debug flag enabled. Maybe it’s also fine if this is only available in the pure Python implementation.
Maybe I can already do something like this by checking the local self in the stack frame where the exception occurred and then using gc.get_referrers to get back to the root?
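A minimal sketch of that idea, under some assumptions: `Fragile` and the helper `failing_self` are hypothetical names, and the failing __setstate__ is assumed to be a Python function, so that it has a frame with a local self. Note that pickle builds child objects before it inserts them into their parent container, so at the time of the failure the referrers are not necessarily the parent; the sketch only shows how to get at them.

```python
import gc
import pickle


class Fragile:
    """Hypothetical stand-in for an object whose __setstate__ can fail."""

    def __getstate__(self):
        return {"name": "/mp-example"}

    def __setstate__(self, state):
        raise FileNotFoundError(2, "No such file or directory")


def failing_self(exc):
    """Return the local `self` of the innermost traceback frame, if any."""
    tb = exc.__traceback__
    while tb.tb_next is not None:
        tb = tb.tb_next
    return tb.tb_frame.f_locals.get("self")


data = pickle.dumps({"config": {"workers": [1, 2, Fragile()]}})

try:
    pickle.loads(data)
except FileNotFoundError as exc:
    obj = failing_self(exc)
    # Whatever currently holds a reference to the half-built object.
    referrers = gc.get_referrers(obj)
    print(type(obj).__name__, [type(r).__name__ for r in referrers])
```

Which referrers show up depends on the unpickler implementation, so this is more of a starting point for experiments than a reliable way back to the root.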
I also don’t care how this information on the object path would be represented or how I would access it, as long as I have any way of getting it. I leave this open for suggestions. E.g.:
- My first idea was that it could simply extend the exception.msg, but maybe that’s too hacky. Something like:
    try:
        _load(...)
    except Exception as exc:
        exc.msg += f"\n\nGot exception during unpickling of object {self._recent_obj_info()}"
        raise
- Or it could wrap any exception thrown during unpickling, and do something like this:
    try:
        _load(...)
    except Exception as exc:
        raise UnpicklingException(f"Got exception during unpickling of object {self._recent_obj_info()}") from exc
  (But this would change the current behavior, so this would definitely need some flag to enable it.)
- Or it could expose such Unpickler.recent_obj_info, and then I can get that info in my own exception handling code and do what I want with it.
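As a rough sketch of how the last two options might look today, the pure Python unpickler can be subclassed. This relies on private CPython details (`pickle._Unpickler` and its `stack` and `metastack` attributes), which may change between versions. The names `DebugUnpickler`, `UnpicklingContextError`, `wrap_errors` and `_describe` are hypothetical, and `recent_obj_info` is just the name used above, not an existing API. It reports the raw pending stack, which also contains sibling values that were already loaded, so it is only an approximation of a real object path.

```python
import io
import pickle


class UnpicklingContextError(Exception):
    """Hypothetical wrapper exception; not part of the pickle module."""


def _describe(item):
    # Show simple keys/values verbatim and everything else by type name only,
    # because repr() of a half-built object may itself fail.
    if isinstance(item, (str, int, bytes)):
        return repr(item)
    return f"<{type(item).__name__}>"


class DebugUnpickler(pickle._Unpickler):
    """Pure Python unpickler that remembers what was pending when load() failed."""

    recent_obj_info = None  # hypothetical attribute, named as in the proposal

    def __init__(self, file, *, wrap_errors=False, **kwargs):
        super().__init__(file, **kwargs)
        self.wrap_errors = wrap_errors

    def load(self):
        try:
            return super().load()
        except Exception as exc:
            # Private details: `metastack` holds the stacks saved at each MARK
            # opcode and `stack` is the current one. Together they list the
            # partially built parents, keys and the failing object, outermost first.
            pending = [x for frame in getattr(self, "metastack", []) for x in frame]
            pending += getattr(self, "stack", [])
            self.recent_obj_info = " -> ".join(_describe(x) for x in pending)
            if self.wrap_errors:
                raise UnpicklingContextError(
                    f"Got exception during unpickling of object {self.recent_obj_info}"
                ) from exc
            raise


class Fragile:
    """Hypothetical stand-in for an object whose __setstate__ can fail."""

    def __getstate__(self):
        return {"name": "/mp-example"}

    def __setstate__(self, state):
        raise FileNotFoundError(2, "No such file or directory")


data = pickle.dumps({"config": {"workers": [1, 2, Fragile()]}})
unpickler = DebugUnpickler(io.BytesIO(data))
try:
    unpickler.load()
except FileNotFoundError:
    info = unpickler.recent_obj_info
    print(info)
```

Since the pure Python unpickler is slower than the C one, a caller could try pickle.loads first and only repeat the load with such a class after a failure, which matches the "run again with a debug flag" workflow.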
(Note, I already posted this to GitHub #130621, as I thought this is a minor feature, but it was suggested to post this here as well.)