Seeking Thoughts on Strange Exception Problem with class and dynamically loaded class module

Hi folks, first time on the forums, but I’ve hit my first issue in four years that I just cannot see past, probably because I’m doing something less common and using some esoteric services I don’t fully understand.

Quick background (mainly so you can measure our communication); it’s an easy skip if you’d rather, and I’ll keep it to one paragraph. I’ve been in the software industry since college in the early ’90s. Honestly, as an old C/C++/Pascal programmer, I never really saw Python as a serious language. I started working contract after college; Houston has lots of opportunity for small tech solution providers serving big firms and industries: oil and gas, aerospace, etc. I got into oil and gas. But four years ago two of my kids went off to military academies and started using Python. I took that time to learn the language so I could follow along, discuss, and offer thoughts to my girls if they needed it. And I must admit I did fall in love, mostly because it’s fast and easy to prototype and/or get a script working just to get something done. For example, I stopped even trying to learn the clusterfu%(*k that is Microsoft’s PowerShell; Python fits the bill as a desktop scripting language that empowers users to write for their own immediate needs. I’ve got tons of custom scripts now to do things I want done on my desk, rather than .bat or .ps1 scripts!

Lastly, I don’t know everything, and I don’t know everything about python for sure, but you need to know that I wasn’t born yesterday either, so if I don’t specify everything exactly, unless it’s a critical part of the problem, you should probably not start by assuming I missed it. I’m not going to post whole chunks of code, in fact, if I post any it will probably be analogous or pseudo-code. I don’t usually need actual code help, where I tend to need help is either way more technical or principled. Also, this is beyond just a short code snippet or two, and I don’t really expect anyone to spend that amount of time knee-deep in my code, but my spidey sense tells me I’m bending something fundamental that I don’t fully understand about python, so mostly this is probably going to be a discussion on principles and such. And many TIA if you decide to follow along.

Now don’t kill me, it’s been a long time since I’ve actually used UML, and since this issue may involve both the Python import machinery and class interaction, I did my best to hack together something UML-like to help visualize my entities. You can see that diagram here, lj_diagram

Description of the project:

I’ve got a main console script, that handles the command-line, switches, etc., and creates an instance of a Core class, let’s call it Juicer. Juicer needs to interface with some data stores of different types and does so thru a private object we’ll call a DataMngr. But instead of the Juicer knowing beforehand what DataMngr’s are available and which one to create to interface with, I’m using a class factory pattern, call it the DataMngr_broker, that figures out what data manager to create and give to the Juicer. I wanted DataMngrs for specific data stores to be able to be created later so I’m using importlib to dynamically load any modules found in a sub directory, call a registration function, and register the type of store and the name of the class that should be instantiated for that type. So DataMngr’s are inherited from an abstract base class and since the Juicer cannot know what DataMngrs are available, it only imports the ABC module. My spidey sense tells me it has something to do with the technical side of dynamically importing these modules that is causing my problem, but I’ll get to the problem in a moment.
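To make the moving parts concrete, here is a minimal sketch of the arrangement described above. Every name in it (DataMngrError, DataMngrBroker, CsvDataMngr, validate, read) is a hypothetical stand-in rather than the actual code, and the plugin class is defined inline instead of being loaded from a sub-directory.

```python
from abc import ABC, abstractmethod


class DataMngrError(Exception):
    """Base error shared by all data managers (hypothetical name)."""


class DataMngr(ABC):
    """Abstract interface; the only thing the Juicer needs to import."""

    def __init__(self, path):
        self.path = path
        self.validate()  # may raise DataMngrError

    @abstractmethod
    def validate(self):
        ...

    @abstractmethod
    def read(self):
        ...


class DataMngrBroker:
    """Class factory: maps a store type to a DataMngr subclass."""

    def __init__(self):
        self._registry = {}

    def register(self, store_type, cls):
        if not issubclass(cls, DataMngr):
            raise TypeError(f"{cls!r} is not a DataMngr")
        self._registry[store_type] = cls

    def create(self, store_type, path):
        # No try/except here, so errors raised by the child reach the caller.
        return self._registry[store_type](path)


class CsvDataMngr(DataMngr):  # stand-in for a class defined in a plugin module
    def validate(self):
        if not self.path.endswith(".csv"):
            raise DataMngrError(f"bad signature: {self.path}")

    def read(self):
        return f"rows from {self.path}"


broker = DataMngrBroker()
broker.register("csv", CsvDataMngr)
good = broker.create("csv", "data.csv")

try:
    broker.create("csv", "data.bin")
    caught = None
except DataMngrError as err:
    caught = str(err)
```

In this stripped-down form an error raised while the child initializes does reach the caller's try block, which gives a baseline to compare the real code against.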

It’s important to know that this works! In fact, it works great. I can place new modules, that define new DataMngr child classes from the ABC, drop them into the sub directory, and the Juicer can assess and interact with that data store perfectly. I researched how to use importlib, I’m on python 3.11, and with python changing so fast, and so much legacy and older examples and snippets scattered across the interwebs, I tried hard to use importlib with the latest recommendations.

The Problem:

So, once this framework was up and working (and it might be significant that before creating the dynamic loading I was working with one specific data store type to get everything working, so most of my exception framework was already in place and working fine), I was doing some cleanup and refactoring on my exception classes: consistent error messages, at what level each sort of exception is handled, etc. I was also testing the exception handling of the DataMngrs when I noticed a problem. If the Juicer tried to use a store that the DataMngr could not initialize correctly, say because of a corrupt file or a bad signature, the exception wasn’t propagating to, and being caught by, the try block in the Juicer where the DataMngr_broker is created and then called on to create the correct DataMngr child class.

This is unequivocal. I can see in my logs precisely where I’m raising this exception, and yet back at the Juicer, rather than the exception being caught, processing moves on to the line after the one that triggers the exception further down (the creation and return of the child DataMngr). At that point, because the DataMngr encountered a bad file, I get general Python exceptions and a traceback for other errors, none of which is helpful, as this is all downstream of where the exception should have been caught.

What’s interesting is that the exception at least does start being raised. This happens as the child DataMngr class is being created and initialized, and processing absolutely stops after the raise: there are many other things that still need doing at that point, but it is abruptly interrupted and control moves up the stack, thru the broker and back to the Juicer and the try block in question. Control then moves on to the next line, but at this point I don’t have a reliable DataMngr. The class IS created and valid in that sense, but internally its members are not. For example, after validating the data store, it opens and maps the file; of course, none of that has happened, and none of the child DataMngr’s internal members that manage it are valid. But I can confirm, at least, that in this dynamically loaded module and the class created from it, the exception chain is started.

At first I thought this was a namespace issue. The DataMngr exceptions are defined in the ABC module. So, while Juicer needs to import the ABC module to know how to talk to DataMngr’s in a general sense (polymorphism), so do the modules that define the child classes, because they need the ABC class declaration, AND they need the exceptions to throw for all DataMngrs.

But I short-circuited my custom exception classes and just tried raising a general purpose Exception, of course I provide my own error message strings. But to no avail, raising a general Exception in the class created from the dynamically loaded module, does start, and the class initialization is stopped and control returns to the Juicer class, again exactly as described, but the exception is still not caught.

So that knocks out a namespace issue with respect to the custom exception classes, but it still feels like I’ve not connected something completely right in doing the dynamic module import. However, rather than just start assuming that must be it, we cannot forget, if no exceptions get thrown, the dynamically loading modules DO in fact WORK.

Each data store type has its own child DataMngr inherited from the ABC, and each has unique attributes. I have two working and a third stub, and the right methods are being called based on their type. Everything indicates that the dynamic module importing, and the registration of the handler type and the class name to instantiate, are working.

Yet, if it’s a namespace issue, how could the Juicer ever even create the correct child class?

Lastly, with respect to the dynamic module import, let me clarify. The Broker class finds any modules in a sub-directory, imports the modules in turn as follows,

spec = importlib.util.spec_from_file_location(fmod_name, f)
rmod_obj = importlib.util.module_from_spec(spec)           
sysmodules[f] = rmod_obj                                   
spec.loader.exec_module(rmod_obj)

it then looks for the module level registration function, and calls it

rfunc = getattr(rmod_obj, DEF_DMS_REG_FUNC)                
rdata = rfunc()

which returns the classname of the module’s DataMngr class, which the broker then stores and associates with this data store type for when it’s called on to provide one.
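For comparison, here is a sketch of the same load-and-register step with a guard so that a file is executed at most once, and with the sys.modules key being a module name rather than a file path. The helper name load_plugin, the "dm_plugin_" prefix, and the value "register" for DEF_DMS_REG_FUNC are assumptions made for illustration, and the demo plugin returns the class object itself rather than a class name.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path

REG_FUNC = "register"  # hypothetical value standing in for DEF_DMS_REG_FUNC


def load_plugin(path, prefix="dm_plugin_"):
    """Load one plugin file at most once and return its registration data."""
    path = Path(path)
    mod_name = prefix + path.stem          # key by module name, not file path
    module = sys.modules.get(mod_name)
    if module is None:                     # not loaded yet: execute it now
        spec = importlib.util.spec_from_file_location(mod_name, str(path))
        module = importlib.util.module_from_spec(spec)
        sys.modules[mod_name] = module
        try:
            spec.loader.exec_module(module)
        except BaseException:
            sys.modules.pop(mod_name, None)  # do not keep a half-built module
            raise
    return getattr(module, REG_FUNC)()


with tempfile.TemporaryDirectory() as tmp:
    plugin = Path(tmp) / "csv_store.py"
    plugin.write_text(
        "class CsvDataMngr:\n"
        "    pass\n"
        "\n"
        "def register():\n"
        "    return ('csv', CsvDataMngr)\n"
    )
    first = load_plugin(plugin)
    second = load_plugin(plugin)   # reuses the cached module

same_class = first[1] is second[1]
```

Because the second call finds the module in sys.modules, both calls hand back the very same class object.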

I feel like this may all be technically correct, and it IS working, but I may have, what can we call it, a “tear” maybe where I didn’t stitch something together fully correctly after doing a module import.

I don’t know, that’s just my intuition talking. I’m at a loss. Now I’ve thought about maybe changing my approach a bit. Maybe letting these lower classes return their state another way, but these dynamically loaded classes are not known at runtime, and any such approach needs to be wary of increasing the coupling between these dynamically loaded classes and the main classes. And besides, I’m so damn curious why I broke python I want to understand this. Moreover, I should be able to rely on the exception mechanism, damn it! :slight_smile:

If you’ve made it this far and have any ideas, I’m at wits end and all ears. Any thoughts appreciated. Not only solutions but anything, any way to test, or experiment that gets me closer to figuring this out is much appreciated. At this point, I’ve got a pretty verbose trace log working, so I know what’s not happening, I just can’t see why!

TIA,
Jim/Houston

I suspect the issue is that your module importing code there doesn’t check sys.modules for the module first, allowing it to be executed twice. Then you’ll have duplicate copies of the class objects hanging around, meaning the except: check doesn’t detect the other class. Instead of manually executing the module, I’d recommend instead just using importlib.import_module(), which does all the importing logic. Just pass the dotted name and it returns the module object.
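A small sketch of the failure mode described here, assuming a throwaway file dm_errors.py and a helper exec_fresh that are made up for the demonstration: executing the same source twice produces two distinct class objects with the same name, and an except clause naming one copy does not match an instance of the other.

```python
import importlib.util
import tempfile
from pathlib import Path


def exec_fresh(name, path):
    """Execute a file as a brand-new module object, ignoring sys.modules."""
    spec = importlib.util.spec_from_file_location(name, str(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "dm_errors.py"
    src.write_text("class CCustomError(Exception):\n    pass\n")
    copy_a = exec_fresh("dm_errors", src)
    copy_b = exec_fresh("dm_errors", src)

# Same name, same source, but two separate class objects.
same_class = copy_a.CCustomError is copy_b.CCustomError

try:
    try:
        raise copy_a.CCustomError("bad signature")
    except copy_b.CCustomError:
        outcome = "caught by the specific handler"
except Exception:
    outcome = "missed the specific handler, caught by Exception"
```

Note that the broad `except Exception` still catches either copy, since both derive from the one builtin Exception class; the mismatch only affects handlers that name the duplicated class.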

Thanks for the feedback.

I’ve not found any samples or examples online using importlib.import_module() but I’ll definitely look into it.

I built my approach following this page from Stack Overflow which, as you can see, delineates different approaches by Python version, at least up to 3.5+. Like I said in my OP, I tried to use the latest recommended approach, and that was the latest I found.

Do you know the history of the availability of import_module()? Is this the latest or a legacy example? Still, if it handles correctly, it handles it correctly, right?

However, I’m not sure this is the culprit. First off, the dynamic loading of the child classes only happens once, and the only module and class that even has access to these modules is the (class factory) DataMngr_broker class. It only reviews the sub-directory where child class modules are located ONCE, during its initialization; as it registers these data handlers, it only does the import for each module once; and the Juicer class creates only ONE DataMngr_broker as an internal member and does so during ITS initialization. So I don’t see this being about me loading a module more than once, but that doesn’t mean there isn’t something else at play that I don’t fully understand and that accomplishes the same thing.

The biggest concern I have is that despite originally creating custom exception classes, initial attempts to figure this out resulted in my disconnecting the use of these custom classes and resorting to using generic exceptions just passing my custom error message. Now, I may be completely goofing this up as well, but where I throw the exception in the dynamically loaded module, I simply changed,

...
else:  #throw bad sig Error
   raise CCustomError(emsg_custom_error)

to

...
else:  #throw bad sig Error
   raise Exception(emsg_custom_error)

and yet back in my top-level class (Juicer),

try:
   ...
except Exception as err:
   ...  # handle caught exception
still does nothing. Unless I’m misusing the generic Exception class directly, this does at least table any concerns for my exception classes for the moment, don’t you agree?

I’m still going to try a different import approach as it still could very well be that something hasn’t gotten stitched into place correctly with the approach I took. (Edit: Unfortunately, there is a bug in import_module() that causes it to choke on path and filenames that contain the ‘.’ period character. This I cannot abide. I’m not sure I understand the differences, however the approach I took is almost straight from the python docs.)
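For reference, importlib.import_module() takes a dotted module name, not a file path, so a period in its argument is read as a package separator. The sketch below assumes a made-up package directory dm_plugins_demo whose parent is placed on sys.path; it is only meant to show the call shape and the caching behaviour.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "dm_plugins_demo"          # hypothetical package name
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "csv_store.py").write_text("KIND = 'csv'\n")

    sys.path.insert(0, tmp)                      # parent directory must be importable
    importlib.invalidate_caches()                # the files were created at runtime
    try:
        # Dotted name: "<package>.<module>", with no ".py" and no directories.
        first = importlib.import_module("dm_plugins_demo.csv_store")
        second = importlib.import_module("dm_plugins_demo.csv_store")
    finally:
        sys.path.remove(tmp)

same_module = first is second
```

The second call returns the cached entry from sys.modules, so the module body runs only once. File names that are not valid identifiers cannot be named this way, which is where spec_from_file_location() remains the documented route.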

Thanks a bunch for the thoughts, however! Much appreciated.

Jim/Houston

I suspect the issue is that your module importing code there doesn’t check sys.modules for the module first, allowing it to be executed twice.

So, I always like an opportunity to look beyond the stuff I already know or use commonly, and this comment kept banging around in my skull, so thank you for that. I guess I knew that this sort of list had to be out there but never really had a need or occasion to look into it. As I explained in my last post, just from the design and interaction of the classes it is pretty straightforward to show that these modules are not being loaded or executed more than once, but I still wanted to take a closer, more concrete look.

I did have what I thought was a small issue that might have been a factor, and it did cause me to slightly change my importing block. When iterating thru sys.modules, the ‘key’ in the module dictionary for the three added child DataMngr modules was being reported as a full path and file name rather than just the core module name like every single other module. I adjusted my load code to fix this. Unfortunately, none of this had any effect on catching, back in the core, an exception raised in a descendant class. But it was enlightening. I was able to verify that the three descendant (child) DataMngr modules are NOT present in sys.modules before my loading, that they are present after, that they are present only once, and that all module names and paths are correct.
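A check along these lines can be scripted. The sketch below, with the made-up helper name distinct_copies, groups sys.modules entries by their `__file__` and reports any file that is present as more than one distinct module object.

```python
import sys
from collections import defaultdict


def distinct_copies(modules=None):
    """Return {file: [names]} for files loaded as more than one module object."""
    modules = sys.modules if modules is None else modules
    by_file = defaultdict(dict)                 # file -> {id(module): name}
    for name, module in list(modules.items()):
        path = getattr(module, "__file__", None)
        if path:
            # Aliases such as os.path share one object and are counted once.
            by_file[path].setdefault(id(module), name)
    return {
        path: sorted(seen.values())
        for path, seen in by_file.items()
        if len(seen) > 1
    }


for path, names in distinct_copies().items():
    print(f"{path} is loaded as separate modules: {names}")
```

An empty result only says that sys.modules holds no duplicates. A module executed by hand and never stored in sys.modules would not show up here.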

It was still a worthy journey forward. Thanks,

Jim/Houston

Just to add more information on this. In general, inside the Juicer class (review my janky diagram here), and indeed as part of its own construction and initialization, my try block essentially looks like,

try:
   ...
   create (class factory) Datamngr_broker
   call (class factory) Datamngr_broker to create a child DataMngr (throw an exception)
   continue to finalize Juicer initialization:  init_juicer()

except (Exception) as err:
   print("EXCEPTION Caught!")
   raise (to let it propagate upwards)

OK, just to review, this ALL WORKS when everything is correct. I’m purposely working out the exception framework, so I’m purposely asking the Juicer to use a bad file, so that when the descendant/child Datamngr that handles that particular type of store tries to initialize and properly determines there’s a problem, it throws an exception, stops processing, and passes control back up the chain.

I wanted to make sure the exception mechanism was working properly back up at the Juicer level. So when the exception in the child Datamngr is NOT caught (the reason for my OP), and control returns to this try block inside Juicer, like I said in my OP, I can see in my logs, that it just continues on to finalize the Juicer initialization (init_juicer()) as if nothing is amiss and certainly doesn’t catch the exception.

As a test, a confirmation, an attempt to find sanity, whatever, I interjected a line inside init_juicer() to just short-circuit any further work, and just manually throw an exception,

raise Exception('DEBUG: EXCEPTION, Raising General Exception')

Just for the record: when raised in the try block in question, on the line right after the line whose exception does NOT get caught, this exception gets caught as expected. The message is printed once, the exception is re-raised, moves up the chain, and gets caught again at the main module/script level, where a message is printed again. So, from the Juicer object upward, things are working as expected.

I then redid this test, again interjecting a short-circuiting, intentional exception as the Datamngr_broker is being created (the line BEFORE calling on the class factory to CREATE a child Datamngr) and just as expected, the exception is caught, re-raised, and sent up the chain.

So the Datamngr_broker (class factory) class itself, another class on this side of the dynamic module loading, and an internal MEMBER of the Juicer object, works as expected and exceptions are recognized and caught as expected.

But exceptions, even the BUILTIN general exception class, when thrown in the dynamically loaded modules are just not caught.
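One thing worth ruling out, offered as a hypothesis and not a diagnosis: in Python, an exception that stops one function and lets the caller continue on its next line has been handled by some except clause in between. The sketch below uses invented names (HalfBuiltDataMngr, juicer_init) to show how a handler that logs without re-raising, anywhere between the raise and the Juicer, reproduces these symptoms.

```python
import logging

log = logging.getLogger("juicer_demo")


class HalfBuiltDataMngr:
    """Hypothetical child class whose __init__ hides its own failure."""

    def __init__(self, path):
        self.mapped = None
        try:
            self._validate(path)
            self.mapped = f"mapping of {path}"  # skipped when _validate raises
        except Exception as err:
            # Logging without re-raising ends the exception right here.
            log.debug("init failed: %s", err)

    def _validate(self, path):
        raise Exception("bad signature")


def juicer_init(path):
    """Stand-in for the Juicer's try block; returns what it observed."""
    try:
        mngr = HalfBuiltDataMngr(path)
        return ("no exception seen", mngr.mapped)
    except Exception as err:
        return ("caught", str(err))


result = juicer_init("store.bin")
```

The outer try block never sees the error and receives an object whose members were never set. Searching the call path between the raise and the Juicer for except clauses that lack a bare `raise`, or temporarily logging `traceback.format_stack()` at the raise site, would show which frames sit in between.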

Jim/Houston

It might be easier to dodge a lot of the complexity here by avoiding the import system at all. Use imports for your normally-loaded modules, and if you need something different, bypass it and just use exec. Something like this:

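import blah, blah, blah, blah  # placeholder for your normal imports
with open("some-file.py") as f:
    module = {}
    # exec() needs source text or a code object, not the open file itself
    exec(f.read(), module)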

Inside the module, imports should behave completely normally, always and only fetching from the perfectly regular import system, using its cache correctly, etc, etc. This one specific file is then handled differently.

This creates a clear and simple hierarchy: the dynamically-loaded module (or modules, if you do more than one in this style) may depend on normal imports, but not the other way around.
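A slightly fuller sketch of that idea, assuming a helper named load_namespace and a throwaway plugin file created just for the demonstration. Compiling with the real file name first is optional but keeps tracebacks pointing at the plugin file.

```python
import tempfile
from pathlib import Path


def load_namespace(path):
    """Run a Python file with exec() and return its globals as a dict."""
    path = Path(path)
    source = path.read_text()
    namespace = {"__name__": path.stem, "__file__": str(path)}
    # compile() with the file name makes tracebacks show the plugin's lines.
    exec(compile(source, str(path), "exec"), namespace)
    return namespace


with tempfile.TemporaryDirectory() as tmp:
    plugin = Path(tmp) / "some-file.py"       # a name that is not importable
    plugin.write_text(
        "import json\n"                       # a normal import inside the plugin
        "\n"
        "def register():\n"
        "    return json.dumps({'type': 'csv'})\n"
    )
    ns = load_namespace(plugin)
    reg = ns["register"]()
```

The returned dict plays the role of the module: the registration function is looked up with `ns["register"]` instead of `getattr()`, and nothing is added to sys.modules.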