Hi folks, first time on the forums, but I’ve hit my first issue in four years that I just cannot see past, probably because I’m doing something less common and using some esoteric services I don’t fully understand.
Quick background (mainly so you can gauge how to communicate with me). It’s an easy skip if you’d rather; I’ll keep it short and to one paragraph. I’ve been in the software industry since college in the early ’90s. Honestly, I never really saw Python as a serious language, being an old C/C++/Pascal programmer. I started working contract after college. Houston has lots of opportunity for small tech solution providers serving big firms and industries (oil and gas, aerospace, etc.), and I got into oil and gas. But four years ago two of my kids went off to military academies and started using Python. I took that time to learn the language so I could follow along, discuss, and offer thoughts to my girls if they needed it. And I must admit I did fall in love, mostly because it’s fast and easy to prototype and/or get a script working just to get something done. For example, I stopped even trying to learn the clusterfu%(*k that is Microsoft’s PowerShell; Python fits the bill as a desktop scripting language that empowers users to write for their own immediate needs. I’ve got tons of custom scripts now to do the things I want done on my desk, rather than .bat or .ps1 scripts!
Lastly, I don’t know everything, and I certainly don’t know everything about Python, but I wasn’t born yesterday either. So if I don’t specify everything exactly, unless it’s a critical part of the problem, you should probably not start by assuming I missed it. I’m not going to post whole chunks of code; in fact, if I post any it will probably be analogous or pseudo-code. I don’t usually need actual code help. Where I tend to need help is with things that are either far more technical or more about principles. Also, this goes beyond a short code snippet or two, and I don’t really expect anyone to spend that amount of time knee-deep in my code, but my spidey sense tells me I’m bending something fundamental about Python that I don’t fully understand, so this is mostly going to be a discussion of principles. Many TIA if you decide to follow along.
Now don’t kill me, it’s been a long time since I’ve actually used UML, and since this issue may involve both the Python import machinery and class interaction, I did my best to hack together something UML-like to help visualize my entities. You can see that diagram here, lj_diagram
Description of the project:
I’ve got a main console script that handles the command line, switches, etc., and creates an instance of a core class; let’s call it Juicer. Juicer needs to interface with data stores of different types and does so through a private object we’ll call a DataMngr. But instead of the Juicer knowing beforehand which DataMngrs are available and which one to create, I’m using a class factory pattern, call it the DataMngr_broker, that figures out which data manager to create and hand to the Juicer. I wanted it to be possible to add DataMngrs for specific data stores later, so I’m using importlib to dynamically load any modules found in a sub-directory, call a registration function in each, and register the type of store along with the name of the class that should be instantiated for that type. The DataMngrs all inherit from an abstract base class, and since the Juicer cannot know which DataMngrs are available, it only imports the ABC module. My spidey sense tells me my problem has something to do with the technical side of dynamically importing these modules, but I’ll get to the problem in a moment.
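To make the structure concrete, here is a minimal single-file sketch of the pattern, not the actual code. Only `Juicer`, `DataMngr`, and `DataMngr_broker` come from the description above; `DataMngrError`, `MemoryDataMngr`, the method names, and the `"memory"` store type are hypothetical, and the dynamic import step is left out so that only the factory and the ABC are shown.

```python
from abc import ABC, abstractmethod


class DataMngrError(Exception):
    """Base exception shared by all data managers (hypothetical name)."""


class DataMngr(ABC):
    """Abstract interface; the only data-manager type the Juicer knows about."""

    def __init__(self, path: str) -> None:
        self.path = path

    @abstractmethod
    def read(self) -> str:
        """Return something from the data store."""


class DataMngr_broker:
    """Class factory: maps a store type to a registered DataMngr subclass."""

    def __init__(self) -> None:
        self._registry: dict[str, type[DataMngr]] = {}

    def register(self, store_type: str, cls: type[DataMngr]) -> None:
        self._registry[store_type] = cls

    def create(self, store_type: str, path: str) -> DataMngr:
        try:
            cls = self._registry[store_type]
        except KeyError:
            raise DataMngrError(f"no DataMngr registered for {store_type!r}") from None
        # Any exception raised by cls.__init__ propagates to the caller.
        return cls(path)


class MemoryDataMngr(DataMngr):
    """Hypothetical concrete manager standing in for a plug-in class."""

    def read(self) -> str:
        return f"contents of {self.path}"


class Juicer:
    """Core class: asks the broker for a DataMngr and uses it polymorphically."""

    def __init__(self, broker: DataMngr_broker, store_type: str, path: str) -> None:
        self._data_mngr = broker.create(store_type, path)

    def run(self) -> str:
        return self._data_mngr.read()


broker = DataMngr_broker()
broker.register("memory", MemoryDataMngr)
juicer = Juicer(broker, "memory", "demo.dat")
result = juicer.run()
```

In the real project the `register` call would be driven by the modules discovered in the sub-directory rather than written out by hand.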
It’s important to know that this works! In fact, it works great. I can write new modules that define new DataMngr child classes derived from the ABC, drop them into the sub-directory, and the Juicer can access and interact with that data store perfectly. I researched how to use importlib. I’m on Python 3.11, and with Python changing so fast and so many legacy examples and snippets scattered across the interwebs, I tried hard to use importlib according to the latest recommendations.
The Problem:
Some context first: before building the dynamic loading, I got everything working against one specific data store type, so most of my exception framework was already in place and working fine. Then, with the new dynamic loading/importing up and running, I was doing some cleanup and refactoring of my exception classes (consistent error messages, deciding at what level each sort of exception is handled, etc.) and testing the exception handling of the DataMngrs. That’s when I noticed that if the Juicer tries to use a store that the DataMngr cannot initialize correctly for, say a corrupt file or a bad signature, the exception is not propagating to, and being caught by, the try block in the Juicer where the DataMngr_broker is created and then called on to create the correct DataMngr child class.
This is unequivocal. I can see in my logs precisely where I’m raising this exception, and yet back at the Juicer, rather than the exception being caught, processing moves on to the line after the one that triggered the exception further down (the creation and return of the child DataMngr). At that point, because the DataMngr encountered a bad file, I get general Python exceptions and tracebacks for other errors, none of which are helpful, since they are all downstream of where the exception should have been caught.
What’s interesting is that the exception at least does start being raised. This happens while the child DataMngr class is being created and initialized, and processing absolutely stops after the raise: there are many other things that still need doing at that point, but initialization is abruptly interrupted and control moves up the stack, through the broker and back to the Juicer and the try block in question. Control then moves on to the next line, but at this point I don’t have a reliable DataMngr. The object IS created and valid in that sense, but internally its members are not. For example, after validating the data store, it is supposed to open and map the file. Of course none of that has happened, and none of the child DataMngr’s internal members that manage it are valid. But I can confirm, at least, that in this dynamically loaded module and the class created from it, the exception chain is started.
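For reference, this is the behaviour I expect, reduced to a minimal sketch. Every name here (`BadStoreError`, `FileDataMngr`, `create_data_mngr`, the `.ok` suffix check) is hypothetical and stands in for the real validation. If `__init__` raises, the exception should travel through the factory to the caller’s try block, the line after the creation call should never run, and no object should be bound at all.

```python
class BadStoreError(Exception):
    """Hypothetical stand-in for the DataMngr exceptions."""


class FileDataMngr:
    """Hypothetical child class whose initialization can fail."""

    def __init__(self, path: str) -> None:
        self.mapped = None
        # Stand-in for signature/corruption checks on the data store.
        if not path.endswith(".ok"):
            raise BadStoreError(f"bad signature: {path}")
        # Not reached when validation fails.
        self.mapped = f"mapped:{path}"


def create_data_mngr(path: str) -> FileDataMngr:
    """Stand-in for the broker: no try/except, so errors pass straight through."""
    return FileDataMngr(path)


events = []
mngr = None
try:
    mngr = create_data_mngr("corrupt.bin")
    events.append("next line reached")  # should be skipped on failure
except BadStoreError as exc:
    events.append(f"caught: {exc}")
```

In this sketch `mngr` is never rebound when initialization fails, which is exactly what I am not seeing in the real code.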
At first I thought this was a namespace issue. The DataMngr exceptions are defined in the ABC module. So, while the Juicer needs to import the ABC module to know how to talk to DataMngrs in a general sense (polymorphism), so do the modules that define the child classes, because they need the ABC class declaration AND the exceptions that all DataMngrs throw.
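To spell out what I mean by a namespace issue, here is a small sketch under assumed names (`dm_abc.py`, `DataMngrError`, and the two module names are hypothetical). If the same source file ends up loaded twice under different module names, each load produces its own, distinct exception class, so an `except` clause naming one of them will not catch an instance of the other.

```python
import importlib.util
import tempfile
from pathlib import Path

# Hypothetical stand-in for the ABC module that defines the exceptions.
SOURCE = "class DataMngrError(Exception):\n    pass\n"


def load_as(name: str, path: Path):
    """Execute the file at `path` as a fresh module object called `name`."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "dm_abc.py"
    path.write_text(SOURCE)
    first = load_as("dm_abc_first", path)
    second = load_as("dm_abc_second", path)

# Same source text, but two separate class objects.
same_class = first.DataMngrError is second.DataMngrError

caught_by = None
try:
    try:
        raise second.DataMngrError("bad signature")
    except first.DataMngrError:
        caught_by = "first"  # not reached: the classes are unrelated
except Exception:
    caught_by = "generic"  # a plain Exception handler still catches it
```

Note that even in this scenario a handler for plain `Exception` still catches the error, which matters for the test described next.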
But I short-circuited my custom exception classes and just tried raising a general-purpose Exception (with my own error message string, of course), to no avail. The general Exception raised in the class created from the dynamically loaded module does get raised, class initialization stops, and control returns to the Juicer class, exactly as described above, but the exception is still not caught.
So that knocks out a namespace issue with respect to the custom exception classes, but it still feels like I’ve not connected something completely right in doing the dynamic module import. However, rather than just assuming that must be it, we cannot forget that if no exceptions get thrown, the dynamically loaded modules DO in fact WORK.
Each data store type has its own child DataMngr inherited from the ABC, and they each have unique attributes. I have two working and a third stubbed out, and the right methods are being called based on their type. Everything indicates that the dynamic module importing, and the registration of the handler type and the class name to instantiate, are working.
Yet, if it’s a namespace issue, how could the Juicer ever even create the correct child class?
Lastly, with respect to the dynamic module import, let me clarify. The Broker class finds any modules in a sub-directory, imports the modules in turn as follows,
    spec = importlib.util.spec_from_file_location(fmod_name, f)
    rmod_obj = importlib.util.module_from_spec(spec)
    sysmodules[f] = rmod_obj
    spec.loader.exec_module(rmod_obj)
it then looks for the module level registration function, and calls it
    rfunc = getattr(rmod_obj, DEF_DMS_REG_FUNC)
    rdata = rfunc()
which returns the classname of the module’s DataMngr class, which the broker then stores and associates with this data store type for when it’s called on to provide one.
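Here is a self-contained sketch of that load-and-register sequence, under several assumptions: the value of `DEF_DMS_REG_FUNC`, the plug-in file `csv_store.py`, the `CsvDataMngr` class, and the `(store_type, class_name)` return shape are all hypothetical. It also keys `sys.modules` by module name, as the recipe in the importlib documentation does, which may differ from the snippet above.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path

DEF_DMS_REG_FUNC = "register_dms"  # hypothetical value of the constant

# Hypothetical plug-in module, written to disk so it can be imported by path.
PLUGIN_SOURCE = '''
class CsvDataMngr:
    def __init__(self, path):
        if not path.endswith(".csv"):
            raise ValueError(f"bad signature: {path}")
        self.path = path


def register_dms():
    # Assumed return shape: (store type, class name).
    return ("csv", "CsvDataMngr")
'''


def load_plugins(plugin_dir):
    """Import every .py file in plugin_dir and collect the registered classes."""
    registry = {}
    for f in sorted(Path(plugin_dir).glob("*.py")):
        fmod_name = f.stem
        spec = importlib.util.spec_from_file_location(fmod_name, f)
        rmod_obj = importlib.util.module_from_spec(spec)
        sys.modules[fmod_name] = rmod_obj  # keyed by module name
        spec.loader.exec_module(rmod_obj)
        rfunc = getattr(rmod_obj, DEF_DMS_REG_FUNC)
        store_type, class_name = rfunc()
        registry[store_type] = getattr(rmod_obj, class_name)
    return registry


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "csv_store.py").write_text(PLUGIN_SOURCE)
    registry = load_plugins(tmp)

good = registry["csv"]("data.csv")

try:
    registry["csv"]("corrupt.bin")
    outcome = "not raised"
except ValueError as exc:
    outcome = f"caught: {exc}"
```

In a sketch this small, an exception raised in the dynamically loaded class’s `__init__` reaches the caller’s `except` as usual, so it can serve as a baseline to compare the real code against.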
I feel like this may all be technically correct, and it IS working, but I may have, what can we call it, a “tear” maybe where I didn’t stitch something together fully correctly after doing a module import.
I don’t know, that’s just my intuition talking. I’m at a loss. I’ve thought about changing my approach a bit, maybe letting these lower classes report their state another way, but these dynamically loaded classes are not known until runtime, and any such approach needs to be wary of increasing the coupling between them and the main classes. And besides, I’m so damn curious about how I broke Python that I want to understand this. Moreover, I should be able to rely on the exception mechanism, damn it!
If you’ve made it this far and have any ideas, I’m at my wits’ end and all ears. Any thoughts are appreciated: not only solutions, but anything, any way to test or experiment that gets me closer to figuring this out. At this point I’ve got a pretty verbose trace log working, so I know what’s not happening; I just can’t see why!
TIA,
Jim/Houston