What is the best way to access an object created in module a inside a function imported from module b?

Slurms_MacKenzie · September 12, 2023, 2:29pm

Hello all,

A bit of a funky one. Imagine I have a module b, in which a function func is defined that references a variable var not defined in b. I then import func into a module a - where var is defined - and run it. The interpreter complains that var is not defined.

I’ve seen it suggested elsewhere that this would be poor practice anyway, as it means b relies on the existence of something external, and that a better way would be to pass var into func when it’s called from a.

One problem with that approach, as far as I see it, is that if func contains other nested function calls that also need access to the content of var, it has to be written in such a way that it passes it through to those as well, and they need to pass it into any nested calls they might contain (should there be a need).

This makes b’s code more confusing than it needs to be, as one has to constantly follow var as it’s passed through. It would be simpler if var could be defined in a and then anything called from within a could read its value without requiring it to be passed in.

My programming knowledge is still pretty shallow, but I’m guessing it’s something to do with the fact that there is a separate name-space for each module, so when a function defined in b is called from a, names are still resolved using b’s name-space - not a’s (feel free to correct if this isn’t the case).

The obvious answer is to define func in a and do away with b altogether, but for my purpose a isn’t really another module but an interactive interpreter session that I’m using to test b, and I can’t really define var in b as var’s contents are created at runtime.

A couple of examples:

content of b.py

def func():
    print(var)

content of a.py

from b import func

var = 'foo'
func()

The interpreter complains: NameError: name 'var' is not defined
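A minimal single-file sketch of the same failure, for anyone who wants to reproduce it without creating b.py. It assumes a stand-in module built in memory with types.ModuleType; the name b_demo is hypothetical and not part of the original code.

```python
import types

# Build a stand-in for b.py in memory (hypothetical name "b_demo").
b_demo = types.ModuleType("b_demo")
exec("def func():\n    print(var)\n", b_demo.__dict__)

func = b_demo.func  # plays the role of: from b import func
var = "foo"         # defined in *this* namespace, not in b_demo's

try:
    func()
    outcome = "no error"
except NameError as exc:
    outcome = f"NameError: {exc}"

print(outcome)

# A function resolves global names in the module that defined it,
# not in the module (or interpreter session) that calls it.
print(func.__globals__ is b_demo.__dict__)
```

The last line shows that func carries a reference to the global namespace of the module it was defined in, which is why defining var in the caller has no effect.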

It can be achieved as follows but I think it makes b.py kind of messy…

content of b.py

def funcOne(arg):
    funcTwo(arg)

def funcTwo(arg):
    funcThree(arg)

def funcThree(arg):
    print(arg)

content of a.py

from b import funcOne
var = 'foo'
funcOne(var)

Is this just how a module has to be written if it’s going to have its contents imported and used elsewhere?

jamestwebber · September 12, 2023, 2:47pm

Your second example is way more complicated than the first version, for no clear reason. Presumably you’re doing a lot more than just printing the value, but it’s hard to make suggestions.

I think many people would say the opposite: this makes the code far more readable, because using global variables quickly gets confusing and hard to follow. You are experiencing this right now as you try to get it to work the way you expected.

Seeing the inputs to a function in the signature makes the code much easier to understand in isolation, and it’s more reusable.

Slurms_MacKenzie · September 12, 2023, 3:22pm

It’s more of an analogy of what the actual code does, but I thought it better to make it more straightforward with a simple print statement (obviously the actual thing doesn’t contain an elaborate series of nested functions for no reason).

I guess you’re right. Personally, I think the code looks more confusing when it’s essentially puffed up by a variable that gets passed around a lot, almost as if the ‘volume’ of code has increased but it’s not doing anything more than if a global variable were used. But on the other hand, I suppose that as the code gets more complex, it’s easier to come back to it after a while, or for someone other than its author to pick it up, if it’s done more explicitly like in the second example.

jamestwebber · September 12, 2023, 3:31pm

Is the value of var changing over time, or is it a constant that you want to give a name so you’re not rewriting it all the time?

If it’s changing over time, passing the value into the function is far easier to maintain, and will avoid a lot of tricky bugs. Otherwise your functions have a hard-to-see dependency on the order in which they are run, and moving things around can break things in a way that is hard to track.
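A small sketch of that hidden dependency, under the assumption of a made-up example (the names scale, scaled_global and scaled_explicit are hypothetical, not from the original code):

```python
# Version 1: the function reads a module-level variable.
scale = 2

def scaled_global(x):
    return x * scale  # uses whatever "scale" happens to be at call time

first = scaled_global(10)
scale = 3                    # some other code rebinds the variable
second = scaled_global(10)   # same call, different result

# Version 2: the dependency is visible in the signature.
def scaled_explicit(x, scale):
    return x * scale

third = scaled_explicit(10, 2)
fourth = scaled_explicit(10, 2)  # same inputs, same result

print(first, second, third, fourth)
```

In the first version the result of an identical call depends on what ran in between; in the second, everything the function depends on is in its argument list.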

If it’s a constant, it’s more reasonable to define it at the top level. It’s common to give such values an uppercase name like VAR to signal that they are constants. But it’ll still be at the scope of a single module, and only available if you import the module and use a.VAR to refer to it [1].

But that solution doesn’t help you here, if you are trying to define var in the interpreter and then execute func.

[1] In your example you’d have to deal with a circular import, but that’s fixable.

kknechtel · September 13, 2023, 9:49am

In general, you should pass functions the information they need, and get information back via what they return, yes.

But to answer as asked (and please don’t do this):

Each module’s code has its own global namespace specifically for that code. Those globals are reflected as the attributes of the module object, i.e., what you get using the . notation.
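A short sketch of that relationship, assuming an in-memory stand-in for b.py (the name b_demo is hypothetical, and func returns the value rather than printing it so the result can be inspected). It illustrates the mechanism only; it is not a recommendation to configure modules this way.

```python
import types

mod = types.ModuleType("b_demo")  # hypothetical stand-in for b.py
exec("def func():\n    return var\n", mod.__dict__)

# The module's attributes and its functions' globals are the same dict.
same_namespace = vars(mod) is mod.func.__globals__

# Setting an attribute on the module object therefore creates a global
# that func can see, even though it is set from outside the module.
mod.var = "foo"
result = mod.func()

print(same_namespace, result)
```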

Modules are allowed to import each other. The problems occur when top-level code is in a dependency loop, or when you try to use the from ... import syntax to import attributes directly. The idea is that everything at top level in a .py file is executable code, that is executed top to bottom. That includes things like def statements (when they run, it creates the function object) and class statements (when they run, it creates the class)… and import statements (when import x runs, it looks for a cached module object, or else starts loading that module, and then sets the name x to the result).

A loop between two import statements works fine, because of how the loading process works (i.e. when there is not already something cached): first, an “empty” module object is stored in sys.modules, and then the code is executed, using the attributes of that object as the global namespace. So, as long as the importing code doesn’t care about the fact that the attributes haven’t been set yet, there is no problem.

This means we can do:

a.py

import b

var = 'foo'
def func():
    b.func()

b.py

import a

def func():
    print(a.var)

Creating each function doesn’t actually require looking up any attributes - it only requires generating the code that will do so, when the function is called later. So in b’s global namespace, a will mean the a module. When b is being loaded, nothing from a is used yet; but later, the func can look up a.var. Similarly, in a’s global namespace, b will mean the b module. While a is being loaded, nothing from b is used yet; but when a’s func is called, it can look up b’s func.
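The mutual import described above can be tried out in one script. This is a sketch under a few assumptions: the two files are written to a temporary directory, the module names cyc_a and cyc_b are hypothetical (chosen to avoid clashing with real modules), and the functions return the value instead of printing it.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical module sources mirroring a.py and b.py above.
A_SOURCE = textwrap.dedent("""
    import cyc_b

    var = 'foo'

    def func():
        return cyc_b.func()
""")
B_SOURCE = textwrap.dedent("""
    import cyc_a

    def func():
        return cyc_a.var
""")

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "cyc_a.py").write_text(A_SOURCE)
    Path(tmp, "cyc_b.py").write_text(B_SOURCE)
    sys.path.insert(0, tmp)
    try:
        # Loading cyc_a starts loading cyc_b, whose "import cyc_a" finds
        # the partially initialised cyc_a already cached in sys.modules.
        cyc_a = importlib.import_module("cyc_a")
        result = cyc_a.func()
    finally:
        sys.path.remove(tmp)

print(result)
```

The lookup of cyc_a.var only happens when func is called, by which time both modules have finished loading.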

Slurms_MacKenzie · September 13, 2023, 1:44pm

The value changes. I’ll post the actual code below to make the purpose of what I’m trying to do clearer, but b is basically just a module that contains a class and a function that can be used to create nodes of a tree-like structure, and add nodes to the tree, respectively. That’s why my a is an interactive interpreter, so I can just play around, modify b a bit, restart the interpreter and see how she goes, etc.

I’d just like to add that I’m not proud of this code and it’s really just something I was playing around with. I’m sure there’s a less elaborate way to achieve something similar.

b.py

class node():
    def __init__(self, value, parent=None, lchild=None, rchild=None):
        self._value = value
        self._parent = parent
        self._lchild = lchild
        self._rchild = rchild

    def setLchild(self, value, nodes):
        n = node(value, nodes.index(self))
        nodes.append(n)
        self._lchild = len(nodes) - 1

    def setRchild(self, value, nodes):
        n = node(value, nodes.index(self))
        nodes.append(n)
        self._rchild = len(nodes) - 1

    def getLchild(self, nodes):
        if self._lchild is None:
            return False
        return nodes[self._lchild]

    def getRchild(self, nodes):
        if self._rchild is None:
            return False
        return nodes[self._rchild]

    def getParent(self, nodes):
        if self._parent is None:
            return False
        return nodes[self._parent]

    def getValue(self):
        return self._value

def insert(x, root, nodes):
    if x < root.getValue():
        if root.getLchild(nodes):
            insert(x, root.getLchild(nodes), nodes)
        else:
            root.setLchild(x, nodes)
    elif x > root.getValue():
        if root.getRchild(nodes):
            insert(x, root.getRchild(nodes), nodes)
        else:
            root.setRchild(x, nodes)

The node objects track their left and right child nodes, and their parent node. But they do this by storing the index at which each one appears within a list of all the nodes. Each time a new node is created, the node is added to the list, and the attributes of itself and its parent that record their relationship are set to the appropriate index. (Hope that makes sense.)

Wherever you see the variable nodes, this is just the list of all nodes, which the user is expected to pass in, being handed around. Because it’s a fundamental component of the whole thing, I was trying to have it be a global.

jamestwebber · September 13, 2023, 1:54pm

A point in favor of passing it in: it makes it possible to play around with two different trees at the same time, and compare the results. If it’s a global you’re stuck with one tree.

Another way to achieve this would be to just have nodes be an attribute of your class. Then all of your methods could access self.nodes.
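One possible shape for that, as a rough sketch rather than a drop-in replacement for the code above. It assumes each node keeps a reference to the shared list; the class name Node, the index attribute and the _add_child helper are hypothetical.

```python
class Node:
    """Sketch: every node holds a reference to its tree's list of nodes."""

    def __init__(self, value, nodes, parent=None):
        self.value = value
        self.nodes = nodes      # shared list, the same object for the whole tree
        self.parent = parent    # index into self.nodes, or None for the root
        self.lchild = None
        self.rchild = None
        nodes.append(self)
        self.index = len(nodes) - 1

    def _add_child(self, value):
        # Create a child in the same list and return its index.
        return Node(value, self.nodes, parent=self.index).index

    def insert(self, x):
        if x < self.value:
            if self.lchild is None:
                self.lchild = self._add_child(x)
            else:
                self.nodes[self.lchild].insert(x)
        elif x > self.value:
            if self.rchild is None:
                self.rchild = self._add_child(x)
            else:
                self.nodes[self.rchild].insert(x)
        # Equal values are ignored, as in the original insert().


# Two independent trees can coexist, each with its own list.
tree_one = Node(5, [])
for x in (3, 8, 1):
    tree_one.insert(x)

tree_two = Node(10, [])
tree_two.insert(20)

values_one = [n.value for n in tree_one.nodes]
values_two = [n.value for n in tree_two.nodes]
print(values_one, values_two)
```

No global is needed, and nothing has to be threaded through the method calls because each node already knows which list it belongs to.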

Slurms_MacKenzie · September 13, 2023, 2:01pm

Thanks, this is really handy info. One question I do have:

When you use the term ‘loaded’, what does this mean? Is it when the .py file is being parsed and the function objects, etc. are being created in memory? If so, I understand what you are saying is that a function containing a reference to an as-yet-unavailable resource can have its object created without problem, as that resource isn’t actually accessed when the def statement is executed, only when the function itself is called.

Which I guess is why function definitions that contain calls to other functions and variables that are defined lower down in the .py can be created without issue?

Slurms_MacKenzie · September 13, 2023, 2:09pm

A good point!

This. I think this is the right way to do it. It would avoid having a key list floating around that’s not really coupled to the class in any way. Thanks for your help and ideas

kknechtel · September 13, 2023, 9:22pm

Yes, you understand it quite well. It’s the same reason that only SyntaxError ever gets raised “ahead of time”; even ImportError requires actually running the top-level code in the file (which happens automatically and immediately).
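A brief sketch of both halves of that statement, using made-up names (early, late, missing_name and broken are hypothetical):

```python
# A def statement only compiles the body; the names inside it are
# looked up when the function is called.
def early():
    return late() + missing_name  # neither name exists yet

def late():
    return 1

try:
    early()
    call_outcome = "no error"
except NameError as exc:
    call_outcome = f"NameError at call time: {exc}"

missing_name = 41
value = early()  # both names now resolve

# By contrast, a syntax error is reported before any of the code runs.
try:
    compile("def broken(:\n    pass\n", "<example>", "exec")
    compile_outcome = "compiled"
except SyntaxError:
    compile_outcome = "SyntaxError at compile time"

print(call_outcome)
print(value)
print(compile_outcome)
```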