Calling instance function from name stored in instance dictionary requires extra argument. Is there a workaround?

Left_Guard · November 10, 2022, 5:06pm

I’m trying to validate data whose labels are used to varying degrees by various objects. I’ve been writing a class to do the validating, storing references to the validating functions in a dictionary so that labels needing the same kind of validation can share one function. However, calling a function via a lookup in the dictionary requires ‘self’ to be supplied as the first argument.
This can be overcome by decorating each function in the class with @staticmethod and removing the ‘self’ argument from its declaration, but I’m wondering if there’s a workaround that would simply let me call the relevant function as if I were calling it normally from within the instance?

This isn’t the code I’ve been writing, but is a shorter version to demonstrate the dynamic.
It uses a function to report the number of arguments each function requires, depending on how the function is looked up, then calls each function both ways.

import inspect


class FunctionRunner:
    """Class to test running functions from a dictionary."""

    # The functions
    def _func1(self, value: str) -> None:
        print(f"Called method _func1 with value {value}.")

    def _func2(self, value: str) -> None:
        print(f"Called method _func2 with value {value}.")

    def _func3(self, value: str) -> None:
        print(f"Called method _func3 with value {value}.")

    def _func4(self, value: str) -> None:
        print(f"Called method _func4 with value {value}.")

    # The dictionary containing references to the functions
    _func_dictionary = {1: _func1, 2: _func2, 3: _func3, 4: _func4}

    # A function to look up a reference and run a function
    def run_function(self, func_id: int, value: str) -> None:
        # Looking for a workaround to enable me to remove 'self' from this function call
        self._func_dictionary[func_id](self, value)

    def run_directly(self) -> None:
        self._func1("10")
        self._func2("20")
        self._func3("30")
        self._func4("40")

    def count_positional_args_required(self, func):
        signature = inspect.signature(func)
        empty = inspect.Parameter.empty
        return sum(param.default is empty for param in signature.parameters.values())

    def inspect_functions(self):
        for idx in range(1, 5):
            print(self.count_positional_args_required(self._func_dictionary[idx]))
        print()
        print(self.count_positional_args_required(self._func1))
        print(self.count_positional_args_required(self._func2))
        print(self.count_positional_args_required(self._func3))
        print(self.count_positional_args_required(self._func4))


def main():
    FR = FunctionRunner()
    FR.inspect_functions()
    print()
    FR.run_function(1, "10")
    FR.run_function(2, "20")
    FR.run_function(3, "30")
    FR.run_function(4, "40")
    print()
    FR.run_directly()


if "__main__" == __name__:
    main()

Thanks in advance for any help, folks!

barry-scott · November 10, 2022, 5:30pm

You do not need to pass self; Python will provide it.

Rosuav · November 10, 2022, 5:42pm

This creates a dictionary that has no way of knowing which instance it applies to - it’s attached to the entire class and can only be a part of the class, not the instance. So there are three ways to approach this:

1. On __init__, build this dictionary using self._func1 etc, and attach it to self. This makes them all into bound methods for that particular instance.
2. Make them all class methods, giving them a reference to the class but not a specific instance.
3. Pull them out of the class altogether, if they don’t even need the class.

There’s no particular reason to have them in the class if they don’t need to know about the class/instance. On the other hand, if they DO need the instance, there’s no way around the extra parameter; the dictionary isn’t attached to self, so you have to identify it that way.
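
To make the distinction concrete, here is a minimal sketch of what the class-level dictionary actually holds. The class `Demo` and its names are made up for illustration; the point is only that a function stored in the class body stays a plain function, while the same function looked up through an instance comes back bound.

```python
class Demo:
    def _func1(self, value):
        return f"_func1 got {value}"

    # Built while the class body runs: no instance exists yet,
    # so the dictionary stores the plain function object.
    table = {1: _func1}


d = Demo()

# Looked up as an attribute of the instance, it is a bound method.
bound = d._func1
print(type(bound).__name__)   # method
print(bound("10"))

# Looked up through the dictionary, it is still a plain function,
# so the instance has to be passed explicitly.
plain = d.table[1]
print(type(plain).__name__)   # function
print(plain(d, "10"))
```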

But depending on what your function IDs are, a neater option might be to make use of naming instead of a separate dictionary. You’ve made the ID 1 map to self._func1, and since you have a leading underscore on that, I’m going to assume that it’s private and the actual name is immaterial. Then, here’s a way to write run_function more simply:

def run_function(self, func_id, value):
    getattr(self, "_func%d" % func_id)(value)

Voila! No dictionary needed - other than the one the class already has. Binding to self happens automatically.

Left_Guard · November 10, 2022, 6:30pm

Thanks, @Rosuav.
What I’m trying to do is achieve option #3 for the various classes I’m needing to write, in order to remove the need for code duplication.
I’d done something like option #2 by using the @staticmethod decorator, but in the end I’ve gone for your option #1, which does the job perfectly. I’d just hoped there was a syntax that would enable me to attach the returned function to the instance, to make it a call to an instance function without having to use init, but never mind.
The actual dictionary of attributes was this one (before I put it in the init method and added ‘self.’ before each entry’s value). The objects the attributes are associated with each use some combination of them rather than all of them, so it’s easier to include a list of permitted attributes in each class and have them call a central class full of validation methods, which validates whichever ones they pass to it via the validate function.

    _txc_attributes = {
        "BaselineVersion": _positive_integer,
        "ChangesSince": _datetime,
        "CodeSpace": _str,
        "CreationDateTime": _datetime,
        "DataRightRef": _str,
        "DataSource": _str,
        "delta": _bool,
        "FileName": _str,
        "GridType": _location_grid_type_enumeration,
        "id": _str,
        "layer": _positive_integer,
        "LocationSystem": _location_system_enumeration,
        "MappingSystem": _mapping_system_enumeration,
        "Modification": _modification_enumeration,
        "modification": _modification_enumeration,
        "ModificationDateTime": _datetime,
        "Precision": _precision_enumeration,
        "RegistrationDocument": _bool,
        "RevisionNumber": _positive_integer,
        "SchemaVersion": _float,
        "Sequence": _int,
        "SequenceNumber": _positive_integer,
        "Status": _status_enumeration,
        "xml:lang": _iana_subtag,
    }

    def validate(self, attribute: str, value: Any) -> Any:
        """Validate the offered value for the attribute."""
        if attribute in self._txc_attributes:
            return self._txc_attributes[attribute](self, value)

Left_Guard · November 10, 2022, 6:33pm

Running the code in my post without including ‘self’ as a parameter won’t work. I was hoping there was a syntax that would let me associate the function returned from the dictionary with the instance, without having to use init, but it looks like that’s what I’ll have to do.
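
For reference, binding at call time is possible without __init__, because functions are descriptors. The following is only a sketch under that assumption, reusing the shape of the class from the first post; `run_function_alt` is a hypothetical second method added to show the equivalent `types.MethodType` spelling.

```python
import types


class FunctionRunner:
    def _func1(self, value: str) -> str:
        return f"_func1 called with {value}"

    def _func2(self, value: str) -> str:
        return f"_func2 called with {value}"

    # Class-level table of plain functions, as in the original post.
    _func_dictionary = {1: _func1, 2: _func2}

    def run_function(self, func_id: int, value: str) -> str:
        func = self._func_dictionary[func_id]
        # __get__ binds the function to self, giving the same kind of
        # bound method that self._func1 would produce.
        method = func.__get__(self, type(self))
        return method(value)

    def run_function_alt(self, func_id: int, value: str) -> str:
        # Equivalent binding, spelled with types.MethodType.
        method = types.MethodType(self._func_dictionary[func_id], self)
        return method(value)


runner = FunctionRunner()
print(runner.run_function(1, "10"))
print(runner.run_function_alt(2, "20"))
```

Whether this reads better than simply passing self as the first argument is a matter of taste; it does the same work in two steps.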

barry-scott · November 10, 2022, 9:02pm

I write code like that all the time and it works great.
Here is an example that shows how it all works:

class CallMe:
    def __init__(self, name):
        self.name = name

    def getName(self):
        return self.name

fred = CallMe('fred')

fn = fred.getName

print('fn is %r' % (fn,))

print('My name is', fn())

Output:

% py3 instance.py
fn is <bound method CallMe.getName of <__main__.CallMe object at 0x1009a8ad0>>
My name is fred

steven.daprano · November 10, 2022, 12:06am

If your functions truly don’t rely on self at all, the easiest way to avoid manually supplying self is simply not to add a self parameter when you define them.

class FunctionRunner:
    def _func1(value: str) -> None:
        ...

However that has three disadvantages:

1. You cannot call those functions directly from an instance: obj = FunctionRunner(); obj._func1('hello') will fail. (However, calling it from the class, FunctionRunner._func1('hello'), will work.)
2. You will confuse most of your fellow coders, who will assume it is a bug and try to “fix it” by re-inserting the self parameter.
3. And you will confuse linters and probably type-checkers.

An alternative is to move the functions out of the class, into the top level of the module. That will avoid them looking like methods to the reader and to code checkers.

Or, as you say, to decorate them as static methods.
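
A minimal sketch of the module-level alternative, assuming the validators need nothing from the instance. The names `_is_positive_integer`, `_is_bool`, `Validator` and `_checks` are invented for illustration and the validator bodies are stand-ins.

```python
def _is_positive_integer(value: str) -> bool:
    """Stand-in check: the text parses as an integer greater than zero."""
    try:
        return int(value) > 0
    except ValueError:
        return False


def _is_bool(value: str) -> bool:
    """Stand-in check: the text is one of the two accepted spellings."""
    return value in ("true", "false")


class Validator:
    # The functions already exist at module level, so the table can be
    # built in the class body and none of them takes self.
    _checks = {"layer": _is_positive_integer, "delta": _is_bool}

    def validate(self, attribute: str, value: str) -> bool:
        check = self._checks.get(attribute)
        # Values fetched from a dictionary are not bound, so the call
        # passes only the value.
        return check(value) if check is not None else False


v = Validator()
print(v.validate("layer", "3"))
print(v.validate("delta", "maybe"))
```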

But what if those functions actually are meant to be methods, and so should have the self parameter?

In that case, the least invasive change is to do nothing at all and just provide self when you call the function. This is fine, and it’s only one extra argument.

Another is to delay constructing the dispatch table until self is initialised:

class FunctionRunner:

    # Construct the dispatch table using proper methods, not functions.
    def __init__(self):
        self._func_dictionary = {1: self._func1, 2: self._func2}

    def _func1(self, value: str) -> None:
        ...

You have to delay this to the __init__ method as before that, the instance self doesn’t exist!

Or, you can rethink your dispatch mechanism. What you seem to be doing is a version of dynamic dispatch, which is usually done with a naming convention and getattr. A typical example might be something like this:

class FunctionRunner:

    # Use an explicit naming scheme. Methods prefixed with "_do" are
    # reserved for use by run_function.
    def _do_func1(self, value: str) -> None:
        ...

    def run_function(self, func_id: int, value: str) -> None:
        method = getattr(self, '_do_func' + str(func_id), None)
        if method is None:
            raise ValueError("invalid function key")
        method(value)

There are lots of variants on this. For instance, you could leave out the None parameter and just allow getattr to raise AttributeError. Or you could use a magic __getattr__ dunder method to dynamically look up your private implementation methods from a public name:

# I leave you to work out the rest of the details
def __getattr__(self, name):
    if name.startswith("function"):
        func_id = name[8:]
        return getattr(self, '_do_func' + func_id)
    # Without this, unknown attribute names would silently return None.
    raise AttributeError(name)

# This allows calls like obj.function45() to be dispatched to method _do_func45
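
One possible way to fill in the remaining details, offered as a sketch rather than the only approach. It relies on `__getattr__` being called only when normal attribute lookup fails, so a missing `_do_func` method ends in an ordinary AttributeError rather than looping.

```python
class FunctionRunner:
    def _do_func1(self, value: str) -> str:
        return f"_do_func1 called with {value}"

    def __getattr__(self, name: str):
        # Reached only when normal lookup has already failed.
        if name.startswith("function"):
            func_id = name[len("function"):]
            # If _do_func<id> does not exist, this lookup re-enters
            # __getattr__ with a name that lacks the "function" prefix
            # and falls through to the AttributeError below.
            return getattr(self, "_do_func" + func_id)
        raise AttributeError(
            f"{type(self).__name__!r} object has no attribute {name!r}"
        )


runner = FunctionRunner()
print(runner.function1("10"))
print(hasattr(runner, "function45"))
```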

Left_Guard · November 11, 2022, 11:32am

That’s a fantastic breakdown of everything, @steven.daprano. Thanks for that.
What I’ve done follows the second code excerpt you included, putting the lookup dictionary into the class’s __init__ method like so:

    def __init__(self) -> None:
        self._txc_attributes = {
            "BaselineVersion": self._positive_integer,
            "ChangesSince": self._datetime,
            "CodeSpace": self._str,
            "CreationDateTime": self._datetime,
            "DataRightRef": self._str,
            "DataSource": self._str,
            "delta": self._bool,
            "FileName": self._str,
            "GridType": self._location_grid_type_enumeration,
            "id": self._str,
            "layer": self._positive_integer,
            "LocationSystem": self._location_system_enumeration,
            "MappingSystem": self._mapping_system_enumeration,
            "Modification": self._modification_enumeration,
            "modification": self._modification_enumeration,
            "ModificationDateTime": self._datetime,
            "Precision": self._precision_enumeration,
            "RegistrationDocument": self._bool,
            "RevisionNumber": self._positive_integer,
            "SchemaVersion": self._float,
            "Sequence": self._int,
            "SequenceNumber": self._positive_integer,
            "Status": self._status_enumeration,
            "xml:lang": self._iana_subtag,
        }

The XML data I’m checking uses so many different combinations of attributes that I don’t want to code the checking into each object type. Instead, each object just asks a single, omniscient class, “Before I pass this on to be stored in the database, is it a valid value?” Each type of object holds a list of the valid attribute names that it should pass on and rejects any others, while the validator class knows what each name’s value should conform to. That way multiple classes using the same attribute name call one function in one place, instead of me having to put validation code into every class.
I hadn’t considered using a naming convention and getattr, but now I think about it, each function is just returning a boolean to indicate if it’s a valid value or not, so prefixing each function with ‘_is’ would have let me do it that way. I just thought the code

    def validate(self, attribute: str, value: Any) -> Any:
        """Validate the offered value for the attribute."""
        if attribute in self._txc_attributes:
            return self._txc_attributes[attribute](value)
        return False

in the validate function would be the clearest way of indicating the code’s methodology, because it most resembles a simple function call.
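
Putting the pieces together, the arrangement described above might look roughly like this. It is a sketch only: the two validator bodies are stand-ins, and `Route`, `permitted` and `accepts` are hypothetical names for one of the objects that knows which attribute names it allows.

```python
from typing import Any


class Validator:
    def __init__(self) -> None:
        # Built once the instance exists, so every entry is a bound method.
        self._txc_attributes = {
            "layer": self._positive_integer,
            "delta": self._bool,
        }

    def _positive_integer(self, value: Any) -> bool:
        # bool is a subclass of int, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool) and value > 0

    def _bool(self, value: Any) -> bool:
        return isinstance(value, bool)

    def validate(self, attribute: str, value: Any) -> bool:
        """Validate the offered value for the attribute."""
        if attribute in self._txc_attributes:
            return self._txc_attributes[attribute](value)
        return False


class Route:
    """Hypothetical object that knows only which attribute names it permits."""

    permitted = ("layer",)

    def __init__(self, validator: Validator) -> None:
        self._validator = validator

    def accepts(self, attribute: str, value: Any) -> bool:
        # Reject names this object does not use before asking the validator.
        return attribute in self.permitted and self._validator.validate(attribute, value)


route = Route(Validator())
print(route.accepts("layer", 2))
print(route.accepts("delta", True))
```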