Overhead vs repetition

osxtra · March 4, 2023, 2:45pm

Hi, all,

I have a ‘pad’ method which returns a justified string, with rjust being the default. Either a ‘match’ or multi-if statement can be used to test for the desired justification, and I’m not sure if there should be three tests: ‘c’; ‘l’; ‘_’ / ‘else’; or, four: ‘r’; ‘c’; ‘l’; ‘_’ / ‘else’.

The concern is overhead (granted, tiny, but potentially present). Since rjust is the default, with the ‘3’ test match we’re almost always making more comparisons before settling on the pad direction. With the ‘4’ test match, we’re repeating the ‘r’ code.

Though repetitive code is not explicitly discussed in PEP 8, the ‘3’ test match looks to be more ‘Pythonic’.

Performance-wise, though, the ‘4’ test match might be better to use.

Thoughts?

barry-scott · March 4, 2023, 8:41pm

Please show the two code examples you are considering.

You can use timeit (Measure execution time of small code snippets, Python 3.11.2 documentation) to see how the alternatives perform.

Remember that premature optimisation is often a waste of your time.
It's usually best to write an obvious, easy-to-maintain function first, and only then worry about performance. You may never be in a situation where the performance matters.
And use data-driven performance analysis to tell you where the hot spots are in any body of code.
Experience shows that intuition about performance is usually wrong.
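As a rough sketch of what such a timeit comparison might look like: the two functions below are hypothetical stand-ins for the three-test and four-test variants described in the question (the names pad_three and pad_four and their signatures are assumptions, not code from the thread). Absolute numbers will vary by machine and Python version.

```python
import timeit


def pad_three(val, amt, fill=' ', direction='r'):
    # Three tests: the default rjust is reached only after 'c' and 'l' fail.
    if direction == 'c':
        return val.center(amt, fill)
    elif direction == 'l':
        return val.ljust(amt, fill)
    else:
        return val.rjust(amt, fill)


def pad_four(val, amt, fill=' ', direction='r'):
    # Four tests: 'r' is checked first, and rjust is repeated in the fallback.
    if direction == 'r':
        return val.rjust(amt, fill)
    elif direction == 'c':
        return val.center(amt, fill)
    elif direction == 'l':
        return val.ljust(amt, fill)
    else:
        return val.rjust(amt, fill)


if __name__ == '__main__':
    for func in (pad_three, pad_four):
        # repeat() returns one total per run; the minimum is the least noisy.
        best = min(timeit.repeat(lambda: func('abc', 10), number=100_000, repeat=5))
        print(f'{func.__name__}: {best:.4f} s per 100,000 calls')
```

Taking the minimum of several repeats follows the advice in the timeit documentation, since higher values usually reflect interference from other processes rather than the code under test.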

osxtra · March 5, 2023, 3:49pm

Sure thing. Again, the overhead in this particular case would be small; other, more complex cases could be more ‘expensive’. Thanks for the reminder about timing tests, though.

Here’s the portion of the ‘pad’ method that determines what justification to employ:

""" PEP 636 style: 4 choices repeating code to have the (presumably) most-used default choice first """
match direction:
    case i if i == 'r':
        return val.rjust(amt, fill)
    case i if i == 'c':
        return val.center(amt, fill)
    case i if i == 'l':
        return val.ljust(amt, fill)
    case _:
        return val.rjust(amt, fill)


""" PEP 636 style: 3 choices with default last but first two almost always being unnecessarily evaluated """
match direction:
    case i if i == 'c':
        return val.center(amt, fill)
    case i if i == 'l':
        return val.ljust(amt, fill)
    case _:
        return val.rjust(amt, fill)


""" Traditional 'if' style: 4 choices repeating code to have the (presumably) most-used default choice first """
if i == 'r':
    return val.rjust(amt, fill)
elif i == 'c':
    return val.center(amt, fill)
elif i == 'l':
    return val.ljust(amt, fill)
else:
    return val.rjust(amt, fill)


""" Traditional 'if' style: 3 choices with default last but first two almost always being unnecessarily evaluated """
if i == 'c':
    return val.center(amt, fill)
elif i == 'l':
    return val.ljust(amt, fill)
else:
    return val.rjust(amt, fill)

steven.daprano · March 5, 2023, 4:40pm

What happens if the caller passes, say, ‘Z’ or ‘hovercraft’ as the i argument?

(By the way, i is a terrible name for this argument.)
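To make the question about ‘Z’ or ‘hovercraft’ concrete, here is a minimal sketch of the three-test version wrapped in a hypothetical function (the name pad_fallback and its signature are assumptions for illustration). Any unrecognised value, including a simple typo such as an uppercase ‘L’, quietly falls through to rjust:

```python
def pad_fallback(val, amt, fill=' ', direction='r'):
    # Mirrors the three-test 'if' version: anything that is not
    # exactly 'c' or 'l' is treated as right-justification.
    if direction == 'c':
        return val.center(amt, fill)
    elif direction == 'l':
        return val.ljust(amt, fill)
    else:
        return val.rjust(amt, fill)


print(repr(pad_fallback('abc', 6, '.', 'Z')))           # '...abc'
print(repr(pad_fallback('abc', 6, '.', 'hovercraft')))  # '...abc'
print(repr(pad_fallback('abc', 6, '.', 'L')))           # '...abc' (typo for 'l')
```

Whether that silent fallback is acceptable is a design choice; raising an exception instead makes mistakes in the calling code visible.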

I would write something like this:

def pad1(value, amount, fill=' ', pos='r'):
    if pos == 'r':
        return value.rjust(amount, fill)
    elif pos == 'l':
        return value.ljust(amount, fill)
    elif pos == 'c':
        return value.center(amount, fill)
    else:
        raise ValueError('invalid pos')

Actually, no I wouldn’t. I would use a dispatch table and write it like this:

def pad2(value, amount, fill=' ', pos='r'):
    try:
        func = {'r': str.rjust, 'c': str.center, 'l': str.ljust}[pos]
    except KeyError:
        raise ValueError('invalid pos') from None
    return func(value, amount, fill)

If I wanted to try the new match case statement:

def pad3(value, amount, fill=' ', pos='r'):
    match pos:
        case 'r':
            return value.rjust(amount, fill)
        case 'l':
            return value.ljust(amount, fill)
        case 'c':
            return value.center(amount, fill)
        case _:
            raise ValueError('invalid pos')

Performance-wise, the dispatch table is likely to be the fastest, and it won’t matter what order you put the entries. My wild guess is that the old fashioned if…elif version will be second fastest, but it is quite possible that someday in the future the match…case version will be as fast as the dispatch table.

But in this specific case it is unlikely that the three versions will be that different.

steven.daprano · March 5, 2023, 5:01pm

That’s for sure! I decided to test my three versions of the pad function, and the one which I thought would be the fastest was twice as slow as the others.

The if..elif and match...case were equally fast on my machine, and the dictionary dispatch table twice as slow.

So I did a fourth version, using the same dictionary dispatch table, but moving the definition of the table outside of the function so it is only done once instead of every time the function is called.

DISPATCH = {'r': str.rjust, 'c': str.center, 'l': str.ljust}
def pad4(value, amount, fill=' ', pos='r'):
    try:
        func = DISPATCH[pos]
    except KeyError:
        raise ValueError('invalid pos') from None
    return func(value, amount, fill)

That did the trick, speeding it up to the same speed as the other two.
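A small sketch that isolates the difference, assuming the extra cost comes from rebuilding the dictionary on each call: lookup_local and lookup_global are hypothetical helper names, stripped down to just the table lookup so that the padding work does not mask the effect. The timings depend on the machine and interpreter, so none are quoted here.

```python
import timeit

# Built once, when the module is imported.
DISPATCH = {'r': str.rjust, 'c': str.center, 'l': str.ljust}


def lookup_local(pos):
    # The dict literal is evaluated again on every call.
    return {'r': str.rjust, 'c': str.center, 'l': str.ljust}[pos]


def lookup_global(pos):
    # Only a global name lookup and a dict access per call.
    return DISPATCH[pos]


if __name__ == '__main__':
    for func in (lookup_local, lookup_global):
        # Take the minimum of several runs to reduce noise.
        best = min(timeit.repeat(lambda: func('r'), number=200_000, repeat=5))
        print(f'{func.__name__}: {best:.4f} s per 200,000 calls')
```

Both helpers return the same unbound str method, so the only thing being compared is how the table is obtained.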

osxtra · March 7, 2023, 12:32am

Well, you’re right: having the default rjust at the beginning (‘4-test’ match / if) wasn’t appreciably different from the ‘3-test’ version.

You’re also right that ‘i’ was a horrible variable name for readability purposes.

To your point about passing ‘Z’ or ‘hovercraft’, what I showed before was only the match portion to determine justification. Here’s the actual method. It’s in a general purpose library which has some other helper methods defined.

(BTW, the ‘if’ version was a little faster than ‘match’, but not by much. Using the dispatch table was a hair slower, even when the dict was defined outside the method.)

def pad(self, val = None, **kwargs):
    """ 
        Pad a passed value 
        Optional args:
            a: Amount to pad (def len of value)
            d: Direction (def 'r')
                c -> center
                l -> left
                r -> right
            f: Fill character (def space)
    """
    if val:

        # args is a helper function which returns default or passed values as needed
        args = o.args({'a': len(str(val)), 'd': 'r', 'f': ' '}, **kwargs)
       
        # n is a helper function that coerces a value into int or float, if possible, otherwise returns numeric zero.
        num = self.n(args['a']) 

        # s is a helper function to force a string representation of a var.
        fill = self.s(args['f'])

        # Avoid shadowing the built-in min(); never pad to less than the value's own length.
        min_len = len(str(val))
        amt = max(num, min_len)

        # type is a helper function which can return extended info about var types
        fill = ' ' if not args['f'] or o.type(args['f']) != 'str' else args['f']
        match args['d']:
            case just if just == 'c':
                return val.center(amt, fill)
            case just if just == 'l':
                return val.ljust(amt, fill)
            case _:
                return val.rjust(amt, fill)