Proposed additions to `operator`

dg-pb · March 29, 2024, 5:07pm

a) There are operator.is_ and operator.is_not. I propose also adding:

def in_(a, b):
    return a in b

def not_in(a, b):
    return a not in b

b) There is operator.call and its object counterpart operator.caller. There is also operator.methodcaller. I propose adding:

def callmethod(obj, name, /, *args, **kwds):
    return getattr(obj, name)(*args, **kwds)

c) To complete functionals I also propose adding:

def map_(obj, func, /, *args, **kwds):
    return func(obj, *args, **kwds)

class mapper:
   ...

d) And finally, not sure if this could be a place for it, but I found it useful. To add to dictionary operations. Common operation, but not operator:

def get(obj, idx, default=None):
    try:
        return obj[idx]
    except (KeyError, IndexError):
        return default

MegaIng · March 29, 2024, 5:37pm

a) You are aware of operator.contains, right? The reverse operation better mirrors the data model of Python (the __contains__ method), which is one of the reasons for this entire module to exist in the first place. Why is it necessary, or even all that helpful, to add in_ and not_in?

b) operator.caller doesn’t exist. What are the potential use cases for callmethod you are seeing? It seems quite hard to use with map or a similar interface, in contrast to other available functions.

c) Having map and map_ both existing is IMO a non-starter. I am not really sure what your goal with this operation is, but it should have a different name. This seems to just be an alias for call with a slightly different (and more confusing) argument order? And mapper is just a weaker version of functools.partial?

d) IMO, this falls under “not every 3 line function needs to be in the stdlib”. While much of operator is just a 3-line function, part of the point is that they directly reveal underlying operations that CPython implements on a deeper level (for example, with a single opcode and/or operation). This isn’t one of them. It might be better in functools or maybe collections.

dg-pb · March 29, 2024, 5:59pm

I came from a bit different perspective here.

I am making proxy objects, that defer evaluation. So I make various operations first and evaluate the whole structure later.

Thus, I have a node object, node, and I want to be able to define all possible operations with it.

a) Now contains doesn’t work, because if I have node in obj, it will call the containment operator of obj, but in_ could be used to signal that the operation is deferred until the node is evaluated.

b) Yeah, operator.caller doesn’t exist… But it could. In my case, the use case for callmethod is the same as in a).

c) I named it apply at first, but then named it back to map. Does functools.partial allow for positional argument skipping? I did introduce a sentinel VOID in my own partial implementation to allow for this. Nevertheless, I did write mapper to do this for a node in a more lightweight manner.

d) Nah, yeah, as I said, this was a very weak proposal.

All in all, as I said, my angle is a bit different I guess.

It is more along the lines of “emulating operator behaviour from perspective of the object”. Or in other words “… from the perspective of a first term of the operation”

While it seems that the current rationale of the operator module (or the POV from which many of your arguments come) is more like “emulating operator behaviour from the perspective of the operator”.
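To make the deferred-evaluation use case concrete, here is a minimal sketch. The Deferred class and its evaluate method are hypothetical and are not the poster's actual implementation; in_ stands in for the proposed operator.in_.

```python
def in_(a, b):
    # Proposed operator.in_: operands in the same order as "a in b".
    return a in b

class Deferred:
    """Hypothetical proxy that records an operation instead of running it."""

    def __init__(self, func=None, args=()):
        self._func = func
        self._args = args

    def in_(self, container):
        # Record "self in container"; nothing is evaluated yet.
        return Deferred(in_, (self, container))

    def evaluate(self, value):
        if self._func is None:
            return value  # leaf node: stands for the input value
        # Evaluate nested Deferred arguments first, then apply the function.
        args = [a.evaluate(value) if isinstance(a, Deferred) else a
                for a in self._args]
        return self._func(*args)

node = Deferred()
check = node.in_([1, 2, 3])   # still unevaluated
print(check.evaluate(2))      # True
print(check.evaluate(5))      # False
```

Because the proxy is the first operand, the recorded function has to take its arguments in the natural order, which is what in_ provides and operator.contains does not.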

bschubert · March 29, 2024, 6:00pm

Just as a datapoint, a quick GitHub search for lambda a, b: a in b and operator.contains(b, a) does bring up a few cases where an operator.in_ was missed. Some examples:

Jinja2

pallets/jinja/blob/3fd91e4d11bdd131d8c12805177dbe74d85e7b82/src/jinja2/nodes.py#L41-L44

    "lt": operator.lt,
    "lteq": operator.le,
    "in": lambda a, b: a in b,
    "notin": lambda a, b: a not in b,

astroid

pylint-dev/astroid/blob/465780a9e3c27455d6f48c7e0b0a6d1686b68b7d/astroid/nodes/node_classes.py#L1792-L1795

    ">": operator.gt,
    ">=": operator.ge,
    "in": lambda a, b: a in b,
    "not in": lambda a, b: a not in b,

numba

numba/numba/blob/89218bb91d8ded3b573aba50d6d0967c1e25ff33/numba/cpython/builtins.py#L414-L415

def in_impl(a, b):
    return operator.contains(b, a)

pyanalyze

quora/pyanalyze/blob/52031c4aa09fdfc19436d7d4a5b49c93de6d2fcc/pyanalyze/name_check_visitor.py#L307-L312

def _in(a: object, b: Container[object]) -> bool:
    return operator.contains(b, a)


def _not_in(a: object, b: Container[object]) -> bool:
    return not operator.contains(b, a)

MegaIng · March 29, 2024, 6:17pm

a) Not sure what you mean? The magic method you need to implement for node in obj is __contains__, and it is looked up on obj. So it is actually impossible to overload this from the node’s side without a change in syntax, or without modifying obj as well.

b) The point is that you said it exists. So you appear not to have paid attention to what is available. I am also not sure what operator.caller would do.

c) No, functools.partial does not allow skipping positional arguments, but that is a completely different topic. I don’t know how this relates to your proposal for map or mapper.

Well, yes, that’s why the module is called operator. While I agree that this philosophical view could be changed, I still see no real benefit for suggestions b-d.

Hm, these examples do suggest to me that operator.in_ and operator.not_in are probably a good idea. Trying to generically do stuff like map AST nodes to actions is made slightly easier by this. But I am sure this has been discussed before, so maybe it makes sense to search through the history of the operator module.
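The pattern in the linked projects is a table from operator names to two-argument functions. A minimal sketch of how in_ and not_in would slot into such a table follows; COMPARE and compare are hypothetical names, and in_ and not_in are local stand-ins for the proposed functions.

```python
import operator

def in_(a, b):
    # Stand-in for the proposed operator.in_.
    return a in b

def not_in(a, b):
    # Stand-in for the proposed operator.not_in.
    return a not in b

# Every entry takes (left, right) in source order, so no lambdas are needed.
COMPARE = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "in": in_,
    "not in": not_in,
}

def compare(left, op, right):
    return COMPARE[op](left, right)

print(compare(2, "in", [1, 2, 3]))       # True
print(compare(2, "not in", [1, 2, 3]))   # False
print(compare(1, "<", 2))                # True
```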

dg-pb · March 29, 2024, 8:12pm

You’re right. I cannot emulate the behaviour of the operator, but I use an operator mixin (similar to what dask does in dask/dask/utils.py at b663dca0fa4ca4686b8c08f7cb30d11320012901 in the dask/dask repository on GitHub), where I automatically create operators and methods for such objects (proxies, nodes, array types, etc.). So with in_ and not_in available, I can write specifications with method names. My object will not emulate operator behaviour, but the best I can have is a method created for a node object, so I can emulate it via node.is_in(obj).

I did not deny that I said it, and I implicitly agreed with your observation that I made a mistake. Whether I did or did not pay attention is subject to investigation. In this case, I have my own caller and just forgot that I use my own implementation and not one from operator. My caller implementation looks like:

class caller:
    __slots__ = ('_args', '_kwds')

    def __init__(self, *args, **kwds):
        self._args = args
        self._kwds = kwds

    def __call__(self, obj):
        return obj(*self._args, **self._kwds)

    def __repr__(self):
        args = list(map(repr, self._args))
        args.extend('%s=%r' % (k, v) for k, v in self._kwds.items())
        return '%s.%s(%s)' % (self.__class__.__module__,
                              self.__class__.__name__,
                              ', '.join(args))

    def __reduce__(self):
        if not self._kwds:
            return self.__class__, self._args
        from functools import partial
        return partial(self.__class__, **self._kwds), self._args

Also, these are equivalent:
a) methodcaller('name', *args, **kwds)(obj)
b) caller(*args, **kwds)(attrgetter('name')(obj)).

So one way to look at it is A + B = C, where A = attrgetter, B = caller, C = methodcaller. Now A and C exist in operator. If the subtraction C - A were possible, there would be only a very weak case for implementing B. However, it is not, so having caller could be useful for functional completeness.
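The equivalence of a) and b) can be checked with a cut-down version of the caller class above (repr and pickling support omitted); the string and the split arguments are only illustrative.

```python
from operator import attrgetter, methodcaller

class caller:
    """Minimal version of the caller class above (no repr or pickling)."""

    def __init__(self, *args, **kwds):
        self._args = args
        self._kwds = kwds

    def __call__(self, obj):
        # Call obj itself with the stored arguments.
        return obj(*self._args, **self._kwds)

text = "a,b,c"

# C: look up the method and call it in one step.
via_methodcaller = methodcaller("split", ",", maxsplit=1)(text)

# A then B: fetch the bound method, then call it.
via_caller = caller(",", maxsplit=1)(attrgetter("split")(text))

print(via_methodcaller)   # ['a', 'b,c']
print(via_caller)         # ['a', 'b,c']
```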

Regarding mapper: my point is that it is not a weaker version of functools.partial, because functools.partial cannot do what mapper can.
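The limitation being referred to is that functools.partial, without placeholders, fills positional arguments from the left only. The sketch below illustrates this; apply_to_first is a hypothetical helper written for this example and may or may not match what mapper does, since mapper's body is not shown in the thread.

```python
from functools import partial

# partial fixes the FIRST positional argument of divmod.
ten_divmod = partial(divmod, 10)          # behaves like divmod(10, x)
print(ten_divmod(4))                      # (2, 2)

class apply_to_first:
    """Hypothetical helper: fix the trailing arguments, leave the first open."""

    def __init__(self, func, /, *args, **kwds):
        self._func = func
        self._args = args
        self._kwds = kwds

    def __call__(self, obj):
        # obj fills the first positional slot; stored arguments follow it.
        return self._func(obj, *self._args, **self._kwds)

divmod_by_3 = apply_to_first(divmod, 3)   # behaves like divmod(x, 3)
print(divmod_by_3(10))                    # (3, 1)
```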

All in all, these are just proposals of things that I personally had to implement myself, while trying to achieve a complete functionality of object abstractions (Another example would be an array class which calls operators and methods over every element of it).

And yes, intuitively I felt that in_ and not_in could be most useful outside of what I do.

dg-pb · June 10, 2024, 4:18pm

There is one more suggestion I would like to add: operator.itemsetter

itemgetter has become a very useful performance boost for many problems. I think its friend itemsetter could be just as useful.

chepner · June 10, 2024, 6:40pm

itemsetter is more complicated from an API standpoint than itemgetter.
There are multiple ways itemsetter could be implemented for one item, all equivalent to operator.setitem(x, 'foo', v), that is, x['foo'] = v:

itemsetter('foo', v)(x)
itemsetter('foo')(v)(x)
itemsetter(v, 'foo')(x)
itemsetter(v)('foo')(x)

It’s not clear how, or if, you would want to support setting multiple items at once. If you are serious about such a proposal, you’ll have to be explicit about what exactly you are proposing, because it’s not obvious what “the” counterpart to itemgetter would be.

dg-pb · June 10, 2024, 7:08pm

Thanks for the reply and the good points.

I am not much interested in a one-arg variant. A one-arg variant (of any kind) will be easy to construct with the functools.partial placeholders proposal (gh-119127, python/cpython pull request #119827 by dg-pb).
E.g.:

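import operator
from functools import partial, Placeholder  # Placeholder as proposed in the pull request above

setter_a = partial(operator.setitem, Placeholder, 'a')                 # setter_a(obj, value)
setter_to_1 = partial(operator.setitem, Placeholder, Placeholder, 1)   # setter_to_1(obj, key)
setter_a_to_1 = partial(operator.setitem, Placeholder, 'a', 1)         # setter_a_to_1(obj)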

Setting multiple items is what I have in mind. And the most convenient variant I believe would be number 2 in your list:

setter_abc = itemsetter('a', 'b', 'c')
setter_abc_123 = setter_abc(1, 2, 3)
d = dict()
setter_abc_123(d)
print(d)    # {'a': 1, 'b': 2, 'c': 3}

However, I see that it is more complex than itemgetter. It would require 1 intermediate class.
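A minimal sketch of the two-step variant described above, with one intermediate class. Both class names and the error handling are assumptions made for illustration; nothing like this exists in operator.

```python
class itemsetter:
    """Hypothetical sketch of itemsetter(*keys)(*values)(obj)."""

    def __init__(self, *keys):
        if not keys:
            raise TypeError("itemsetter expected at least 1 key")
        self._keys = keys

    def __call__(self, *values):
        if len(values) != len(self._keys):
            raise TypeError("expected exactly one value per key")
        return _BoundItemSetter(self._keys, values)

class _BoundItemSetter:
    """Intermediate object: keys and values are fixed, the target is not."""

    def __init__(self, keys, values):
        self._items = tuple(zip(keys, values))

    def __call__(self, obj):
        # Assign each key/value pair on the target, in order.
        for key, value in self._items:
            obj[key] = value

setter_abc = itemsetter('a', 'b', 'c')
setter_abc_123 = setter_abc(1, 2, 3)
d = dict()
setter_abc_123(d)
print(d)    # {'a': 1, 'b': 2, 'c': 3}
```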

dg-pb · June 10, 2024, 7:31pm

Also, if anyone else has any thoughts on what could be useful to add to operator, this is a good place to add to the list.

So far I have:
a) operator.in_ & operator.not_in
b) operator.itemsetter - maybe

Also, thinking about the possibility of something similar to operator.itemgetter(*idxs)(obj, *defaults).
This could be a non-invasive path to implementing an indexable get method, as discussed in the thread “Indexable get method. [1,2,3].get(4) # None” (post #116 by dg-pb).
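One way to read the itemgetter(*idxs)(obj, *defaults) idea is sketched below. The class name itemgetter_default and the exact semantics (one default per index, and a plain lookup error when no defaults are given) are assumptions; the post does not specify them.

```python
class itemgetter_default:
    """Hypothetical sketch: like operator.itemgetter, with optional defaults."""

    def __init__(self, *idxs):
        if not idxs:
            raise TypeError("itemgetter_default expected at least 1 index")
        self._idxs = idxs

    def __call__(self, obj, *defaults):
        if defaults and len(defaults) != len(self._idxs):
            raise TypeError("expected one default per index")
        results = []
        for pos, idx in enumerate(self._idxs):
            try:
                results.append(obj[idx])
            except (KeyError, IndexError):
                if not defaults:
                    raise  # no defaults given: behave like a plain lookup
                results.append(defaults[pos])
        # Mirror itemgetter: a single index returns a bare value.
        return results[0] if len(results) == 1 else tuple(results)

print(itemgetter_default(4)([1, 2, 3], None))          # None
print(itemgetter_default(0, 4)([1, 2, 3], None, "x"))  # (1, 'x')
```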

ilotoki0804 · June 11, 2024, 1:18pm

Just to clarify, the actual implementation of operator.contains is exactly the same as the implementation of the proposed in_ function.

python/cpython/blob/e123f74513151da53bc380bf1ee03c35cec4f4c0/Lib/operator.py#L153C1-L155C18

def contains(a, b):
    "Same as b in a (note reversed operands)."
    return b in a

dg-pb · June 11, 2024, 3:35pm

Not “exactly”. “note reversed operands”

storchaka · June 12, 2024, 7:43am

Note that some of these examples use lambdas even when there is a corresponding operator function, e.g. lambda a, b: a < b instead of operator.lt.

You can also find examples for other reversions, e.g. lambda a, b: b - a. Are you going to add functions with the reversed order for all operator functions or only for operator.contains?

dg-pb · June 12, 2024, 3:39pm

All the other operators:

a <op> b

class A:
    def __opname__(self_a, b):
        ...

Containment is a special case:

a in b

class A:
    def __contains__(self_b, a):
        pass

I don’t think reversed order can be justified in the same way for other functions.
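The asymmetry can be observed directly. In this small sketch, Number and Box are illustrative classes that record which special method was called; note also that the in operator converts whatever __contains__ returns to a bool.

```python
calls = []

class Number:
    def __sub__(self, other):
        calls.append("Number.__sub__")
        return 0

class Box:
    def __contains__(self, item):
        calls.append("Box.__contains__")
        return "deferred"   # any return value is coerced to bool by "in"

Number() - 1                 # dispatched to the left operand
result = Number() in Box()   # dispatched to the right operand

print(calls)    # ['Number.__sub__', 'Box.__contains__']
print(result)   # True
```

The left operand of in never gets a say in the operation, so a proxy object cannot intercept it the way it can intercept subtraction.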

I am not suggesting implementing contains in reverse order, but in_ and not_in in natural order, the same as is_ and is_not.

To me these have proven to be useful in practice. I had to implement them when writing proxy objects.

storchaka · June 12, 2024, 4:29pm

It is more complicated, because they can call not only __op__, but also __rop__. operator.contains() can also call __iter__ or __getitem__. operator.is_() does not call any special method.
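These fallbacks can be seen in a short sketch. The three classes are illustrative only: Meters defines just the reflected method, Countdown defines only __iter__, and Squares defines only __getitem__.

```python
import operator

class Meters:
    def __init__(self, value):
        self.value = value

    def __rsub__(self, other):
        # Used for "other - self" after int.__sub__ returns NotImplemented.
        return Meters(other - self.value)

class Countdown:
    # No __contains__: membership tests fall back to iteration.
    def __iter__(self):
        return iter((3, 2, 1))

class Squares:
    # No __contains__ or __iter__: the old sequence protocol is used,
    # calling __getitem__ with 0, 1, 2, ... until IndexError.
    def __getitem__(self, index):
        if index >= 4:
            raise IndexError(index)
        return index * index

print(operator.sub(10, Meters(3)).value)    # 7
print(operator.contains(Countdown(), 2))    # True
print(operator.contains(Squares(), 9))      # True
print(operator.contains(Squares(), 5))      # False
```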

Do you have concrete use cases for this or want to add it “for consistency”?

dg-pb · June 12, 2024, 5:10pm

In this case it is 50-50. If there were no consistency improvement I would not propose this, but I found that such consistency was required to make deferred evaluation graph objects functionally complete.

One case where I needed this was custom object validators:

class ValidObject:
    ...

valid_obj = ValidObject()['a'].not_in([0, 1, 2])

obj1 = dict(a=1)
obj2 = dict(a=3)

valid_obj.is_valid(obj1)    # False
valid_obj.is_valid(obj2)    # True

Now obviously contains can be used to implement this:

class ValidObject:
    def not_in(self, other):
        # operator.contains takes the container first, so the operands are swapped.
        return not operator.contains(other, self)

But I don’t write these manually. I use mixin for adding a set of required functionality (similarly to how dask does it), where I have operation specifications that are added to object automatically:

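# The specifications must exist before the class body refers to them.
# operator.not_in is the proposed function; it does not exist yet.
SUB_METHOD = Namespace(name='__sub__', func=operator.sub)
NOT_IN_METHOD = Namespace(name='not_in', func=operator.not_in)

class ValidObject(MethodMixin):
    _methods_to_add = [NOT_IN_METHOD, SUB_METHOD]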
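The mixin itself is not shown in the thread. A minimal sketch of how such a mixin could work follows, assuming types.SimpleNamespace for the specifications and __init_subclass__ for method generation; the deferral here is simply a returned callable, and not_in is a local stand-in for the proposed operator.not_in.

```python
import operator
from types import SimpleNamespace

def not_in(a, b):
    # Stand-in for the proposed operator.not_in.
    return a not in b

SUB_METHOD = SimpleNamespace(name="__sub__", func=operator.sub)
NOT_IN_METHOD = SimpleNamespace(name="not_in", func=not_in)

class MethodMixin:
    """Hypothetical mixin: builds deferred methods from operation specs."""

    _methods_to_add = ()

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        for spec in cls._methods_to_add:
            setattr(cls, spec.name, cls._make_method(spec.func))

    @staticmethod
    def _make_method(func):
        def method(self, other):
            # Defer: the object stands for a value supplied later,
            # which becomes the FIRST argument of func.
            return lambda value: func(value, other)
        return method

class ValidObject(MethodMixin):
    _methods_to_add = [NOT_IN_METHOD, SUB_METHOD]

check = ValidObject().not_in([0, 1, 2])
print(check(1))   # False
print(check(3))   # True
```

Since every generated method passes the object's value as the first argument, a specification built on operator.contains would test the wrong direction, which is why a natural-order function is needed here.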

There is no such issue with reversal of numeric methods, as:

valid_obj = 1 - ValidObject()

Will correctly swap arguments.

BTW, there was a suggestion regarding reversed methods for other functions in the thread “Reversed ops in operator module” (post #2 by tjreedy). I have never needed any of those in practice.

dg-pb · June 12, 2024, 5:31pm

Also, I have used MethodMixin for various applications: ast graphs, symbolic mathematics, deferred evaluation and several others, and operator functions pretty much covered all of the needs. There were just a few that I had to write myself.

There were several, which I think are specific to how I do things and there were some candidates which I thought could be of use to others and general consistency:
a) in_ and not_in
b) logical_or and logical_and and others

a) was there for a while now so I thought maybe it is time to test the waters.

And I haven’t given enough time and thought to b) yet, but I think it could be a good candidate. E.g. the Python array API standard (2023.12 documentation) has both bitwise and logical functions.

encukou · June 13, 2024, 9:22am

You can’t really replace and/or with functions: they do short-circuiting. Function arguments all need to be evaluated.
If that’s not an issue for you, and you also don’t mind getting only True or False back, you can use all/any.
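A small sketch of the difference: logical_and and probe are hypothetical names, and probe records which operands were actually evaluated.

```python
def logical_and(a, b):
    # Same result as "a and b", but both arguments are already evaluated.
    return a and b

evaluated = []

def probe(label, value):
    evaluated.append(label)
    return value

# The operator short-circuits: the right side is never evaluated.
result = probe("left", 0) and probe("right", 1)
print(result, evaluated)   # 0 ['left']

evaluated.clear()

# The function cannot: both arguments are evaluated before the call.
result = logical_and(probe("left", 0), probe("right", 1))
print(result, evaluated)   # 0 ['left', 'right']

# all/any accept an iterable and always return True or False.
print(all([2, 4]), any([0, 4]))   # True True
```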

dg-pb · June 13, 2024, 9:26pm

That is true, but there are 2 components to this:
a) logical operation
b) short circuiting

It is impossible to immediately address b), but it doesn’t mean that having a) in operator module is necessarily a bad idea.

opr.and_(2, 4)    -> 0    (bitwise, whereas the logical 2 and 4 gives 4)

Having logical operation, even if it does not propagate short circuiting could be useful.

P.S. Once short-circuiting is addressed properly, these functions will benefit from it too.

Rosuav · June 13, 2024, 10:58pm

dg-pb:

It is impossible to immediately address b), but it doesn’t mean that having a) in operator module is necessarily a bad idea.
opr.and_(2, 4)    -> 0
Having logical operation, even if it does not propagate short circuiting could be useful.

If you need this, you can easily write it as a lambda function. I don’t think it belongs in the operator module when it has an important difference in behaviour.
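For reference, the lambda versions are one-liners. The names logical_and and logical_or are taken from the earlier post; like any function, they evaluate both arguments before the call.

```python
import operator

logical_and = lambda a, b: a and b
logical_or = lambda a, b: a or b

print(operator.and_(2, 4))   # 0, bitwise: 0b010 & 0b100
print(logical_and(2, 4))     # 4, logical: the last operand, since 2 is truthy
print(logical_and(0, 4))     # 0, logical: the first falsy operand
print(logical_or(0, 4))      # 4, logical: the first truthy operand
```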