PEP 769: Add a 'default' keyword argument to 'attrgetter' and 'itemgetter'

Hello everybody!

I’m very happy to share with you PEP 769, which aims to enhance the operator module by adding a default keyword argument to the attrgetter and itemgetter functions. This addition would allow these functions to return a specified default value when the targeted attribute or item is missing, thereby preventing exceptions and simplifying code that handles optional attributes or items.

Currently, attrgetter and itemgetter raise exceptions if the specified attribute or item is absent. This limitation requires developers to implement additional error handling, leading to more complex and less readable code.

Introducing a default parameter would streamline operations involving optional attributes or items, reducing boilerplate code and enhancing code clarity.

The new PEP is available online. I’ve also included the text at the bottom of this post.

Feedback is welcomed! Thank you very much :slight_smile:

. Facundo


PEP: 769
Title: Add a ‘default’ keyword argument to ‘attrgetter’ and ‘itemgetter’
Author: Facundo Batista facundo@taniquetil.com.ar
Status: Draft
Type: Standards Track
Created: 22-Dec-2024
Python-Version: 3.14

Abstract

This proposal aims to enhance the operator module by adding a
default keyword argument to the attrgetter and itemgetter
functions. This addition would allow these functions to return a
specified default value when the targeted attribute or item is missing,
thereby preventing exceptions and simplifying code that handles optional
attributes or items.

Motivation

Currently, attrgetter and itemgetter raise exceptions if the
specified attribute or item is absent. This limitation requires
developers to implement additional error handling, leading to more
complex and less readable code.

Introducing a default parameter would streamline operations involving
optional attributes or items, reducing boilerplate code and enhancing
code clarity.
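
To make the boilerplate concrete, here is a small sketch of what sorting by an optional attribute looks like today. The records and the priority attribute are made up for illustration; they are not part of the proposal.

```python
from operator import attrgetter
from types import SimpleNamespace

# Hypothetical records: only some of them carry a "priority" attribute.
records = [
    SimpleNamespace(name="a", priority=2),
    SimpleNamespace(name="b"),
    SimpleNamespace(name="c", priority=1),
]

# Plain attrgetter fails as soon as it reaches the record without "priority".
raised = False
try:
    sorted(records, key=attrgetter("priority"))
except AttributeError:
    raised = True

# So today the fallback has to be spelled out by hand.
def priority_or_zero(record):
    try:
        return record.priority
    except AttributeError:
        return 0

names = [r.name for r in sorted(records, key=priority_or_zero)]
print(raised, names)
```

With the proposed parameter, the helper function would presumably collapse into attrgetter("priority", default=0).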

Rationale

The primary design decision is to introduce a single default parameter
applicable to all specified attributes or items.

This approach maintains simplicity and avoids the complexity of assigning
individual default values to multiple attributes or items. While some
discussions considered allowing multiple defaults, the increased
complexity and potential for confusion led to favoring a single default
value for all cases (more about this below in
`Rejected Ideas <PEP 769 Rejected Ideas_>`__).

Specification

Proposed behaviours:

  • attrgetter: f = attrgetter("name", default=XYZ) followed by
    f(obj) would return obj.name if the attribute exists, else
    XYZ.

  • itemgetter: f = itemgetter(2, default=XYZ) followed by
    f(obj) would return obj[2] if that is valid, else XYZ.

This enhancement applies to single and multiple attribute/item
retrievals, with the default value returned for any missing attribute or
item.

Behaviour is unchanged when default is not used.

Examples for attrgetter

Current behaviour, no changes introduced::

>>> class C:
...   class D:
...     class X:
...       pass
...   class E:
...     pass
...
>>> attrgetter("D")(C)
<class '__main__.C.D'>
>>> attrgetter("badname")(C)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'C' has no attribute 'badname'
>>> attrgetter("D", "E")(C)
(<class '__main__.C.D'>, <class '__main__.C.E'>)
>>> attrgetter("D", "badname")(C)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'C' has no attribute 'badname'
>>> attrgetter("D.X")(C)
<class '__main__.C.D.X'>
>>> attrgetter("D.badname")(C)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'D' has no attribute 'badname'

Using default::

>>> attrgetter("D", default="noclass")(C)
<class '__main__.C.D'>
>>> attrgetter("badname", default="noclass")(C)
'noclass'
>>> attrgetter("D", "E", default="noclass")(C)
(<class '__main__.C.D'>, <class '__main__.C.E'>)
>>> attrgetter("D", "badname", default="noclass")(C)
(<class '__main__.C.D'>, 'noclass')
>>> attrgetter("D.X", default="noclass")(C)
<class '__main__.C.D.X'>
>>> attrgetter("D.badname", default="noclass")(C)
'noclass'

Examples for itemgetter

Current behaviour, no changes introduced::

>>> obj = ["foo", "bar", "baz"]
>>> itemgetter(1)(obj)
'bar'
>>> itemgetter(5)(obj)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range
>>> itemgetter(1, 0)(obj)
('bar', 'foo')
>>> itemgetter(1, 5)(obj)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range

Using default::

>>> itemgetter(1, default="XYZ")(obj)
'bar'
>>> itemgetter(5, default="XYZ")(obj)
'XYZ'
>>> itemgetter(1, 0, default="XYZ")(obj)
('bar', 'foo')
>>> itemgetter(1, 5, default="XYZ")(obj)
('bar', 'XYZ')

.. _PEP 769 About Possible Implementations:

About Possible Implementations

For the case of attrgetter, the implementation is quite direct: it uses
getattr and catches a possible AttributeError. So
attrgetter("name", default=XYZ)(obj) would be like::

try:
    value = getattr(obj, "name")
except AttributeError:
    value = XYZ

Note we cannot rely on calling getattr with a default value, as it would
be impossible to distinguish what it returned on each step when an
attribute chain is specified (e.g.
attrgetter("foo.bar.baz", default=XYZ)).
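
As a rough illustration of the chain handling, the following pure-Python sketch walks a dotted name one step at a time and falls back to the default on the first missing step. The name attrgetter_with_default and the _MISSING sentinel are assumptions made for this example; the real implementation would live in the operator module and may differ.

```python
_MISSING = object()  # sentinel, so that None stays usable as a default


def attrgetter_with_default(*names, default=_MISSING):
    """Pure-Python approximation of attrgetter(..., default=...)."""
    if not names:
        raise TypeError("expected at least one attribute name")

    def resolve(obj, dotted_name):
        # Walk the chain step by step; a missing step yields the default.
        for part in dotted_name.split("."):
            try:
                obj = getattr(obj, part)
            except AttributeError:
                if default is _MISSING:
                    raise  # no default given: keep today's behaviour
                return default
        return obj

    def getter(obj):
        if len(names) == 1:
            return resolve(obj, names[0])
        return tuple(resolve(obj, name) for name in names)

    return getter


class C:
    class D:
        class X:
            pass

    class E:
        pass


print(attrgetter_with_default("D.X", default="noclass")(C))
print(attrgetter_with_default("D.badname", default="noclass")(C))
print(attrgetter_with_default("D", "badname", default="noclass")(C))
```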

For the case of itemgetter it’s not that easy. The most
straightforward way is similar to the above, and also simple to define and
understand: attempting __getitem__ and catching a possible exception
(any of the three indicated in the __getitem__ reference). This way,
itemgetter(123, default=XYZ)(obj) would be equivalent to::

try:
    value = obj[123]
except (TypeError, IndexError, KeyError):
    value = XYZ

However, this would not be as efficient as we’d want in particular cases,
e.g. with dictionaries, where good performance is especially desirable. A
more complex alternative would be::

if isinstance(obj, dict):
    value = obj.get(123, XYZ)
else:
    try:
        value = obj[123]
    except (TypeError, IndexError, KeyError):
        value = XYZ

This gives better performance, but is more complicated to implement and
explain. This is the first case in the
`Open Issues <PEP 769 Open Issues_>`__ section later.
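
For reference, the simpler try/except model extended to several items could be sketched in pure Python as follows. The name itemgetter_with_default and the _MISSING sentinel are assumptions for this example, not part of the proposal.

```python
_MISSING = object()  # sentinel, so that None stays usable as a default


def itemgetter_with_default(*items, default=_MISSING):
    """Pure-Python approximation of itemgetter(..., default=...)."""
    if not items:
        raise TypeError("expected at least one item")

    def lookup(obj, item):
        try:
            return obj[item]
        except (TypeError, IndexError, KeyError):
            if default is _MISSING:
                raise  # no default given: keep today's behaviour
            return default

    def getter(obj):
        if len(items) == 1:
            return lookup(obj, items[0])
        return tuple(lookup(obj, item) for item in items)

    return getter


obj = ["foo", "bar", "baz"]
print(itemgetter_with_default(5, default="XYZ")(obj))
print(itemgetter_with_default(1, 5, default="XYZ")(obj))
print(itemgetter_with_default("a", default=0)({}))
```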

Corner Cases

Providing a default option would only work when accessing the
item/attribute would fail in a regular situation. In other words, the
object accessed should not handle defaults itself.

For example, the following would be redundant/confusing because
defaultdict will never error out when accessing the item::

>>> from collections import defaultdict
>>> from operator import itemgetter
>>> dd = defaultdict(int)
>>> itemgetter("foo", default=-1)(dd)
0

The same applies to any user-built object that overloads __getitem__
or __getattr__ implementing fallbacks.
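
The effect can be reproduced today with a small stand-in for the try/except model. The helper get_item_or_default and the Lenient class are hypothetical names used only for this sketch.

```python
from collections import defaultdict


def get_item_or_default(obj, key, default):
    # Stand-in for itemgetter(key, default=default)(obj), try/except model.
    try:
        return obj[key]
    except (TypeError, IndexError, KeyError):
        return default


class Lenient:
    """Hypothetical object whose __getitem__ never fails."""

    def __getitem__(self, key):
        return "fallback"


dd = defaultdict(int)
first = get_item_or_default(dd, "foo", -1)       # the defaultdict answers first
inserted = "foo" in dd                           # the lookup also created the key
lenient = get_item_or_default(Lenient(), 3, -1)  # the object handles it itself

# dict.get does not go through __missing__, so a dict-specific fast path
# would behave differently for the same defaultdict.
via_get = defaultdict(int).get("foo", -1)

print(first, inserted, lenient, via_get)
```

Note the last line: if the implementation used .get for dict instances, a defaultdict would return the getter's default rather than its own, which is one more detail to weigh in the open issue about dictionaries.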

.. _PEP 769 Rejected Ideas:

Rejected Ideas

Multiple Default Values

The idea of allowing multiple default values for multiple attributes or
items was considered.

Two alternatives were discussed: using an iterable that must have the
same number of items as the parameters given to
attrgetter/itemgetter, or using a dictionary with keys matching
the names passed to attrgetter/itemgetter.

The really complex thing to solve in these cases, which would make the
feature hard to explain and give it confusing corners, is what would happen
if an iterable or dictionary is the single default desired for all
items. For example::

>>> itemgetter("a", default=(1, 2))({})
(1, 2)
>>> itemgetter("a", "b", default=(1, 2))({})
((1, 2), (1, 2))

If we allow “multiple default values” using default, the first case
in the example above would raise an exception because there are more items
in the default than names, and the second case would return (1, 2). This is
why the possibility emerged of using a different name for multiple
defaults (defaults, which is expressive but perhaps error-prone because
it is too similar to default).

As part of this conversation there was another proposal that would enable
multiple defaults, which is allowing combinations of attrgetter and
itemgetter, e.g.::

>>> ig_a = itemgetter("a", default=1)
>>> ig_b = itemgetter("b", default=2)
>>> ig_combined = itemgetter(ig_a, ig_b)
>>> ig_combined({"a": 999})
(999, 2)
>>> ig_combined({})
(1, 2)

However, combining itemgetter or attrgetter is a totally new
behaviour that is very complex to define; not impossible, but beyond the
scope of this PEP.

In the end, having multiple default values was deemed overly complex and
potentially confusing, and a single default parameter was favored for
simplicity and predictability.

Tuple Return Consistency

Another rejected proposal was adding a flag to always return a tuple,
regardless of how many keys/names/indices were passed as arguments.
E.g.::

>>> letters = ["a", "b", "c"]
>>> itemgetter(1, return_tuple=True)(letters)
('b',)
>>> itemgetter(1, 2, return_tuple=True)(letters)
('b', 'c')

This would be of some help for consistency with multiple default values,
but it requires further discussion and is certainly out of the scope of this
PEP.
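
For anyone who needs that behaviour today, a thin wrapper over the existing itemgetter is enough. The helper name itemgetter_as_tuple is made up for this sketch.

```python
from operator import itemgetter


def itemgetter_as_tuple(*items):
    """Hypothetical helper: like itemgetter, but always returns a tuple."""
    getter = itemgetter(*items)
    if len(items) == 1:
        # A single item normally comes back bare; wrap it in a 1-tuple.
        return lambda obj: (getter(obj),)
    return getter


letters = ["a", "b", "c"]
print(itemgetter_as_tuple(1)(letters))
print(itemgetter_as_tuple(1, 2)(letters))
```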

.. _PEP 769 Open Issues:

Open Issues

Behaviour Equivalence for itemgetter

We need to define how itemgetter would behave: either just attempt to
access the item and capture exceptions regardless of the type of the object,
or first check whether the object provides a get method and use it to
retrieve the item with a default. See the examples in the
`About Possible Implementations <PEP 769 About Possible Implementations_>`__
subsection above.

This would help performance for the case of dictionaries, but would make
the default feature somewhat more difficult to explain, and a little
confusing if some object that is not a dictionary but provides a get
method is used. Alternatively, we could call .get only if the
object is an instance of dict.

In any case, a desirable situation is that we do not affect performance
at all if the default is not triggered. Checking for .get would
produce the default faster in the case of dicts, but implies an extra
check in all cases. Using the try/except model would make the default path
slower than it could be for dictionaries, but would not introduce delays if
the default is not triggered.
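
A quick way to get a feel for this trade-off is to time both models at the Python level. This is only a sketch: the function names are invented here, and the numbers say little about an eventual C implementation, so no results are quoted.

```python
import timeit


def via_try_except(obj, key, default):
    # Always attempt the access and catch the failure.
    try:
        return obj[key]
    except (TypeError, IndexError, KeyError):
        return default


def via_dict_check(obj, key, default):
    # Special-case dictionaries, fall back to try/except otherwise.
    if isinstance(obj, dict):
        return obj.get(key, default)
    return via_try_except(obj, key, default)


data = {"present": 1}
for label, key in (("hit", "present"), ("miss", "absent")):
    for func in (via_try_except, via_dict_check):
        seconds = timeit.timeit(lambda: func(data, key, None), number=100_000)
        print(f"{label:4} {func.__name__:15} {seconds:.4f}s")
```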

Add a Default to getitem

It was proposed that we could also enhance getitem as part of
this PEP, adding default to it as well.

This would not only improve getitem itself, but we would also gain
internal consistency in the operator module and in comparison with
the getattr builtin function, which also has a default.

The definition could be as simple as the try/except proposed above, so
doing getitem(obj, name, default) would be equivalent to::

try:
    result = obj[name]
except (TypeError, IndexError, KeyError):
    result = default

(However, see the previous open issue about a special case for dictionaries.)
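
One detail the snippet above leaves open is how to tell "no default given" apart from default=None. A common approach, shown here only as an assumption about how it might be done, is a private sentinel, mirroring how getattr raises when called with two arguments. The name getitem_with_default is hypothetical.

```python
_MISSING = object()  # sentinel meaning "caller passed no default"


def getitem_with_default(obj, name, default=_MISSING):
    """Sketch of operator.getitem extended with an optional default."""
    try:
        return obj[name]
    except (TypeError, IndexError, KeyError):
        if default is _MISSING:
            raise  # two-argument form keeps raising, like getattr(obj, name)
        return default


print(getitem_with_default({"a": 1}, "a"))
print(getitem_with_default({"a": 1}, "b", None))
print(getitem_with_default([10, 20], 5, "n/a"))
```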

How to Teach This

As the basic behaviour is not modified, this new default can be
avoided when teaching attrgetter and itemgetter for the first
time, and can be introduced only when the need arises.

Backwards Compatibility

The proposed changes are backward-compatible. The default parameter
is optional; existing code without this parameter will function as
before. Only code that explicitly uses the new default parameter will
exhibit the new behavior, ensuring no disruption to current
implementations.

Security Implications

Introducing a default parameter does not inherently introduce
security vulnerabilities.

Copyright

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.

Link to previous discussion: Allowing missing item or attributes in operator's itemgetter and attrgetter

Thanks for writing this PEP!

Aside: I’ll have a PR shortly to fix a few typos and rephrase a few things for clarity. I hope it helps!

Regarding defaults for multiple, dot-separated names in attrgetter(), it occurs to me – and I haven’t thought this fully through – that you could allow default to be a callable, which would have an API along the lines of:

def resolve(
    success_path: str,
    fail_path: str,
    last_object: object,
) -> object:
    ...

The idea being, let’s say you do this:

>>> attrgetter('a.b.c.d', default=resolve)(obj)

and let’s say you have a.b.c but that object has no d attribute. resolve() would get called with:

resolve('a.b.c', 'd', a.b.c)

The question that comes to mind is whether you could use this to implement the ?. functionality in PEP 505 this way, i.e. None-aware attribute access? Forget about the syntax in PEP 505, but just the functional equivalent?

Maybe – and I really haven’t thought about this! – something similar for itemgetter() and ?[], i.e. the None-aware indexing operator.

Maybe it’s not possible, or the signature of resolve() needs to be changed, or doesn’t make sense, but I wanted to throw it out there as it doesn’t appear to be covered in the rejected ideas of PEP 769.

There’s one glitch: what if you wanted to return a callable as the default? I think there are possible workarounds, such as defining a protocol/interface that a default callable must adhere to, or always using a 1-tuple default such as (function,).
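
For what it’s worth, the resolve() idea above could be prototyped in pure Python roughly like this. Everything here is an assumption for illustration: the name attrgetter_with_resolver, passing the resolver as a separate argument, and in particular what fail_path contains when the chain breaks before the last name (this sketch passes the whole unresolved remainder).

```python
from types import SimpleNamespace


def attrgetter_with_resolver(dotted_name, resolver):
    """Call resolver(success_path, fail_path, last_object) on a miss."""
    parts = dotted_name.split(".")

    def getter(obj):
        current = obj
        for index, part in enumerate(parts):
            try:
                current = getattr(current, part)
            except AttributeError:
                success_path = ".".join(parts[:index])  # what did resolve
                fail_path = ".".join(parts[index:])     # what is left over
                return resolver(success_path, fail_path, current)
        return current

    return getter


calls = []


def resolve(success_path, fail_path, last_object):
    # None-aware flavour: record the miss and give back None.
    calls.append((success_path, fail_path, last_object))
    return None


c = SimpleNamespace()  # has no "d" attribute
obj = SimpleNamespace(a=SimpleNamespace(b=SimpleNamespace(c=c)))
result = attrgetter_with_resolver("a.b.c.d", resolve)(obj)
print(result, calls[0][:2])
```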

But what if you want to return a callable as the result? There’s no way to distinguish between these cases.

Similar to this:

getattr(sys, "get_int_max_str_digits", lambda: 0)()

Would probably need to add something along the lines of DefaultResolver mixin, then if isinstance(default, DefaultResolver):...

Wouldn’t it make the most sense to add this as a separate keyword (most likely as a separate PEP)?

That’s the glitch I mentioned above:

I’m inclined to think this is over-complicating the API. Much like PEP 505, there are a lot of options that people might want, and no clear “best” one. The attrgetter and itemgetter functions are largely for performance (after all, they are equivalent to a lambda function), and it’s not at all clear that the sorts of situations where a default callable would be useful are performance critical (or that calling the default function is faster than simply writing the whole thing as a Python function in the first place).

Let’s keep PEP 769 simple.

Agreed. I made the suggestion not because I think it’s necessarily a good suggestion[1], but because it isn’t covered in the PEP, and it might spur someone to think of a better way. I also wanted to explore possibly functional, non-syntactic support for PEP 505, which TBH I am not a fan of.


  1. I can’t remember the last time I wanted a default for itemgetter or attrgetter, with or without the suggestion.

Let’s just put this (and any other “attempt to support the use cases people want PEP 505 for”) in the “Rejected Ideas” section, then.

I’m also not a fan of PEP 505. I really don’t see why existing non-syntactic solutions like the glom library are insufficient :slightly_frowning_face:

Sorry, I stopped reading your post when you started to talk about safe navigation.

@pf_moore @barry Yes, I agree. I also want to keep this PEP simple.

Having a straightforward default is simple and easy to explain, easy to remember, and will provide value to the users. I don’t like to make it more complicated with specific details (“if this that you pass is a function inside a tuple of length one and you press enter with your left hand while singing a song, it will do foo and bar”… just a note of humor :slight_smile: ).

I’ll expand the “Rejected Ideas” section a little to cover some of this.

But so far the PEP is sound?

What do you think about the open issues? Using .get when it’s a dictionary, or always try/except, and also adding a default to getitem?

Thank you all for the feedback!