Allowing missing items or attributes in operator's itemgetter and attrgetter

Hello!

I was about to refactor some code and was surprised I didn’t have this option. It’s not that I need it in a lot of places, but as IMO it totally makes sense, I thought about gathering opinions here.

So, the idea: add a keyword-only default option to attrgetter and itemgetter; if the item or attribute is not present, that default would be returned.

IOW, today we have that after f = attrgetter('name'), the call f(b) returns b.name, and with this new option we could have that after f = attrgetter('name', default=XYZ), the call f(b) returns getattr(b, 'name', XYZ).

Correspondingly, today we have that after f = itemgetter(2), the call f(r) returns r[2], and with this new option we could have that after f = itemgetter(2, default=XYZ), the call f(r) returns r.get(2, XYZ).

Of course this would also work with multiple items or attributes.
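
For anyone who wants to play with the semantics, here is a rough pure-Python sketch of the idea. The names attrgetter_default, itemgetter_default and _MISSING are hypothetical helpers made up for illustration; nothing like them exists in operator today. The item version catches KeyError and IndexError rather than calling .get, which is an assumption made so that it also works on lists.

```python
from operator import attrgetter, itemgetter
from types import SimpleNamespace

_MISSING = object()  # hypothetical sentinel meaning "no default was given"


def attrgetter_default(*names, default=_MISSING):
    """Hypothetical stand-in for attrgetter(*names, default=...)."""
    if default is _MISSING:
        return attrgetter(*names)  # behaviour unchanged without a default
    getters = [attrgetter(name) for name in names]  # keeps dotted names working

    def get_one(getter, obj):
        try:
            return getter(obj)
        except AttributeError:
            return default

    def call(obj):
        values = tuple(get_one(g, obj) for g in getters)
        return values[0] if len(values) == 1 else values

    return call


def itemgetter_default(*items, default=_MISSING):
    """Hypothetical stand-in for itemgetter(*items, default=...)."""
    if default is _MISSING:
        return itemgetter(*items)

    def get_one(obj, item):
        try:
            return obj[item]
        except (KeyError, IndexError):  # assumption: covers mappings and sequences
            return default

    def call(obj):
        values = tuple(get_one(obj, item) for item in items)
        return values[0] if len(values) == 1 else values

    return call


b = SimpleNamespace(name="spam")
print(attrgetter_default("name", default="XYZ")(b))        # spam
print(attrgetter_default("name", "size", default=0)(b))    # ('spam', 0)
print(itemgetter_default(2, default="XYZ")(["a", "b"]))    # XYZ
print(itemgetter_default("a", "b", default=0)({"a": 1}))   # (1, 0)
```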

What do you think?

Thanks!

It would partially cover the “Indexable get method” idea, e.g. [1, 2, 3].get(4)  # None.

At least for cases where pre-building a getter is sensible.

E.g. it would be a good option for getting parameters from a sequence:

get_params = itemgetter(*range(5), default=EMPTY)

assert len(sys.argv) <= 5
_, a, b, c, d = get_params(sys.argv)

Apart from that, I use itemgetter and attrgetter extensively and making these more functional would certainly be beneficial.

Also, one desirable feature for me would be a flag to always return a tuple, regardless of how many keys/names/indices are passed as arguments.

The way it is now makes programmatic use a bit complicated: you need a special case for when there is only 1 arg so that the output is consistently a tuple of values.

E.g.:

IG1 = itemgetter(1, return_tuple=True)
print(IG1(['a', 'b']))    # ('b',)
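
Until something like that exists, the special case has to live in a wrapper. A minimal sketch of what I mean, where itemgetter_tuple is a hypothetical helper name and return_tuple is only the flag proposed above, not a real parameter:

```python
from operator import itemgetter


def itemgetter_tuple(*items):
    """Hypothetical equivalent of itemgetter(*items, return_tuple=True)."""
    if len(items) == 1:
        # itemgetter with a single item returns the bare value, so wrap it.
        getter = itemgetter(items[0])
        return lambda obj: (getter(obj),)
    # With two or more items itemgetter already returns a tuple.
    return itemgetter(*items)


IG1 = itemgetter_tuple(1)
IG2 = itemgetter_tuple(0, 1)
print(IG1(['a', 'b']))    # ('b',)
print(IG2(['a', 'b']))    # ('a', 'b')
```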

E.g. I used a fairly ugly conditional to work around this in: gh-124652: partialmethod simplifications by dg-pb · Pull Request #124788 · python/cpython · GitHub

As was previously discussed in Ability to specify default values on itemgetter and attrgetter, I think it’s important to address how default values are specified when multiple items are to be returned.

It can be one default value for all items:

itemgetter('a', 'b', default=0)({}) # returns (0, 0)

Or a mapping of default values:

itemgetter('a', 'b', default={'a': 1, 'b': 0})({}) # returns (1, 0)

Or a tuple of default values:

itemgetter('a', 'b', default=(1, 0))({}) # returns (1, 0)

I’d love to have this feature too, but it should probably be discussed in a separate thread.

Hey everybody! Thanks for your feedback!

I think that having a default for each item opens up a lot of complexity, considering that what we really want is to be able to express a simple default.

One item with one default is easy and straightforward, and several items with one overall default is easy to explain and understand.

But the moment we want to tackle multiple defaults, it gets more complex and the behaviour is not easily predictable, regardless of whether we use a dict or a tuple. What if the number of entries in the default does not match the number of items? (or the names, in the case of the dict)

The really complex thing to solve here is “what if I want to pass a dict or a tuple as a single default?” E.g.:

>>> itemgetter('a', default=(1, 2))({})
(1, 2)
>>> itemgetter('a', 'b', default=(1, 2))({})
((1, 2), (1, 2))

We could solve these situations by having two new parameters: default for the case of one overall default value, and defaults for multiple ones (being a tuple or a dict)… I think we need to answer some questions here…

  • is this case of multiple defaults really that common?
  • aren’t default and defaults too similar? (may be confusing).
  • is it a good path into the future to implement default first and learn from usage whether defaults is really needed?
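
To make the default / defaults split a bit more concrete, here is a rough sketch of how the two parameters could interact. Everything in it is an assumption for the sake of discussion: the helper name itemgetter_defaults, the _MISSING sentinel, raising ValueError on a length mismatch, and letting default fill the gaps of a partial defaults dict.

```python
_MISSING = object()  # hypothetical sentinel meaning "no default for this item"


def itemgetter_defaults(*items, default=_MISSING, defaults=None):
    """Hypothetical: `default` is one overall fallback, `defaults` is per item."""
    if defaults is None:
        fallbacks = [default] * len(items)
    elif isinstance(defaults, dict):
        # Names missing from the dict fall back to the overall default.
        fallbacks = [defaults.get(item, default) for item in items]
    else:
        if len(defaults) != len(items):
            raise ValueError("defaults must have one entry per item")
        fallbacks = list(defaults)
    # Resolved once here, so calls never touch the defaults container again.
    pairs = tuple(zip(items, fallbacks))

    def getter(obj):
        out = []
        for item, fallback in pairs:
            try:
                out.append(obj[item])
            except (KeyError, IndexError):
                if fallback is _MISSING:
                    raise
                out.append(fallback)
        return out[0] if len(out) == 1 else tuple(out)

    return getter


# A tuple passed as `default` is always one single value:
print(itemgetter_defaults('a', default=(1, 2))({}))                # (1, 2)
print(itemgetter_defaults('a', 'b', default=(1, 2))({}))           # ((1, 2), (1, 2))
# Per-item values only ever go through `defaults`:
print(itemgetter_defaults('a', 'b', defaults=(1, 0))({}))          # (1, 0)
print(itemgetter_defaults('a', 'b', defaults={'a': 1, 'b': 0})({}))  # (1, 0)
```

With this split the “tuple as a single default” case is no longer ambiguous, at the cost of the two very similar names.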

Thanks! Regards,

Just my 2c. The latter (single value) is likely the more intuitive option ~and aligns well with typing (e.g. red squiggles and type errors if you get it wrong).~

You could also make the former still work with a combination of an iterator, also allowing default_factory, and a lambda.

Edit: yeah I’m dumb ignore the length thing (and also apparently I can’t strike through either :man_facepalming:)

Thought of this too. However, the multiple-defaults case would ideally allow partial defaults.

It is easy with a dict, but the tuple case would need a sentinel to indicate NO_DEFAULT.

The issue is that a dict is going to be slow, while the reason I use itemgetter is usually performance.

What about supporting a single default, but providing a way to combine getters? So that the user can only specify 1 default, but can combine different itemgetter objects into 1.

E.g.:

IGA = itemgetter('a', default=1)
IGBC = itemgetter('b', 'c', default=2)
IGD = itemgetter('d')

# Combine?
IG = itemgetter(IGA, IGBC, IGD, default=FBCK_DEF) # Issue: can not have `itemgetter` instance keys
IG = itemgetter.combine(IGA, IGBC, IGD, default=FBCK_DEF)
IG({})    # (1, 2, 2, FBCK_DEF)

So that the user can only specify 1 default, but itemgetter has intrinsic multiple-defaults logic which is used when objects are combined.

This way there is full customisation with simple and intuitive (at least to me) interface.
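
A rough pure-Python sketch of how combine could work, in case it helps the discussion. DefaultItemGetter is a hypothetical class standing in for itemgetter, and the rule that the outer default only fills in items that have no default of their own is my assumption:

```python
_MISSING = object()  # hypothetical sentinel meaning "no default for this item"


class DefaultItemGetter:
    """Hypothetical itemgetter-like class with one default plus combine()."""

    def __init__(self, *items, default=_MISSING):
        # Internally every item carries its own fallback.
        self._pairs = tuple((item, default) for item in items)

    @classmethod
    def combine(cls, *getters, default=_MISSING):
        new = cls()
        # Flatten the getters; the outer default only fills the gaps.
        new._pairs = tuple(
            (item, default if fallback is _MISSING else fallback)
            for getter in getters
            for item, fallback in getter._pairs
        )
        return new

    def __call__(self, obj):
        out = []
        for item, fallback in self._pairs:
            try:
                out.append(obj[item])
            except (KeyError, IndexError):
                if fallback is _MISSING:
                    raise
                out.append(fallback)
        return out[0] if len(out) == 1 else tuple(out)


FBCK_DEF = 'fallback'
IGA = DefaultItemGetter('a', default=1)
IGBC = DefaultItemGetter('b', 'c', default=2)
IGD = DefaultItemGetter('d')
IG = DefaultItemGetter.combine(IGA, IGBC, IGD, default=FBCK_DEF)
print(IG({}))                    # (1, 2, 2, 'fallback')
print(IG({'a': 10, 'd': 40}))    # (10, 2, 2, 40)
```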

Hey! Thanks for the idea… I find this combination of itemgetters really complex. I’m reluctant to propose something when it is already hard to explain how it would work :confused:

In terms of usage or implementation? Or both?

Either way, I think implementing single default could be a fairly safe bet as so far it is a common factor in both:

  1. itemgetter(..., default=..., defaults=tuple | dict).
  2. itemgetter.combine(itemgetter(..., default=), ..., default=)

Any other reasonable paths forward where single default implementation would not fit in?

I was wrong about the defaults=dict necessarily being slow. Can just preprocess so that calls do not use dict access.

Also, how would it behave with defaultdict-like containers?

If the question is how itemgetter('a', default=123)(x) would behave when x is a defaultdict or similar, the behaviour is simple and straightforward: it would do the same as x.get("a", 123).

If the question was different, I’m not understanding it, sorry :confused:

No, you understood correctly. Maybe I didn’t give enough context for why I asked this.

There are different ways to apply the default:

# 1.
try:
    return obj[item]
except (KeyError, IndexError):
    return default

# 2.
return obj.get(item, default)

# 3. To replicate 2. without calling `get` method:
if item in obj:
    return obj[item]
else:
    return default

2. and 3. only work for dict and not sequence, while 1. is incorrect for dict. Maybe there is some good way to make this correct and simple.
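
To show where the three differ, here is a small runnable comparison on a list and on a defaultdict. The function names get_try, get_method and get_contains are just labels for the three variants above, not proposed API:

```python
from collections import defaultdict


def get_try(obj, item, default):        # variant 1
    try:
        return obj[item]
    except (KeyError, IndexError):
        return default


def get_method(obj, item, default):     # variant 2
    return obj.get(item, default)


def get_contains(obj, item, default):   # variant 3
    return obj[item] if item in obj else default


seq = ['a', 'b']
print(get_try(seq, 5, None))         # None
# get_method(seq, 5, None) raises AttributeError: lists have no .get
# `1 in seq` tests values, not indices, so the default wins even though seq[1] exists:
print(get_contains(seq, 1, None))    # None

dd = defaultdict(int)
print(get_method(dd, 'x', 123))      # 123, dd is left untouched
print(get_contains(dd, 'x', 123))    # 123, dd is left untouched
# __getitem__ goes through default_factory, so the passed default is never used
# and the key gets inserted as a side effect:
print(get_try(dd, 'x', 123))         # 0
print('x' in dd)                     # True
```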

In short, I agree, it should work exactly the same as dict.get.

Ahhhhhhh, this is a really good question!!!

I think (1) would better match the behaviour that we want to produce, but wouldn’t it be a little expensive when the default kicks in?

(2) is what I originally had in mind, but it will only work with dicts (or whatever has get implemented), not with lists, for example.

(3) is discarded, I think: not only does it work in just some cases, it would also be more expensive for all cases (IOW, the “current” ones).

I don’t understand why you say that (1) would be incorrect for dicts.

Furthermore, I think (1) is the “way to go”, with the caveat of noting explicitly in the docs that it may be more expensive than doing it all by hand if you know which data type you’re dealing with.

BTW, this “complexity” only applies to itemgetter, not attrgetter, right?

I meant for defaultdicts or any object that already implements fallback defaults.

E.g. ChainMap does (3) to account for this.

But ChainMap is expected to be used with mappings and I agree that this doesn’t need to be the same.

I think leaving this to behave as (1) might be best.
If the mental model is that itemgetter calls __getitem__ and returns the default on error, then everything works simply and well.

attrgetter potentially has the same issue, but I don’t think there is a need to overcomplicate these.

I think these are mostly used for the performance gain of a pre-built getter. In this case anyone who has a defaultdict can “pre-compile” it to a dict or employ some other strategy to do the right thing depending on the situation.

Would it make sense to add an optional default to operator.getitem at the same time?
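
Something along these lines, mirroring the three-argument form of getattr. This is only a sketch: the wrapper and the _MISSING sentinel are hypothetical, and operator.getitem itself takes no default today.

```python
import operator

_MISSING = object()  # hypothetical sentinel meaning "no default was given"


def getitem(obj, item, default=_MISSING):
    """Hypothetical operator.getitem with an optional default, like getattr."""
    try:
        return operator.getitem(obj, item)
    except (KeyError, IndexError):
        if default is _MISSING:
            raise
        return default


print(getitem(['a', 'b'], 1))          # b
print(getitem(['a', 'b'], 5, None))    # None
print(getitem({'k': 1}, 'z', 0))       # 0
```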

Maybe! getattr does have it!
