Add safe `.get` method to List

NoahTheDuke · August 29, 2023, 6:27pm

To access an item in a dictionary, you use indexing: d["some_key"]. However, if the key doesn’t exist in the dictionary, a KeyError is raised. To avoid this, you can use .get and pass in a default value to return instead: d.get("some_key", "default value").
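A minimal sketch of the difference described above; the dictionary contents are made up for illustration:

```python
d = {"some_key": 1}

# Plain indexing raises KeyError when the key is absent.
try:
    d["missing"]
except KeyError:
    missing_raised = True
else:
    missing_raised = False

# .get returns the supplied default instead; with no default it returns None.
with_default = d.get("missing", "default value")
without_default = d.get("missing")

print(missing_raised, with_default, without_default)
```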

Lists don’t have such a method for safely indexing. There are many ways around it (as documented in many SO threads: thread 1, thread 2), but they’re all pretty awkward and they aren’t composable or chain-able the same way that .get() is.

This could maybe be applied to Tuples, but I’d prefer to keep scope small. And I don’t think this applies to Iterables in general, but I don’t know enough details about those to say.

I searched the threads and PEPs but not the mailing lists and I haven’t seen anything recommending this. I looked at the code and I think this shouldn’t be too hard to implement. I’d be willing to take a swing at it, or at writing a PEP, if there’s any interest.

barry-scott · August 29, 2023, 6:31pm

This comes up every now and again.

The argument against this is that lists are either iterated over or the code already knows their length, so a safe get isn’t useful.

Do you have a case where that argument is not valid and not a rare use case?

NoahTheDuke · August 29, 2023, 7:59pm

I understand that argument, but it’s not true that lists are only iterated. For example, I have a list of objects inside of a nested dictionary. I want to get a property of the first item in the list:

state = {"corp": {"hand": [{"title": "Sure Gamble"}]}}

state.get("corp", {}).get("hand", [])[0].get("title", "")

The .get() calls allow me to write a somewhat fluent and safe getter for the nested objects. But this blows up if the list is empty. To work around that, I have to store in a variable and then either check the length of the list or wrap the indexing in a try/except. If I’m using a variable index instead of a literal 0, then I need to use try/except because of negative indexes.
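To make the negative-index point concrete, here is a rough sketch of the two workarounds. The function names and the `hand` data are hypothetical, chosen only for illustration. A length check that guards only the upper bound still fails for an out-of-range negative index, while try/except covers both directions:

```python
hand = [{"title": "Sure Gamble"}]


def card_by_length_check(cards, idx):
    # Naive look-before-you-leap: only the upper bound is guarded.
    if idx < len(cards):
        return cards[idx]
    return {}


def card_by_try_except(cards, idx):
    # Catches out-of-range indexes in both directions.
    try:
        return cards[idx]
    except IndexError:
        return {}


print(card_by_length_check(hand, 5))   # {}
print(card_by_try_except(hand, -5))    # {}
# card_by_length_check(hand, -5) raises IndexError: -5 < len(cards) is true,
# but the list has no element at position -5.
```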

That’s a fairly complex example, but I think even the simple examples would benefit. For example, I want to look at the current item and the next item in a for loop. The following code blows up but with a safe get, it would do the “right” thing without blowing up:

lst = list(range(10))
for idx, item in enumerate(lst):
    next_item = lst[idx + 1]
    print(item, next_item)
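For comparison, a sketch of how the loop might behave with a safe get. `list_get` is a hypothetical helper standing in for the proposed method; `list` has no such method today:

```python
def list_get(seq, index, default=None):
    """Hypothetical stand-in for the proposed list.get."""
    try:
        return seq[index]
    except IndexError:
        return default


lst = list(range(10))
pairs = []
for idx, item in enumerate(lst):
    next_item = list_get(lst, idx + 1)  # None on the last iteration
    pairs.append((item, next_item))

print(pairs[-1])  # (9, None)
```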

I know the argument that there are ways around these situations (see the SO threads linked above) or that a simple if/then is easy to use, so we shouldn’t bloat the standard library. I think that such an argument exists for dictionaries as well. Why can’t people just use if "some_key" in d: ret = d["some_key"] else: ret = "default value"?

Given that dict.get() is both used and loved enough to make people want it for list, I see great value in adding it to list.

pf_moore · August 29, 2023, 8:09pm

Maybe look at the glom library on PyPI, which as far as I know is designed to handle this sort of use case specifically.

As far as the proposal here is concerned, I find the chained get expression incredibly difficult to read, and I’m -1 on adding features to the language that encourage this type of thing.

Rosuav · August 29, 2023, 8:33pm

Noah Bogart:

state.get("corp", {}).get("hand", [])[0].get("title", "")

The .get() calls allow me to write a somewhat fluent and safe getter for the nested objects. But this blows up if the list is empty. To work around that, I have to store in a variable and then either check the length of the list or wrap the indexing in a try/except. If I’m using a variable index instead of a literal 0, then I need to use try/except because of negative indexes.

Honestly, this sounds like a great use for PEP 463…

NoahTheDuke · August 29, 2023, 8:40pm

I don’t think the chained .get() is common, just something I ran into today. I suspect the majority of use-cases are single calls.

If that’s your final opinion tho, I’ll trust your opinion and leave it be. Thanks for the feedback.

NoahTheDuke · August 29, 2023, 8:42pm

Thanks for the reference PEP. It’s interesting to see this situation referenced but a completely alternate approach taken (and subsequently rejected).

Rosuav · August 29, 2023, 8:56pm

Yeah. Part of the reasoning is that the situations where this would be useful are relatively rare, and are often best handled with a path-traversal function, such as:

def get(state, *path):
    try:
        for step in path:
            state = state[step]
    except LookupError:
        return None
    return state

get(state, "corp", "hand", 0, "title")

If you wanted, you could even make this a __getitem__ method on a class that always returns another instance of itself, although personally, I’d go for the simpler option of a helper function (since this sort of data often gets loaded/saved in JSON and it’s easier to use real dicts and lists).
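One possible shape for the `__getitem__` variant, as a sketch only. The class name `Safe`, the `unwrap` method, and the sentinel are assumptions made for this example and do not come from any library:

```python
_MISSING = object()  # sentinel meaning "a lookup along the path failed"


class Safe:
    """Hypothetical wrapper: indexing never raises, it returns another Safe."""

    def __init__(self, value=_MISSING):
        self._value = value

    def __getitem__(self, step):
        if self._value is _MISSING:
            return self  # already failed; keep propagating
        try:
            return Safe(self._value[step])
        except LookupError:  # covers both KeyError and IndexError
            return Safe()

    def unwrap(self, default=None):
        return default if self._value is _MISSING else self._value


state = {"corp": {"hand": [{"title": "Sure Gamble"}]}}
print(Safe(state)["corp"]["hand"][0]["title"].unwrap(""))  # Sure Gamble
print(Safe(state)["corp"]["hand"][3]["title"].unwrap(""))  # prints an empty line
```

The wrapper has to be unwrapped at the end, which is one reason a plain helper function over real dicts and lists may be simpler.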

NoahTheDuke · August 30, 2023, 2:52pm

That’s true, and the alternatives are fairly easy. It’s just nice to provide a built-in that makes specific use-cases easier and covers what looks like a gap in the API. Given that the first reply said it comes up “every now and again” and I found two heavily upvoted threads on Stack Overflow, I’d expect that to weigh a little bit for adding something to make this easier on folks (like me lol).

NeilGirdhar · August 30, 2023, 4:01pm

Just FYI, there’s a nice way to write this:

from itertools import pairwise
lst = list(range(10))
for item, next_item in pairwise(lst):
    print(item, next_item)
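Note that `pairwise` (available from Python 3.10) stops at the pair (8, 9), so the last item never appears as a "current" item. If the loop should also visit the final item with a placeholder for the missing neighbour, one option is `itertools.zip_longest`, sketched here:

```python
from itertools import zip_longest

lst = list(range(10))

# Pair each item with its successor; the last item is paired with None.
pairs = list(zip_longest(lst, lst[1:]))

for item, next_item in pairs:
    print(item, next_item)
```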

flyinghyrax · August 30, 2023, 4:47pm

I would be on board, and IMO this should just be added to collections.abc.Sequence as a generic method of indexable collections.

Like OP I run into this paper cut fairly often, and feel existing solutions have shortcomings compared to the proposed method.

Besides the symmetry with dict.get, I think this utility method mirrors similar collection APIs in other language standard libraries (but I’ll need to cite that later when I’m off work).

storchaka · August 30, 2023, 5:01pm

Just use an external function:

def get(mapping, key, default=None):
    if key in mapping:
        return mapping[key]
    else:
        return default
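Read literally, the function above targets mappings: on a list, `key in mapping` tests whether a value is present, not whether a position exists. A sequence counterpart would need a bounds check instead. The sketch below is one hypothetical way to write it; `seq_get` is a made-up name:

```python
lst = ["a", "b", "c"]

# `in` on a list compares against values, not indexes.
print(0 in lst)    # False, although lst[0] exists
print("a" in lst)  # True, although "a" is not a valid index


def seq_get(seq, index, default=None):
    """Hypothetical sequence counterpart using an explicit bounds check."""
    if -len(seq) <= index < len(seq):
        return seq[index]
    return default


print(seq_get(lst, 0), seq_get(lst, -1), seq_get(lst, 3))  # a c None
```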

flyinghyrax · August 30, 2023, 6:43pm

As has been pointed out already,

You could just as well say the same for dict.get, or for any other non-dunder collection method.
We already know how to write wrapper functions and other utilities for this use case.

The point is that one writes this same function over and over again. I literally just noticed I am staring at a version of this function right now at work:

def eval_escape_sequences(token_stream: Iterable[Token]) -> Iterable[Token]:
  output_queue: deque[Token] = deque()

  def peek_previous() -> Optional[Token]:
    if len(output_queue) > 0:
      return output_queue[0]
    else:
      return None

  …

I’ve likely written a function like this - or a more generic version - for every project of significant size I’ve ever worked on.

pf_moore · August 30, 2023, 7:42pm

… and yet, the proposed .get method on lists wouldn’t help, because you’re using a deque not a list.

elis.byberi · August 30, 2023, 11:46pm

You can use the “+” operator:

state = {"corp": {"hand": [{"title": "Sure Gamble"}]}}
a = (state.get("corp", {}).get("hand", []) + [{}])[0].get("title", "")
print(a)

flyinghyrax · August 31, 2023, 12:55am

Fair enough, but that’s why I mentioned Sequence. That said, I’m not really familiar with how collections.abc registration works for built-in / native collections like deque. Would it even “inherit” a method added to Sequence, or does registering only declare what interfaces it implements for isinstance checks?
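On the registration question, a small experiment suggests the latter: `register` only affects `isinstance`/`issubclass` checks, and a registered class does not pick up concrete methods from the ABC. The `Gettable` ABC and `GettableList` class below are hypothetical and exist only for this demonstration:

```python
from abc import ABC
from collections import deque
from collections.abc import MutableSequence


class Gettable(ABC):
    """Hypothetical ABC with one concrete (mixin) method."""

    def get(self, index, default=None):
        try:
            return self[index]
        except IndexError:
            return default


# Virtual subclass: registration changes isinstance checks only.
Gettable.register(deque)


class GettableList(list, Gettable):
    """Real subclass: inherits get through the normal MRO."""


print(isinstance(deque(), Gettable))       # True
print(hasattr(deque(), "get"))             # False
print(GettableList([1, 2]).get(5, "n/a"))  # n/a

# deque counts as a MutableSequence without having it in its MRO.
print(issubclass(deque, MutableSequence), MutableSequence in deque.__mro__)  # True False
```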

Additionally I have been skimming docs for some other languages ^[1], and so far the proposed behavior isn’t as widespread as I assumed! At least not for a prominent method name like get.

C#/dotnet’s List<T> has the Enumerable.ElementAtOrDefault, but that’s an extension method, not defined directly on List or any of its parent interfaces
F#'s List.item throws an exception for out of bounds access; List.tryItem will return Optional.None but is clearly the secondary interface of the two
Java’s ArrayList.get throws, and none of the parent types in the collections package implement a ‘safe get’ method that returns null as far as I can tell. The Collections helper class doesn’t include one either.
Swift’s Array indexing returns Self.Element not Self.Element? - i.e. is not nillable - and I don’t see any “safe get” method that supports specifying a default or returning nil. (The first and last properties are nillable, but that’s not arbitrary index access.)
Rust’s std::vec::Vec<T> does have a get method that returns an Option
Kotlin’s kotlin.collections.List has a get method (that backs the indexing operator [], I think?) that requires in-bounds and I assume throws an exception otherwise. It does have extension methods similar to dotnet’s: getOrElse/getOrNull, and also (??? ^[2]) elementAtOrElse/elementAtOrNull.
Scala’s Vector[+A] is… it’s wild, and now I need to go learn Scala. ^[3]
Ruby’s Array seems to have nillable indexing by default (how fun!), and two alternative methods at and fetch - fetch can either throw an error or return a default, at seems under-documented ^[4]
JavaScript’s array indexing and Array.at return undefined for out-of-bounds indices ^[5], but you can’t specify a default.

So far, only Ruby’s fetch seems to make ‘get or default’ a prominent operation. Rust and JavaScript have prominent methods to return ‘nothing’ instead of throwing an error, but not with a specified default value. ^[6]

TL;DR: I’ve convinced myself the proposed function isn’t as common as I thought.

[1] …no methodology to speak of, suggestions welcome
[2] If someone can say why this is, I’d love to know
[3] apply works like ‘regular’ indexing (requires in-bounds), applyOrElse returns an Optional, and it also inherits orElse, and lift from PartialFunction that might be relevant?
[4] and Ruby is another hit list language that I don’t know much about yet…
[5] How “safe” that is in the context of the rest of JavaScript is debatable I guess, but I’m biased
[6] Rust’s get lets you chain functions from Option, so I guess there’s an obvious design reason they wouldn’t bother with a ‘default’ parameter overload for get.

ajoino · August 31, 2023, 7:32am

In this particular case, you could populate that empty list with a meaningful value, then that line wouldn’t fail.

NoahTheDuke · August 31, 2023, 7:52pm

Thanks for the overview!

I am primarily a Clojure developer and it has the built-in functions get and get-in. Because Clojure is function-based, not method-based, get and get-in are polymorphic and work on any associative data structure (which includes Clojure’s vectors). They return nil when accessing out of bounds and they can be passed default values.

To prospective commenters, I know about the alternative ways to write my example code. I appreciate the enthusiasm but I linked to two SO threads that contain every suggestion posted so far. I created this thread to discuss whether to add something new to cover these alternatives. Seems prudent to stay on topic here.

elis.byberi · September 1, 2023, 1:20am

That’s not very idiomatic in Python; you might want to consider using a try/except block instead:

state = {"corp": {"hand": [{"title": "Sure Gamble"}]}}

item = ''
try: item = state["corp"]["hand"][0]["title"]
except (KeyError, IndexError): pass

print(item)

I haven’t read any Stack Overflow threads because the discussion is taking place here. What specifically do you find awkward?

ruro · September 3, 2023, 1:51pm

Interestingly, PEP 463 was rejected, because

I disagree with the position that EAFP is better than LBYL, or “generally recommended” by Python.

And yet list.get and similar proposals are frequently criticized for not using the “Pythonic” way (EAFP).

IMHO, something like

print(state["corp"]["hand"][0]["title"] except LookupError: "default")

is significantly easier to read than the suggested

item = "default"
try:
    item = state["corp"]["hand"][0]["title"]
except LookupError:
    pass

print(item)
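Since the `except` expression from PEP 463 was rejected, the one-liner above does not run on current Python. One existing alternative that stays fairly compact is `contextlib.suppress`, sketched below with an empty hand so that the default is used. `LookupError` is the common base class of `KeyError` and `IndexError`, which is why a single exception type is enough:

```python
from contextlib import suppress

state = {"corp": {"hand": []}}  # empty hand, so indexing [0] fails

item = "default"
with suppress(LookupError):  # covers KeyError and IndexError
    item = state["corp"]["hand"][0]["title"]

print(item)  # default
```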