Yes, if a never has an attribute b it will always return None. What else would you expect? (But note that when using optional static typing, the type checker would complain if you wrote a?.b and the type of a does not have an optional field b. The type system doesn’t currently know whether a required field is actually present, but it does know whether a field is declared or not – it will flag a.b and a?.b equally if there’s no b in the type of a.)
I expect that if ?. only worked on None and not on absence, I just wouldn’t use it rather than using getattr(), and ditto for .get() and ?[].
PS. For some reason JavaScript/TypeScript spells the latter as ?.[], and they don’t seem to think there’s anything illogical about it (I asked Anders :-). I suspect it’s because they have the ? : ternary operator, though. I don’t see a reason to follow their example. (It would make adding ? : to Python harder, though, and it would be another puzzle to figure out in PypeScript, a merger of Python and TypeScript that I’m thinking about.)
The point is that (in my version) the ? modifies the semantics of an x.attr or x[key] operation immediately preceding it. Which IMO is fair game because it’s all syntax.
Hm, this brings me to another even more radical idea: make ? a postfix operator that means "if the preceding thing is some x.a that raises AttributeError, or x[k] raising KeyError, or if the preceding value is None, the result is None, and subsequent attr, key and call operators are skipped." This not only gives us foo?(...), it also gives us bare a.b? and a[k]? to spell getattr(a, "b", None) and a.get(k) more concisely.
Oh, regardless of whether we do that, we should probably also suppress IndexError coming out of x[k] in case x is a sequence.
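The semantics described above (including the IndexError case) can be approximated today with explicit helpers; a minimal sketch, where the names `opt_attr` and `opt_item` are hypothetical stand-ins for what the syntax would do:

```python
# Hypothetical helpers approximating the proposed postfix "?" semantics:
#   a.b?   ->  opt_attr(a, "b")
#   a[k]?  ->  opt_item(a, k)

def opt_attr(obj, name):
    """Return obj.name, or None if obj is None or the attribute is missing."""
    if obj is None:
        return None
    return getattr(obj, name, None)

def opt_item(obj, key):
    """Return obj[key], or None if obj is None or the key/index is absent."""
    if obj is None:
        return None
    try:
        return obj[key]
    except (KeyError, IndexError):
        return None

class A:
    b = 42

assert opt_attr(A(), "b") == 42
assert opt_attr(A(), "missing") is None   # swallowed AttributeError
assert opt_item({"k": 1}, "k") == 1
assert opt_item([], 5) is None            # swallowed IndexError
```

Of course, a real implementation would be syntax-level and short-circuit the rest of the trailer chain, which a helper function cannot do.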
I was thinking about something like this as well! To take this even further, what if expression evaluation could stop as soon as a ? check is failed, propagating None upwards to the top-most AST node of the expression? For example, x = foo(a, b, c?) when c is None would evaluate as x = None.
In any event, it’s clear to me that the next step here is to separate the PEP into ??/??=, which seems to have majority support, and leave ?. for further discussion. Assuming this sounds good, I’ll start working on that as I have time.
That would be a step too far. In my use case oftentimes there are many options to a function call with arguments derived from options specified in a configuration file. The call is still to be made no matter what, with only the individual options set to None when absent from the configuration.
Also, note that this thread is almost entirely between very few users. Maybe post under a new topic in January once more people are back to work/school?
It sort of pains me to see this being the direction. Is a 4 character difference really that verbose?
The extra characters are English, meaning your brain doesn’t have to do any translation. I personally find a["b"]?["c"]?["d"]? more cumbersome to parse mentally than a.get("b")?.get("c")?.get("d") – each operator has one responsibility; I don’t have to remember whether ? is going to just swallow None or also key/index errors.
The case is much stronger with a?.b?.c? as an alternative to getattr(getattr(a, "b", None), "c", None) though, since we don’t have a forgiving version of object.__getattribute__ like dict.get.
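To illustrate that asymmetry: the attribute chain today requires nested getattr calls, while the mapping chain at least has dict.get. A quick sketch (the `Node` class is invented for the example):

```python
class Node:
    """Toy object whose attributes are set dynamically."""
    def __init__(self, **kw):
        self.__dict__.update(kw)

a = Node(b=Node(c="leaf"))

# Attribute chain today: nested getattr, since there is no
# "forgiving" counterpart to object.__getattribute__.
c = getattr(getattr(a, "b", None), "c", None)
assert c == "leaf"

# If any link is missing, the inner getattr yields None and the
# outer one falls back to its default.
assert getattr(getattr(Node(), "b", None), "c", None) is None

# The mapping equivalent chains via dict.get, with an explicit
# guard standing in for what ?. would short-circuit:
d = {"b": {"c": "leaf"}}
inner = d.get("b")
leaf = inner.get("c") if inner is not None else None
assert leaf == "leaf"
```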
But then if we are to allow a?.b?.c?, we might as well allow a?["b"]?["c"]? for a consistent syntax.
I presume by forgiving you mean one that returns None by default?
I always did wish that getattr() returned None if the attribute didn’t exist. It seems silly that it doesn’t while .get() does. (Though I can only assume this is for some very good reason.)
Regardless, you’ve convinced me. I very much want to be able to use this form in place of getattr(a, "b") when b does not exist. Swallowing the AttributeError in this case, but not the KeyError/IndexError in the other, makes as little sense as getattr() not returning None by default.
Is it, though? Where is the evidence that this is such a common pattern that it needs syntax-level support? The case I hear mentioned is working with unstructured data, e.g. from an API, but isn’t such data typically in lists and dictionaries? Unless you are doing something (in my experience) unusual, class instances have a relatively fixed set of attributes that are always available. If I see attribute access style for data from an API, I would assume it’s already been validated to a known schema via something like Pydantic. Of course it’s possible to have exceptions, but do we want the language to encourage this style?
Speaking for myself, this pattern appears quite frequently in the code I work on. It complements the duck typing very nicely. That is, we can pass around different objects that may have different properties to the same method.
e.g. Imagine this:
my_bar = foo.bar? ?? DefaultFoo()
# as opposed to
my_bar = bar if (bar := getattr(foo, "bar", None)) is not None else DefaultFoo()
I personally prefer accessing items via dot notation over subscript, so I use the wrapper from dict2dot on PyPI to access dict keys as attributes. Although you are right that a predefined validation schema is the right thing for the long term, we often don’t do that for smaller projects, just to get things working quickly.
CPython itself also has a data model where certain attributes are optional for some data types due to dynamic initializers, duck typing or backwards compatibility. Below are a few examples that can benefit from Guido’s generalization:
And here are the dozens of other instances of getattr(obj, "attr", None) in CPython where certain attributes are optional in the data model.
I was trying to make the point earlier that a lot of the ?. and ?[] behavior seems like it belongs as part of a collection. Even for exotic objects, some sort of proxy that digs through the accesses and raises when it “should” seems like it would solve the issue rather than adding syntax. I would be for the simplification as a postfix operator mentioned by Guido, only because it is simple and straightforward, and probably what the proposed collections/proxies would do anyway (but I’m curious to see what pathological examples it gives rise to).
try:
my_bar = foo.bar if foo.bar is not None else DefaultFoo()
except AttributeError:
my_bar = DefaultFoo()
right? Your lower example is a bit terse
In this case I think that the proposed operator could improve code readability, but I mostly agree with the people saying that if you don’t know if an attribute exists on an object you’re likely dealing with data that should have been modeled as lists and dicts instead.
I had participated a bit in one of the previous threads about this PEP a few months ago and had come to the realization that the ?. and ?[] operators both made a lot of intuitive sense if you instead treated the ? as a unary postfix operator.
I’ll also add a shameless plug for a very long comment I made showing how the None-aware operators (and specifically just None-aware, not AttributeError or KeyError silencing) can be quite useful in a real world use case I have in my job doing analysis for a manufacturing company.
The overall point there is that if you are dealing with an entity-relationship model, long chains of attribute lookups are quite reasonable, and in many cases it is expected that there could be a null value somewhere along the chain, since relationships can be “total” (i.e. required/non-null) or “partial” (i.e. nullable), and the natural way to express that in Python is by using None.
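A sketch of that use case (the model classes here are made up for illustration): a partial relationship means a None can appear mid-chain, and today each nullable link needs its own guard:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Address:
    city: str

@dataclass
class Customer:
    # Partial relationship: a customer may have no address on file.
    address: Optional[Address]

@dataclass
class Order:
    # Total relationship: every order has a customer.
    customer: Customer

order = Order(customer=Customer(address=None))

# Today: an explicit guard at each nullable link in the chain.
addr = order.customer.address
city = addr.city if addr is not None else None
assert city is None

# With PEP 505 the same lookup would read:
#     city = order.customer.address?.city
```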
We don’t need a try/except block because the optional value is deliberate; that is, the value cannot be another falsy value:
my_bar = foo.bar or DefaultFoo() if foo else None
Also, the proposed syntax won’t work with other falsy values either, so that’s a fair comparison.
If the bar attribute of the object foo does not exist, it suggests that the proposed feature is more about suppressing exceptions related to duck typing. I find this counterproductive, because incorrect duck typing is a bug.
I don’t find any of the PEP 505 syntax compelling. The most compelling piece of it is ??, but in the place it would see the greatest benefit, late-bound defaults (PEP 671) would serve better, since they would prevent None (or other sentinel objects) from being needed in the most common cases, like mutable defaults.
Python’s ecosystem has a rich set of libraries that can parse JSON data into structured Python objects, including filling defaults and rejecting data with a meaningful error when required information is missing; some of these libraries are faster than the standard library’s json.loads while doing that extra work[1]. I’d rather see people reach for tools that stop putting None where they don’t want it instead of creating extra syntax for handling None where people don’t want it.
Something I’ve noticed in a few JS libraries I’ve had to dig through is that they never “fix” the missing data; they always act like it might not be there. This turns into a self-fueling need for the syntax. This is also not what I would consider well-designed code: there’s no proper separation between “we don’t know what data we have” and “okay, we’ve validated the data, now we can use it.” I don’t know how prevalent that is in general, but given how I’ve seen people say they would use the safe traversal in Python, I’m led to believe it’s what people want to do in Python too.
Another thing that should be addressed for the ?. and ?[] proposal is how this proposed syntax is better than using pattern matching. PEP 636 (the pattern matching tutorial) suggests that there is some overlap between what can be done with the proposed operators and pattern matching; see the “Going to the clouds: Mappings” section. I don’t use pattern matching much (still mostly working with Python 3.8, sigh), so I’m not sure if pattern matching is used as was expected, but any future PEP should address why pattern matching is not enough.