`list()` constructor and `__len__` method

Python 3.12.2

class A:
    def __iter__(self):
        for i in range(8):
            yield i
    def __len__(self):
        raise NotImplementedError
a = A()
list(a)
NotImplementedError                       Traceback (most recent call last)
Cell In[30], line 8
      6         raise NotImplementedError
      7 a = A()
----> 8 list(a)
      9 len(a)

Cell In[30], line 6, in A.__len__(self)
      5 def __len__(self):
----> 6     raise NotImplementedError

NotImplementedError: 

but

class A:
    def __iter__(self):
        for i in range(8):
            yield i
a = A()
list(a)
[0, 1, 2, 3, 4, 5, 6, 7]

Could list() handle NotImplementedError or NotImplemented here?

What would be the use case for this?

The purpose of NotImplemented and NotImplementedError is to work with the cooperative protocol for binary operators (getting the length of a sequence is unary), or to mark something as explicitly not supported - in particular, in abstract base classes.

There’s no reason to add a __len__ that raises NotImplementedError (or returns NotImplemented) in pretty much any ordinary code, and certainly not in cases where the result of just leaving it undefined would do the right thing.

:slight_smile: I’m talking about code that isn’t mine.

Please take into account that len(a) isn’t necessarily needed for list(a), is it?

Hi Alexander, it’s possible to adjust your code a couple of different ways and avoid the NotImplementedError.

class A:
    def __iter__(self):
        for i in range(8):
            yield i
    def __len__(self):
        raise NotImplementedError
        
        
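# Workaround 1: pass an explicit iterator, which has no __len__
list(iter(A()))

# Workaround 2: a list comprehension never consults __len__
[x for x in A()]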

The comment here argues for not implementing __len__, which seems like a perfectly sound argument here and matches the advice you’re being given. And it’s what the code you quoted does. So what’s the problem?


How is raising NotImplementedError here different from raising a TypeError? That’s how NumPy behaves when calling len() on a 0-d array (and the same exception as calling, e.g., len(4)).

Please take into account that len(a) isn’t necessarily needed for list(a), is it?

If the __len__ is there, Python reasonably assumes the length is known, and uses it. This lets list() reserve room for all the items up front, in one allocation, instead of repeatedly growing the list as the iterator yields them.

Python should not compromise performance (in construction of a fundamental data structure) for the edge case of a method implementation that raises NotImplementedError.

Use one of my two workarounds, just don’t implement the method at all, or make it an abstract method.
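As a rough sketch of the abstract-method option (the class names `SizedStream` and `EightItems` are made up for illustration, they are not from the original code): the base class declares `__len__` as abstract, so it cannot be instantiated, and only a subclass that supplies a real length ever reaches list().

```python
from abc import ABC, abstractmethod


class SizedStream(ABC):
    """Hypothetical base class: iterable, but the length is left to subclasses."""

    def __iter__(self):
        yield from range(8)

    @abstractmethod
    def __len__(self):
        """Subclasses must return the number of items."""


class EightItems(SizedStream):
    """Hypothetical concrete subclass that knows its length."""

    def __len__(self):
        return 8


# The abstract base cannot be instantiated, so list() never sees a broken __len__.
try:
    SizedStream()
except TypeError:
    base_instantiable = False
else:
    base_instantiable = True

items = list(EightItems())
```

Here the "not implemented yet" state is caught when the object is created rather than when something happens to call len() on it.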

Paul, thanks for the reply!

Well, if we read NotImplementedError as “is not applicable” rather than “is not implemented, and it is unknown whether it is applicable”, then it’s not a problem :slight_smile:

I had assumed that NotImplementedError/NotImplemented means “is not implemented, and it is unknown whether it is applicable”, while the absence of the method means “is not applicable (don’t rely on it)”.

James, thanks for the reply!

Yes but

If the __len__ is there, Python reasonably assumes the length is known,

Okay, __len__ is there but the length is unknown.

Well as I said above, NotImplemented doesn’t mean “not implemented”, right… Okay, it’s just a terminology question.

If I understood correctly, NotImplemented can be understood to mean “not this way, try the other way around”. In the case of binary operators, it could mean that a.__add__(b) doesn’t know how to handle type(b), but perhaps b.__radd__(a) does.
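To make the “try the other way around” reading concrete, here is a small sketch of the binary-operator protocol. The `Meters` and `Feet` classes are invented for this example. When `__add__` returns NotImplemented, Python tries the right operand’s `__radd__`, and raises TypeError only if that also fails.

```python
class Meters:
    """Hypothetical type that only knows how to add other Meters."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        # "Not this way": let Python try the reflected method on `other`.
        return NotImplemented


class Feet:
    """Hypothetical type that knows how to be added to Meters."""

    def __init__(self, value):
        self.value = value

    def __radd__(self, other):
        if isinstance(other, Meters):
            return Meters(other.value + self.value * 0.3048)
        return NotImplemented


# Meters.__add__ declines, so Feet.__radd__ gets a turn.
total = Meters(1) + Feet(10)

# Neither side handles Meters + int, so Python raises TypeError.
try:
    Meters(1) + 3
except TypeError:
    unsupported = True
else:
    unsupported = False
```

Note that NotImplemented is a return value that the operator machinery inspects. Nothing comparable inspects the return value of `__len__`.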

In the case of list(a), having

def __len__(self):
  return NotImplemented

can serve to tell list the same “try the other way”.

On the other hand, NotImplementedError, from what I understand, means either “not supported”, “perhaps supported in the future”, or perhaps “support it in a derived class”.

Having list(a) fail when a is of a type whose __len__ raises NotImplementedError can preserve this meaning. Namely, that one should probably be inheriting from type(a) and implementing __len__.


Only once the behavior in question is described in the documentation :slight_smile:

https://docs.python.org/3/library/stdtypes.html#list:

Lists may be constructed in several ways:

  • Using the type constructor: list() or list(iterable)

iterable may be either a sequence, a container that supports iteration, or an iterator object.

https://docs.python.org/3/glossary.html#term-iterable:

iterable:
An object capable of returning its members one at a time. <…> and objects of any classes you define with an __iter__() method or with a __getitem__() method that implements sequence semantics.

The documentation of list could be more explicit about its use of __len__. Also, the “or” in the documentation of iterable makes the sentence ambiguous (at least to me). It is not clear (to me) whether the relative clause “that implements sequence semantics” applies to both alternatives or only to the second alternative of the “or”.

However, going one level deeper, to the meaning of “sequence semantics”, the glossary entry for sequence does say:

An iterable which supports efficient element access using integer indices via the __getitem__() special method and defines a __len__() method that returns the length of the sequence.


@franklinvp, I really appreciate that you understood what I meant.

Yes, I had read it the way you describe in your latest edit :slight_smile:

Franklinvp wrote (March 7):

In the case of list(a), having

def __len__(self):
  return NotImplemented

can serve to tell list the same “try the other way”.

I think a problem with this is that the C-level type slot for len returns an integer, not an object, so it can’t return NotImplemented.
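A quick way to see this from Python (a sketch of what I would expect on CPython, where the result of `__len__` has to be converted to an integer): returning NotImplemented from `__len__` does not trigger any fallback, it just makes len() raise TypeError.

```python
class A:
    def __iter__(self):
        yield from range(3)

    def __len__(self):
        # Not an integer, so len() cannot use it.
        return NotImplemented


error_type = None
try:
    len(A())
except Exception as exc:
    # Record which exception len() raised for the non-integer result.
    error_type = type(exc)
```

So there is no “NotImplemented” channel for `__len__` the way there is for binary operators. The caller only ever sees an integer or an exception.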

BTW, another way to signal that an operation isn’t provided is to set the method to None, and that seems to work here as well:

class A:
  stuff = [1, 2, 3]
  def __iter__(self):
    return iter(self.stuff)
  __len__ = None
a = A()
for x in a:
  print(x)
% python3 unimp_len.py
1
2
3

Interesting. So it was only accidental that list(a) worked. len(a) raises a TypeError, and perhaps that is the type of exception that list is expecting.


Indeed, as I was saying above. Since TypeError is what len() raises on types without __len__, and an iterator is such a type, that’s what’s usually used to signal that len() is not supported even though __len__ exists.
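A minimal sketch of that difference, assuming CPython’s current behavior, where list() asks for a length hint and ignores only TypeError. The class names are made up. `operator.length_hint` is the stdlib function that exposes the same lookup.

```python
import operator


class LenRaisesTypeError:
    """Hypothetical iterable: __len__ exists but signals 'no length' like len() does."""

    def __iter__(self):
        yield from range(8)

    def __len__(self):
        raise TypeError("length is not known in advance")


class LenRaisesNotImplementedError:
    """Same iterable as in the original post."""

    def __iter__(self):
        yield from range(8)

    def __len__(self):
        raise NotImplementedError


# TypeError from __len__ is treated as "no usable length": the default comes back.
hint = operator.length_hint(LenRaisesTypeError(), 8)

# list() therefore falls back to plain iteration.
items = list(LenRaisesTypeError())

# Any other exception from __len__ propagates out of list().
try:
    list(LenRaisesNotImplementedError())
except NotImplementedError:
    propagated = True
else:
    propagated = False
```

So if a `__len__` has to exist but cannot give an answer, raising TypeError keeps list() working, while NotImplementedError is treated as a real failure.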


So, @franklinvp, what’s your opinion?

and objects of any classes you define with an __iter__() method or with a __getitem__() method that implements sequence semantics.

“objects <...> that implements” would be a grammatical error. Yes?

So, implements applies to the second alternative of the or. Yes?

Syntactic ambiguity is not necessarily a grammatical error. It is only a property that can make the sentence harder to understand. Sometimes the context (the surrounding sentences) resolves the ambiguity. My only opinion is that this sentence in the documentation

[…] objects of any classes you define with an __iter__() method or with a __getitem__() method that implements sequence semantics.

is ambiguous. I have no opinion on which interpretation is the intended one.

I disagree. As written, the that in

and objects of any classes you define with an __iter__() method or with a __getitem__() method that implements sequence semantics.

unambiguously refers to method, not objects. If the intended reference is in fact objects, that is a documentation bug.
