Cartesian product for ranges / lists / iterables using @

This idea was inspired by the Range literals discussion, where João Bernardo Oliveira suggested this as a use case. However, I think the Cartesian product of ranges is of general interest and could already be implemented with the existing syntax.

Current State

Sometimes one needs to iterate over a grid. This could be a matrix, a data table, a game map, etc.; grids are used in many fields. To iterate over a two-dimensional grid, one can use two nested loops. Inside a comprehension, they can be put on the same line

[do_something(x, y) for x in range(width) for y in range(height)]

but outside a comprehension, each loop has to be placed on a separate line

for x in range(width):
    for y in range(height):
        do_something(x, y)

Iteration over nested grids looks like this

for x in range(width):
    for y in range(height):
        # iterate over neighbours
        for dx in range(-1, 2):
            for dy in range(-1, 2):
                do_something(x+dx, y+dy)

Sure, this can be handled by defining some separate iterators, but maybe a more general approach is possible?
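As a sketch of the "separate iterators" approach, the two nested loops can be folded into one generator (the `grid` helper here is hypothetical, not part of any library):

```python
# A hypothetical grid() helper that flattens the two nested loops.
def grid(width, height):
    """Yield every (x, y) pair of a width-by-height grid, row by row."""
    for x in range(width):
        for y in range(height):
            yield (x, y)

# The nested loops collapse into one flat loop:
for x, y in grid(2, 3):
    print(x, y)
```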

Suggestion

Allow using @ (which is already used for matrix multiplication) to define the Cartesian product of ranges, so the previous code is shortened to

[do_something(x, y) for x, y in range(width) @ range(height)]

for x, y in range(width) @ range(height):
    do_something(x, y)

for x, y in range(width) @ range(height):
    # iterate over neighbours
    for dx, dy in range(-1, 2) @ range(-1, 2):
        do_something(x+dx, y+dy)

This saves a level of indentation, makes for-loops consistent with comprehensions, and makes iteration over a three-dimensional grid feasible and a four-dimensional grid almost nice (5+ dimensions would require a completely different approach anyway).

Generalisation to lists or all iterables?

One can extend this idea to all iterables, or at least to finite ones such as lists, dict_keys and sets… I’m not sure about that, as it would have larger consequences. But at least with ranges, I see an immediate use for myself.

There is already itertools.product.
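For comparison, the proposed examples can already be written with it today; a sketch using only the standard library:

```python
from itertools import product

# The proposed `range(width) @ range(height)` is spelled today as:
pairs = list(product(range(3), range(2)))
print(pairs)

# The nested-grid example flattens the same way:
for x, y, dx, dy in product(range(3), range(2), range(-1, 2), range(-1, 2)):
    pass  # do_something(x + dx, y + dy)
```

Note that `product` already accepts any number of iterables, so the 3- and 4-dimensional cases need no extra machinery.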

To be honest, I have not seen it used a single time in production code. I don’t think this deserves special syntax.

8 Likes

Is it the same as itertools.product()?

The problem with adding new methods (including special methods like __matmul__) for iterators is that iterator is a protocol, not a base class. There are many unrelated iterator classes, and you will need to add a new method for all of them.
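A quick demonstration of that point:

```python
# "Iterator" is a protocol, not a base class: these three objects all
# support iteration, yet share no common class where a __matmul__
# method could be added once for all of them.
gen = (i for i in range(3))   # generator
lst = iter([1, 2, 3])         # list_iterator
mp = map(str, range(3))       # map object

for it in (gen, lst, mp):
    # All three satisfy the iterator protocol...
    assert hasattr(it, "__iter__") and hasattr(it, "__next__")

# ...but their types are unrelated, so @ cannot be retrofitted in one place.
print(type(gen).__name__, type(lst).__name__, type(mp).__name__)
```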

It can also conflict with NumPy use of the @ operator.

>>> numpy.array([[1, 2], [3, 4]]) @ range(5, 7)
array([17, 39])
>>> range(5, 7) @ numpy.array([[1, 2], [3, 4]])
array([23, 34])
6 Likes

Right. And it violates the (non-normative) semantics established in PEP-465 where the operator was introduced:

  • arr(3) @ arr(3) performs the same computation as (arr(1, 3) @ arr(3, 1)), but returns the result with shape (), i.e., a single scalar value, not in matrix form. So this is the standard inner product on vectors.

For the language to change the meaning on 1-d vectors to the outer product while all numerical libraries have agreed to implement the inner product seems guaranteed to cause grief.

1 Like

I find itertools.product a very elegant way of solving this problem, and I use it a fair bit for this very purpose. I don’t see the need for using @ here.

4 Likes

Ok, I now see a lot of examples of why implementing the Cartesian product with @ for all iterators in general is a bad idea. But how about using it only between ranges? That should not be problematic.

As for itertools.product(), the same argument can be made about numpy.dot(). Introducing @ for the dot product did not implement anything new, and it did not even save a lot of keystrokes, but it was still a great addition to the language.

Because, as already said, it duplicates product() and violates the convention that @ on vectors is the dot product, not the Cartesian product. Ranges implement the concept of arithmetic sequences, and treating them as vectors (with addition and three types of multiplication) is not part of that concept. If you want, you could write a subclass of range, Rvec, that implements @ via __matmul__ using itertools or explicit loops.

Or write what you really need: a Grid class initialized with tuples. Don’t limit it to 2D, and you could write

for x, y, dx, dy in Grid((1, width), (1, height), (-2, 1), (-2, 1)):
    do_something(x+dx, y+dy)
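A minimal sketch of such a hypothetical Grid class, assuming each tuple gives range-style (start, stop) bounds:

```python
from itertools import product

class Grid:
    """Hypothetical helper: Cartesian product of ranges given as
    (start, stop) tuples. Not an existing class anywhere."""

    def __init__(self, *bounds):
        self._ranges = [range(start, stop) for start, stop in bounds]

    def __iter__(self):
        # itertools.product objects are themselves iterators.
        return product(*self._ranges)

# All (x, dx) combinations:
for x, dx in Grid((0, 2), (-1, 2)):
    print(x, dx)
```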

I was going to suggest this, but range cannot actually be subclassed:

>>> class Test(range):
...     pass
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: type 'range' is not an acceptable base type

The specific issue is discussed, along with workarounds, here: python - How do I extend, mimic, or emulate the range function? - Stack Overflow
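One composition-based workaround (a hypothetical RangeVec wrapper, since subclassing is off the table) might look like:

```python
from itertools import product

class RangeVec:
    """Hypothetical wrapper holding a range (composition, since range
    cannot be subclassed) that gives it a Cartesian-product @."""

    def __init__(self, *args):
        self._range = range(*args)

    def __iter__(self):
        return iter(self._range)

    def __matmul__(self, other):
        return product(self._range, other)

for x, y in RangeVec(2) @ range(3):
    print(x, y)
```

Chaining more than two factors would need `__matmul__` to return another wrapper rather than a bare `product` object; this sketch stops at the 2-D case.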

FWIW, as someone working in the scientific Python space and in mentoring and collaborating with a variety of users from beginners to experts, and reviewing large amounts of others’ code, I don’t think I can recall a single instance where I’ve actually seen @ used in code, production or otherwise, aside from a handful of people explicitly asking about what it does. It could be used in specialized domains that I’m not aware of, but it doesn’t seem to be at all widely used. Ergo, knowing what we know in hindsight, I don’t think that PEP would have been approved.

As for this case, one more factor to consider is the impact on code readability. Multi-level for loops may take up more space, but they make very clear what is happening using a very basic, fundamental construct of the language. Invoking a special operator to do something special here (which is currently documented to do something different) makes it less clear what is happening, and is also not as discoverable for users to search or get help on (as opposed to a function or method name, like itertools.product).

2 Likes

Besides the fact that matrix multiplication is a different operation from the Cartesian product that can be applied to objects of the same nature, and would thus conflict with it, another reason that motivated the introduction of infix matrix multiplication is that matrix multiplication appears with some frequency in larger expressions that simultaneously involve multiple occurrences of the product and other operations. Do you see that happening with the Cartesian product?

In the case of matrix multiplication, the infix operator allows taking mathematical formulas and having them expressed in code that looks nearly the same.

However, if the worst that you expect to see with the Cartesian product is it being applied to itself multiple times, $A \times B \times C \times D$, not much is gained, since that is not much different from product(A, B, C, D).

Compare with

$(H\beta - r)^T (H V H^T)^{-1} (H\beta - r)$

becoming

(H @ beta - r).T @ inv(H @ V @ H.T) @ (H @ beta - r)

vs

(H.dot(beta) - r).T.dot(inv(H.dot(V).dot(H.T))).dot(H.dot(beta) - r)

Putting aside the fact that a direct reading of a mathematical formula might not always be the most efficient algorithm.

1 Like

Often true, and probably never more so than this formula for the Nth prime number. Elegant algebraic expression and efficient computational algorithm do not always correlate!

Well, when the example is not esoteric is when it is more likely to see it in code. Some that come to mind are as simple as

  • evaluating polynomials $a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n$,
  • or using Cardano’s formulas to solve cubics,
  • or the very linear algebra problem that appeared above, $Ax = b$ being solved as $x = A^{-1} b$.

A simple example I could come up with:
Let A and B be (n,n)-matrices and v be n-array. Then, (A @ (B @ v)) has n^2 order mutiplications, but ((A @ B) @ v) has n^3. :slightly_smiling_face:

Getting off-topic, but for what it’s worth I use it regularly for composing affine transforms, and when teaching statistics. Do you find people preferring to use numpy.dot()/numpy.ndarray.dot() or just not really encounter the need to use that operation?

Here are about 215 occurrences in the Jax codebase.

FYI this is all going to change in the array API: vecdot replaces dot and doesn’t do matrix multiplication; matmul and tensordot are unchanged; for everything else, there’s einsum.

2 Likes

Very likely they meant @ as the infix version of numpy.matmul rather than its use for decorators.

That search should filter out decorators. On the command line, I get 215 non-decorator hits.

Construct a set using a range literal and overload @ on the set

I’ve used it, but I also never felt the need for special syntax. product(x, y) is more readable than x @ y since @ isn’t reminiscent of any established notation. (And as someone who rarely uses NumPy, it still doesn’t suggest matrix multiplication to me strongly, either. @ was chosen more because the NumPy community wanted some infix operator, and there weren’t many ASCII characters left to choose from.)

1 Like