Array slicing notation to get N elements from index I

awjlogan · April 24, 2023, 3:16pm

Hi - my first post here. I’ve had a search through PEPs, forums, and at work but haven’t been able to prove the negative that this syntactic sugar doesn’t exist.

Frequently I need to get an array slice in the form:

arr[start_idx:(start_idx + some_length)]

The repetition of the start_idx variable alongside the variable some_length seems a bit awkward, and it ends up with some bracket noise at the end of the statement. For the same transformation, SystemVerilog has the following syntax:

arr[start_idx +: some_length]

In the form of the question title, this would be: arr[I +: N]

(For completeness, SV has the -: operator to give the reversed slice.) This seems to me to be a concise method of doing this operation. Others at work came up with the following Python solution:

arr[start_idx:][:some_length]

which is nice, but I think this creates a copy of the rest of the array first, which might not be desirable, and the :][: might not be obvious when scanning the code.
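For built-in lists, the copy concern can be checked directly. The following is a minimal sketch, assuming arr is a plain Python list; the variable names are illustrative only. NumPy arrays behave differently, since basic slicing there returns a view rather than a copy.

```python
from itertools import islice

arr = list(range(10))
start_idx, some_length = 3, 4

# Single slice: copies only the some_length elements that are wanted.
direct = arr[start_idx:start_idx + some_length]

# Chained slices: arr[start_idx:] first copies the whole tail (7 elements
# here), and the second slice then copies some_length elements from it.
tail = arr[start_idx:]
chained = tail[:some_length]

# islice avoids building the intermediate list, but it still steps over
# the first start_idx elements one at a time.
lazy = list(islice(arr, start_idx, start_idx + some_length))

print(direct, chained, lazy)
```

All three spellings give the same elements; they differ only in how much intermediate data is created along the way.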

It is very unusual that SystemVerilog has nicer syntax than something like Python, so I wondered if anyone had any thoughts they’d like to share on this? Hope this is of interest.

gkb · April 24, 2023, 6:28pm

You could solve this with a helper function:

def sl(i, l):
    return slice(i, i+l)

print("abcdefghijk"[sl(3,4)])
# prints defg

awjlogan · April 24, 2023, 7:36pm

Agreed, thanks - but that’s a helper function that has to be declared in every project that uses it. Further, and I think more importantly, it is not obvious looking at the usage what it is actually doing - reading it means another redirection to see what sl is doing.

pf_moore · April 24, 2023, 7:48pm

It’s no less obvious than n+:i. Quite the opposite - just because the “short” name sl is unhelpful, doesn’t mean you can’t find a better name.

And “you have to add the function to each project”. Well, yes, but it’s a one-liner, and as has been said many times, “not every three-line function deserves to be a builtin”. Much less, syntax.
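As a sketch of what a more descriptively named helper might look like: the name span and its handling of negative starts are assumptions made for illustration, not something proposed in this thread. The extra branch covers one pitfall of the naive slice(i, i + l) form, where a negative start plus a length that reaches the end of the sequence produces a stop of 0 and therefore an empty slice.

```python
def span(start, length):
    """Return a slice covering `length` items beginning at `start`.

    Hypothetical helper: the name and the negative-start handling are
    illustrative choices.
    """
    if length < 0:
        raise ValueError("length must be non-negative")
    stop = start + length
    # With a negative start, a stop of 0 or more would mean "from the front"
    # rather than "up to the end", so run to the end of the sequence instead.
    if start < 0 <= stop:
        stop = None
    return slice(start, stop)


text = "abcdefghijk"
print(text[span(3, 4)])    # four items starting at index 3
print(text[span(-2, 2)])   # the last two items
```

Used this way, the call site reads as "a span of 4 starting at 3", which keeps the bounds visible without repeating the start index.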

abessman · April 24, 2023, 8:17pm

+1. [start : start + n] is not very DRY.

Having had no prior exposure to the proposed syntax, I find n+:i to be quite intuitive, via analogy with +=.

Perhaps a good fit for itertools, then?

Rosuav · April 24, 2023, 9:34pm

It’s not really an itertools thing, but you can have your own personal library with whatever you want in it. I usually give mine a really boring name like “utils” or, if I’m feeling really namespacey that day, something derived from my own name (“rosutils” or something).

chepner · April 24, 2023, 9:50pm

For what it’s worth, the parentheses aren’t necessary. The colon is part of the slice syntax, not an operator with a higher precedence than +.

arr[start_idx:start_idx + some_length]
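One way to confirm this is to look at the slice object that Python actually builds. The sketch below uses a throwaway class (ShowIndex is a hypothetical name) whose __getitem__ simply returns whatever index it receives.

```python
class ShowIndex:
    """Return the index object passed to [] so it can be inspected."""

    def __getitem__(self, index):
        return index


show = ShowIndex()
start_idx, some_length = 3, 4

with_parens = show[start_idx:(start_idx + some_length)]
without_parens = show[start_idx:start_idx + some_length]

# Both spellings produce the same slice object.
print(with_parens, without_parens)
```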

pochmann · April 25, 2023, 1:10am

And PEP 8 asks for treating the colon like an operator with the lowest priority and prefers spaces accordingly:

# Wrong:
arr[start_idx:start_idx + some_length]

# Correct:
arr[start_idx : start_idx+some_length]
arr[start_idx : start_idx + some_length]

ajoino · April 25, 2023, 6:58am

I’ve seen a lot of DRY being mentioned here recently, often over small things like this. I thought DRY was mostly about not repeating large chunks of code because it makes it easier to maintain, not about removing every single token that is arbitrarily close to the same token. If being DRY is your thing, APL might be a better fit than Python.

abessman · April 25, 2023, 9:42am

There are degrees to everything. [start : start + n] is not sopping wet, but still less dry than [start+:n]. myvar = myvar + 1 is perfectly fine, but I think most people still prefer myvar += 1.

Really, the case for +: is the same as the case for +=. What arguments are there for the existence of += that do not apply to +:?

ajoino · April 25, 2023, 10:11am

I don’t know, DRY to me seems like a mantra that doesn’t really help. I think the current way of stating this is crystal clear and requires no explanation beyond how slices work. That to me is much more important than saving 10-ish characters, which you can do with a trivial convenience function as stated above. Even if you have to rewrite it in all the libraries you use, the cost is minimal.

oscarbenjamin · April 25, 2023, 10:33am

The fact that += is a much more common operation is an argument for considering these differently. It is good to have terse syntax for things that are used a lot and that programmers are likely to become familiar with early on. The use of += for in-place addition is also very prevalent in other languages so many programmers coming to Python from other languages are likely to understand approximately what it means immediately.

The += operator naturally generalises to all binary operators -=, *=, &= etc and so a single syntactic construct can be learned that has many possible applications. If we allow +: then immediately the question will be what about -:, &: etc but the usefulness of the syntax does not extend to these cases.

Rosuav · April 25, 2023, 10:35am

Frequency of use, mainly. Incrementing a number or extending a sequence is extremely common, but this kind of slicing isn’t nearly as much so.

Another argument to consider is how frequently you’d want to do this with something that isn’t just a simple variable. When you do something like stuff["whatever"].value[3] += 1 the majority of the left hand side is only evaluated once. That’s certainly possible with the proposed syntax, but I’m not sure how often it’d happen.
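The single-evaluation point can be made visible with an index expression that has a side effect. This is a minimal sketch; next_index is a hypothetical stand-in for any non-trivial expression.

```python
calls = []


def next_index():
    """Hypothetical index expression with a visible side effect."""
    calls.append("next_index")
    return 0


data = [10, 20, 30]

# Augmented assignment: the subscript expression is evaluated once.
data[next_index()] += 1
calls_augmented = len(calls)

# Spelled out: the subscript expression is evaluated twice.
calls.clear()
data[next_index()] = data[next_index()] + 1
calls_spelled_out = len(calls)

# The slice in question has the same shape: the start expression appears,
# and is evaluated, twice unless it is first stored in a variable.
calls.clear()
window = data[next_index():next_index() + 2]
calls_slice = len(calls)

print(calls_augmented, calls_spelled_out, calls_slice)
```

When the start is already a simple variable, as in most of the examples in this thread, the double evaluation costs nothing, which is the reason to doubt how often the saving would matter.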

awjlogan · April 25, 2023, 10:36am

I didn’t really present this from a DRY perspective (which I take, as Jacob said, to be more like large chunks of code) - more that the repetition of the starting index makes (possibly subjectively) the interpretation harder: longer statement, need to mentally check what the repeated variable means, and the potential bracket noise to make clear the intent of the precedence (Stefan’s pointer to PEP 8 is interesting, although I don’t remember a linter picking that one up).

The analogy to += I think is the most relevant one here - that is a shorthand for a common operation and both the intent and result are immediately obvious (no need to reparse the repeated variable name). The helper function with slice adds another level of indirection to determine what the index boundaries are - given how many logical errors come from index bounds, anything that makes the bounds and/or range easier to determine is helpful to the user.

awjlogan · April 25, 2023, 10:40am

oscarbenjamin wrote: “The += operator naturally generalises to all binary operators -=, *=, &= etc and so a single syntactic construct can be learned that has many possible applications. If we allow +: then immediately the question will be what about -:, &: etc but the usefulness of the syntax does not extend to these cases.”

This is a fair point. -: does have meaning in SystemVerilog (reverses the order of the N elements starting from index I), but that isn’t immediately obvious (I don’t think, at least).

chepner · April 25, 2023, 10:43am

In-place modification of mutable objects.
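A short illustration of that difference, using built-in lists (the variable names are illustrative): += on a list calls list.__iadd__ and extends the existing object, whereas the spelled-out form builds a new list and rebinds the name.

```python
a = [1, 2]
alias_a = a
a += [3]          # extends the existing list in place
same_object_after_iadd = a is alias_a

b = [1, 2]
alias_b = b
b = b + [3]       # builds a new list and rebinds b
same_object_after_add = b is alias_b

print(alias_a, alias_b)
```

A slice expression only reads from the sequence, so there is no comparable in-place behaviour for +: to provide.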

abessman · April 25, 2023, 11:14am

Fair. Although, I would personally love if I could do slicable_thing[start /: n] and get n evenly sized chunks of slicable_thing. Yes, I realize there are a bunch of cases where that would break down.
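For comparison, one possible reading of that idea as an ordinary function. This is only a sketch: the name split_evenly, the start parameter, and the choice to raise an error when the length is not divisible are all assumptions, and that last choice is one way of handling the cases where the idea breaks down.

```python
def split_evenly(seq, n, start=0):
    """Split seq[start:] into n chunks of equal size.

    Hypothetical helper; raises ValueError when an even split is impossible.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    tail = seq[start:]
    size, leftover = divmod(len(tail), n)
    if leftover:
        raise ValueError("length is not divisible into n equal chunks")
    return [tail[i * size:(i + 1) * size] for i in range(n)]


print(split_evenly("abcdefgh", 4))
print(split_evenly("abcdefgh", 3, start=2))
```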