Advanced slicing rules?

Mitzleplick · August 14, 2022, 9:34pm

So I’m taking a quiz and the first question asks something that was never written or spoken once in the section on lists as they pertain to slicing… which is absolutely infuriating as I am someone who tries to understand EVERYTHING before moving on:

Question 1:
What are the values of list_b and list_c after the following snippet?

list_a = [1, 2, 3]

list_b = list_a[-2:-1]

list_c = list_a[-1:-2]

To me this answer would be:
list_b = [2,3]
list_c = [3,2]
… but that is not one of the answers given:

Incorrect answer. Please try again.

list_a[-2:-1] means “start at the last but one element (inclusive) and go until the last element (exclusive)”, which essentially means “take the second element from the end”. The other slice, [-1:-2], doesn’t make sense because it starts with a higher index and finishes with a lower one, so it returns an empty slice.

So apparently the higher index, say “4” out of indexes 0,1,2,3,4 can NEVER be placed before a lower index like “2” in a result? What kind of rule is that? What is the purpose of that rule? What other nonsense rules are out there I need to be aware of “4 dimensional nested loops only work if the unicode characters are entered during a waxing moon phase while eating a peanut butter and jelly sandwich?”

rob42 · August 14, 2022, 10:09pm

If you have a little think about it, that’s right: you can’t start at 4 and finish at 2, any more than you can start at -1 and finish at -2

 position:   |  0  |  1 |  2  |  3  |  4  |
 data:       |  a  |  b |  c  |  d  |  e  |
- position:  | -5  | -4 | -3  | -2  | -1  |

Easier to see with letters:

list_a = ['a', 'b', 'c']

 position:  |  0  |  1 |  2  |
 data:      |  a  |  b |  c  |
- position: | -3  | -2 | -1  |


list_b = list_a[-2:-1] # ['b']

list_c = list_a[-1:-2] # []

MRAB · August 14, 2022, 10:09pm

The full form of slicing is a[start : end : step].

If the step is positive (it defaults to 1 if omitted), you go in ascending index from start up to, but excluding, end.

If the step is negative, you go in descending index from start down to, but excluding, end.

Including the start and excluding the end is called a “half-open interval”, and using it makes code simpler most of the time because there’ll be fewer places where you have to add or subtract 1.
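A quick sketch of those two directions, using the list from the quiz. The variable names are only for illustration:

```python
list_a = [1, 2, 3]

# Positive step (default 1): ascending from start, stopping before end.
forward = list_a[-2:-1]       # start at index 1, stop before index 2

# Positive step, but start is already past end: nothing to collect.
empty = list_a[-1:-2]

# Negative step: descending from start, stopping before end.
backward = list_a[-1:-3:-1]   # start at index 2, stop before index 0

print(forward, empty, backward)  # [2] [] [3, 2]
```

So the original guess of [3, 2] is reachable, but only by asking for a negative step explicitly.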

steven.daprano · August 15, 2022, 12:33am

When you take a slice, what the interpreter does is something very similar to a loop:

current index = start index
while current index < end index:
    copy the item at the current index
    add 1 to the current index

Consequently, if the starting position is already greater than the ending position, the loop ends immediately and no items are copied.

The implication of that is that the starting index has to be to the left of the ending index, or you will get an empty result.

One easy way to think of slicing is that the positions are between the items. So your list_a = [1, 2, 3] might be labelled with positions like this:

    positions:  0   1   2   3
    values:     | 1 | 2 | 3 |

We can use negative indices to label all positions except the final one:

    positions:  0   1   2   3
    values:     | 1 | 2 | 3 |
    -ve pos:   -3  -2  -1

You will note that we get the regular zero-or-positive index position from the negative position by just adding the length of the list to it.

Using the rule we established earlier, we can say that slicing proceeds from left to right, and the starting position must be to the left of the ending position. So if you slice starting at -2 and ending at -1, you get a single slot, containing 2, so the result of list_a[-2:-1] is a list with one value, [2].

But if you try to slice starting at -1 and ending at -2, the end position is to the left of the start position, so we get no slots, and the result of list_a[-1:-2] is just the empty list [].

Or in other words, the stopping condition “stop when you reach or exceed index -2” is true immediately.
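The loop and the "add the length" rule above can be sketched in Python roughly as follows. This is only a model for a step of 1, not how the interpreter is actually implemented, and simple_slice is a made-up name:

```python
def simple_slice(items, start, stop):
    """Rough model of items[start:stop] with a step of 1 (illustration only)."""
    n = len(items)
    # Negative positions count from the right: add the length to convert them.
    if start < 0:
        start += n
    if stop < 0:
        stop += n
    # Clamp out-of-range positions, as real slicing does.
    start = max(0, min(start, n))
    stop = max(0, min(stop, n))

    result = []
    index = start
    while index < stop:   # false immediately when start >= stop
        result.append(items[index])
        index += 1
    return result


list_a = [1, 2, 3]
print(simple_slice(list_a, -2, -1))  # [2]
print(simple_slice(list_a, -1, -2))  # []
```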

As for what other “nonsense rules” are out there: remarkably few. And like this one, even if they seem like nonsense at first glance, they probably aren’t.

komoto48g · August 16, 2022, 1:48pm

Borrowing Steven’s explanation,

    positions:  0   1   2   3   4   5
    values:     | a | b | c | d | e |
    -ve pos:   -5  -4  -3  -2  -1

slicing in normal order is easy to understand.

>>> text = "abcde"
>>> text[0:4]
'abcd'

But to understand slicing in reverse order such as,

>>> text[4:0:-1]
'edcb'
>>> text[-1:-5:-1]
'edcb'

I think it is better to explain as follows:

    positions:  | 0 | 1 | 2 | 3 | 4 |
    values:     | a | b | c | d | e |
    -ve pos:    |-5 |-4 |-3 |-2 |-1 |
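One consequence worth seeing in code: with a negative step the end is still excluded, so text[4:0:-1] never reaches 'a'. A small sketch of the usual ways around that, assuming the same text as above:

```python
text = "abcde"

print(text[4:0:-1])    # 'edcb'  (stops before index 0)
print(text[4::-1])     # 'edcba' (no end given, so it runs through index 0)
print(text[::-1])      # 'edcba' (start and end both omitted)

# -1 as the end means "index 4" here, not "one before index 0",
# so this slice is empty.
print(repr(text[4:-1:-1]))  # ''
```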

Mitzleplick · August 17, 2022, 1:10pm

Please correct me if I am wrong in this statement, but slicing can be thought of as a window into a specific section of data and since it is only a “view only” action, it cannot re-arrange the data into something new (immutable). This is why pulling a lower value negative slice (-1) cannot be followed by a higher value negative slice (-3), because the view would require the data to be displayed in a way that is not true to the original format of what is being sliced.

John_Carter · August 20, 2022, 3:09pm

Hi Brad, just to confuse matters, read up on the built-in class slice in
Built-in Functions.
Basically, given

a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
a[3:4] == a[slice(3, 4)]

Now the docs say a slice represents the set of indices specified by range(start, stop, step).
We can see these with

print(list(range(8, 3, -1))) # [8, 7, 6, 5, 4]

list() in the line above is used to convert the indices generated by range so we can print them.
So

print(a[7:2: -3]) # [7, 4]

So yes, slicing can be thought of as a view, but it actually generates a new list:

b = a[7:2:-2]
print(b) # [7, 5, 3]

So the idea that the order of a slice can not be changed is wrong
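To make the link between slice and range concrete, here is a small sketch using slice.indices(), which turns a slice into the concrete (start, stop, step) for a sequence of a given length. The variable names are only illustrative:

```python
a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

s = slice(7, 2, -2)
start, stop, step = s.indices(len(a))   # concrete values for a list of length 10
picked = [a[i] for i in range(start, stop, step)]

print(picked)               # [7, 5, 3]
print(picked == a[7:2:-2])  # True

# The slice is a new list: changing it leaves a untouched.
b = a[7:2:-2]
b[0] = 99
print(a[7])                 # 7
```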
John

Mitzleplick · August 24, 2022, 4:09pm

So just to be clear if I type this as [8:3:-1] it is interpreted as “Start at index 8, Stop at index 3, and move negatively in increments of one index at a time”? If you had not specified the -1 it would have errored because it cannot move in a positive direction (the default direction) from index point 8 and reach index point 3, is that correct?

So we use your example:

a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

If we wanted to check that variable b begins with the same characters as variable a ends with could we do something like:

if a[len(a)::-1] == b[0:len(b)]:
    return True
else:
    return False

MRAB · August 24, 2022, 6:56pm

range wouldn’t error, it would return an empty range:

>>> list(range(8, 3, -1))
[8, 7, 6, 5, 4]
>>> list(range(8, 3, 1))
[]

Your second example could be shortened to:

return a[ : : -1] == b

Mitzleplick · August 25, 2022, 5:21pm

Are there some unwritten rules pertaining to slicing under a function that would yield different results than using print()?

I have a training exercise:

Complete the solution so that it returns true if the first argument(string) passed in ends with the 2nd argument (also a string).

Examples:
solution('abc', 'bc') # returns true
solution('abc', 'd') # returns false

In testing using print() I’m able to get the correct result, but when I plug it into the program it fails the test:
My test:

string = [0,1,2,3,4,5]
ending = [3,4,5]

if string[len(ending)::] == ending:
  print('true')

Result: true

Since the program uses a function I have to modify it to “return” instead of print() but for some reason it is not passing:

What I’m entering:

def solution(string, ending):
    if (string[len(ending)::]) == ending:
        return True
    else:
        return False

The test uses:

test.assert_equals(solution('abcde', 'cde'), True)
test.assert_equals(solution('abcde', 'abc'), False)
test.assert_equals(solution('abcde', ''), True)

The result shows:

Results:

False should equal True

Test Passed

False should equal True

So the only thing I can think is that slicing under a function has different rules…

BowlOfRed · August 25, 2022, 7:16pm

Your algorithm is incorrect and your test case is insufficient (it happens to work for that one, but not for others). Let’s make your ending slightly shorter:

string = [0,1,2,3,4,5]
ending = [4,5]

if string[len(ending)::] == ending:
  print('true')
else:
  print('false')

false

Is that what you expect?
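Printing what the slice actually contains may make the coincidence easier to spot. A small sketch with the same data:

```python
string = [0, 1, 2, 3, 4, 5]

for ending in ([3, 4, 5], [4, 5]):
    # string[len(ending):] skips the first len(ending) items
    # and keeps everything after them.
    tail = string[len(ending):]
    print(ending, tail, tail == ending)

# [3, 4, 5] [3, 4, 5] True
# [4, 5] [2, 3, 4, 5] False
```

The first case only matches because the list happens to be exactly twice as long as the ending.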

aivarpaalberg · August 25, 2022, 8:05pm

Brad Westermann:

I have a training exercise:
Complete the solution so that it returns true if the first argument(string) passed in ends with the 2nd argument (also a string).

Examples:
solution('abc', 'bc') # returns true
solution('abc', 'd') # returns false

The exercise states clearly that these are strings, and in Python there is a str.endswith method. So addressing the exercise, not slicing:

>>> 'abc'.endswith('bc')
True
>>> 'abc'.endswith('d')
False

Regarding lists/tuples: these don’t have an endswith method, so one way to check the ending of any reversible sequence is to start from the end, using zip with a short-circuiting all:

>>> sample = [0,1,2,3,4,5]
>>> end = [3, 4, 5]
>>> all(x == y for x, y in zip(reversed(sample), reversed(end)))
True
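One caveat with the zip approach: zip stops at the shorter input, so an ending that is longer than the sample can still compare equal. A hedged sketch of a wrapper that guards against that (ends_with is a hypothetical helper name, not a built-in):

```python
def ends_with(sequence, ending):
    """Hypothetical helper: True if sequence ends with the items of ending."""
    # zip stops at the shorter input, so rule out an over-long ending first.
    if len(ending) > len(sequence):
        return False
    return all(x == y for x, y in zip(reversed(sequence), reversed(ending)))


print(ends_with([0, 1, 2, 3, 4, 5], [3, 4, 5]))  # True
print(ends_with([5], [4, 5]))                    # False

# Without the length check, the same comparison is fooled:
print(all(x == y for x, y in zip(reversed([5]), reversed([4, 5]))))  # True
```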

abessman · August 25, 2022, 8:39pm

That seems unnecessarily complicated. Why not make use of negative indices, as discussed above?

>>> sample = [0,1,2,3,4,5]
>>> end = [3, 4, 5]
>>> sample[-len(end):] == end
True

Mitzleplick · August 25, 2022, 9:19pm

Aivar Paalberg:

The exercise states clearly that these are strings, and in Python there is a str.endswith method. So addressing the exercise, not slicing:
>>> 'abc'.endswith('bc')
True
>>> 'abc'.endswith('d')
False

I was unaware of str.endswith method. Thank you for letting me know about this.
Since the variable could be any combination of strings I would need to lookup how to stick that to a variable. I keep getting confused on whether to put things before or after the variable.

OMG!! I tried the negative in every spot except in front of the length… that fixed the first two, thank you.

The last test is running it against an empty string…I passed that test by accident once while playing around with it so hopefully I can find the deeper meaning behind why that would return True and maybe add an elif.

MRAB · August 26, 2022, 12:19am

There’s a gotcha there: if end is empty, then sample[-len(end):] is sample[0:], or sample, and it’ll be true only if sample is also empty.
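The gotcha can be reproduced in a few lines, using the same sample list as above:

```python
sample = [0, 1, 2, 3, 4, 5]
end = []

print(-len(end))                  # 0, because -0 == 0
print(sample[-len(end):])         # [0, 1, 2, 3, 4, 5] (the whole list, not an empty tail)
print(sample[-len(end):] == end)  # False
```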

Mitzleplick · August 26, 2022, 1:36am

Final solution I went with was (passed all the extra tests as well):

def solution(string, ending):
    if (string[-len(ending)::]) == ending:
        return True
    elif (string[:-len(ending):]) == ending:
        return True
    else:
        return False

They also show you other solutions once you finish yours…

def solution(string, ending):
    return string.endswith(ending)

Now I see what you were talking about and how it should have been assigned to the variable. Much more simple and Pythonic! (hope I’m using that right lol)

MRAB · August 26, 2022, 3:04am

Using your solution:

>>> solution('ba', 'b')
True

steven.daprano · August 26, 2022, 4:47am

Unfortunately this shows the limitations of testing. Despite passing the exercise, your solution has a bug:

If the string begins with the ending, sometimes it returns True when it should return False.

Try your function:

def solution(string, ending):
    if (string[-len(ending)::]) == ending:
        return True
    elif (string[:-len(ending):]) == ending:
        return True
    else:
        return False

with these inputs:

solution('running', 'ing')     # Should return True
solution('inglorious', 'ing')  # Should return False
solution('ingham', 'ing')      # Should return False

If you try it, you will find that your solution passes the first two tests but fails the third, which is a “False positive” – it wrongly reports that ‘ingham’ ends with ‘ing’ when it should report that it doesn’t.

The problem with your solution is the second condition elif (string[:-len(ending):]) == ending which tests:

let N be the length of the given suffix (ending);
if the slice from the start of the string to N characters from the end equals the suffix, return True

In other words, if the source string starts with the suffix (ending), and the remaining bit has the same length as the suffix, then it will wrongly return True:

# Each of these should return False
solution('abcdwxyz', 'abcd')
solution('.2', '.')
solution('suffix------', 'suffix')

We can fix your solution by removing the second condition altogether:

def solution(string, ending):
    if string[-len(ending):] == ending:
        return True
    else:
        return False

The lessons here are:

Even experts can get it wrong. The exercise failed to check this case, and so wrongly accepted your buggy solution.
Tests can demonstrate the presence of bugs, but not their absence.
Only careful thought and logical reasoning can prove that code is correct.
Tests should check both:
- input which should pass does pass;
- input which should fail does fail.
Tests need to be chosen carefully!

Sometimes choosing the right tests is as much work as writing the code in the first place.
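One way to sketch such a check is to compare a candidate against a trusted reference, here str.endswith, on inputs that should pass and inputs that should fail. The function below reproduces the two-condition version under the made-up name two_condition_solution, and the case list is only illustrative:

```python
def two_condition_solution(string, ending):
    # Same logic as the two-branch version discussed above.
    return (string[-len(ending):] == ending
            or string[:-len(ending)] == ending)


cases = [
    ("running", "ing"),        # should pass
    ("inglorious", "ing"),     # should fail
    ("ingham", "ing"),         # should fail
    ("abcdwxyz", "abcd"),      # should fail
    (".2", "."),               # should fail
    ("suffix------", "suffix"),  # should fail
    ("ba", "b"),               # should fail
]

# Collect every case where the candidate disagrees with str.endswith.
mismatches = [
    (string, ending)
    for string, ending in cases
    if two_condition_solution(string, ending) != string.endswith(ending)
]
print(mismatches)
```

Every pair that ends up in mismatches is a false positive of the kind described above.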

Mitzleplick · September 3, 2022, 1:37pm

Steven D'Aprano:

We can fix your solution by removing the second condition altogether:
def solution(string, ending):
    if string[-len(ending):] == ending:
        return True
    else:
        return False

when tested using this solution, I get the following error:

Expected solution('abc', '') to return True: False should equal True

I don’t understand why it should return True except when referencing this example below there is a “void” at the -0 index which matches the “void” in ‘’:

Steven D'Aprano:

One easy way to think of slicing is that the positions are between the items. So your list_a = [1, 2, 3] might be labelled with positions like this:
    positions:  0   1   2   3
    values:     | 1 | 2 | 3 |
We can use negative indices to label all positions except the final one:
    positions:  0   1   2   3
    values:     | 1 | 2 | 3 |
    -ve pos:   -3  -2  -1

If that is the case, how do we tailor the syntax to accept the “void” as True? …and I guess why would we ever need to do that?

MRAB · September 3, 2022, 6:10pm

When ending is ‘’, string[-len(ending):] is equivalent to string[-0:] or string[0:], which is the entire contents of string.

The general fix is string[len(string) - len(ending):] == ending, or, if you’re working with strings, string.endswith(ending).
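A minimal sketch of that general fix, checked against str.endswith on cases raised in this thread. The case list is only illustrative:

```python
def solution(string, ending):
    # Computing the start position from the left avoids the -0 == 0 trap:
    # for an empty ending the slice starts at len(string) and is empty.
    return string[len(string) - len(ending):] == ending


cases = [
    ("abc", "bc"), ("abc", "d"), ("abc", ""), ("", ""),
    ("ingham", "ing"), ("running", "ing"), ("ba", "b"), ("bc", "abc"),
]
for string, ending in cases:
    assert solution(string, ending) == string.endswith(ending)

print(solution("abc", ""))  # True
```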