Slicing Notation with Deque and Multi-Dimensional Slicing on Data Structures

jacob · December 3, 2019, 9:00pm

There are numerous sources summarizing python list slice notation (e.g. here), at least one good source on how to slice a deque, but I cannot find any source on why python does not include multi-dimensional slicing within its standard library or allow slicing notation with a deque.

My dream would be to implement a multidimensional circular buffer (a queue of lists), and access any part of the buffer efficiently and pythonically as illustrated below:

from collections import deque
from random import random

stack = deque([], maxlen=20)
# comb over a bunch of random data and FILO to the stack
[stack.append([i, random()]) for i in range(1000)]

print(stack) # everything is good so far, "2D" deques are allowed!

try: # it would be cool if I could do this
    print("Some interesting part of the stack:", stack[3:8])
except TypeError: # but, alas, I cannot
    print("Nope, deques can only take an index not slices.")

try: # and even cooler if I could do this
    print('Some interesting part of a "column" of the stack:', stack[3:8,1])
except TypeError: # though I definitely cannot do this, not even with a list
    print("Get real, this isn't even allowed with normal python lists.")

This raises two questions for me: First, why isn’t “2D” list slicing available for python lists? (I realize that I can use numpy and numpy arrays, but (at least to me) this seems like something that should be in the standard library.) Second, why isn’t list slicing notation of any kind allowed on a python deque?

In summary, I want:

interesting_part = stack[3:8]

rather than

from itertools import islice

interesting_part = deque(islice(stack, 3, 8))

and

interesting_sub_part = stack[3:8,1]

rather than

from itertools import islice

interesting_part = deque(islice(stack, 3, 8))
interesting_sub_part = deque([row[1] for row in interesting_part])

pitrou · December 4, 2019, 10:39pm

cc @rhettinger

mjpieters · December 5, 2019, 8:06pm

You are always free to implement your own __getitem__ hook, Python fully supports passing in multiple slices and indices:

>>> class SliceDemo:
...     def __getitem__(self, idx):
...         print(f"Indexed with {idx}")
...         if isinstance(idx, tuple):
...             for i, item in enumerate(idx):  print(f"{i}: {item}")
...
>>> sd = SliceDemo()
>>> sd[3:8, 1]
Indexed with (slice(3, 8, None), 1)
0: slice(3, 8, None)
1: 1

There are just no types in the standard library where this makes sense.

The deque type is a linked list and indexing past 0 and -1 is inefficient (read, takes O(N) time), so slicing those doesn’t make much sense. And what type of object would the slice produce? Is that another deque? Remember that slicing a list gives you a list, slicing a string gives you a string, etc.

steven.daprano · December 5, 2019, 8:55pm

“The deque type is a linked list and indexing past 0 and -1 is
inefficient (read, takes O(N) time), so slicing those doesn’t make
much sense.”

The travelling salesman problem is a classic example of a problem where
the known solutions are all very inefficient. Would you conclude that
because the algorithms are slow, it “doesn’t make much sense” to solve
the problem of finding the shortest path?

The implementation or efficiency doesn’t come into it. A deque is a
list-like sequence: if somebody needs elements 5:12 (say) of a sequence,
and it makes sense for a list or a tuple or an array, then it makes
sense if they are passed a deque as well, even if the deque is a little
bit slower.

If deques are linked lists, as you say, then implementing a slice in the
deque class can be more efficient than doing it by hand. Doing it by
hand guarantees that each access has to start from the beginning of the
linked list:

elements = [mydeque[i] for i in range(5, 12)]
result = deque(elements, mydeque.maxlen)

but the deque class itself can walk the list and extract the needed
elements in a single pass. So if you are worried about efficiency, it
makes sense to implement slicing in the deque class.

mjpieters · December 5, 2019, 10:25pm

Your list comprehension is indeed highly inefficient, but there is no need to use indexing. Just iterate over the deque instead. Use islice(), as Jacob already demonstrated:

elements = islice(mydeque, 5, 12)
result = deque(elements, mydeque.maxlen)

The deque iterator has the same internal access to the linked list.

You can do it just as efficiently by rotating the deque (unless you need to use the deque concurrently). Rotate 5 steps, copy value, rotate 1 step, copy, (repeat until done), rotate 12 steps in the opposite direction.

And efficiency absolutely comes into play; the deque documentation is quite explicit about this:

Indexed access is O(1) at both ends but slows to O(n) in the middle. For fast random access, use lists instead.

Different data structures have different pros and cons, deque random access is inefficient and you may as well use a list instead. If you need to be able to grow a list by appending on both ends, wrap two lists in a custom container and treat one as a reversed first half, for example.

jacob · December 6, 2019, 5:20pm

Thanks for the insights, @mjpieters! I have been implementing various custom multi-dimensional slicing methods for some time now, though I have never thought to modify the object type’s own __getitem__ attribute through a class , so thank you for this. I’m pretty new to programming, and your example really cemented some concepts for me. I built the below code to play around with:

# needed imports
from collections import deque
from itertools import islice

# imports to illustrate my point
from random import random
from timeit import timeit

class CustomDeque(deque):
    
    def __init__(self, *args, **kwargs):
        """ 
        
        My deque class initializes from the standard class to get all
        the sweet optimization and attributes of a standard deque
        
        """
        super(CustomDeque, self).__init__(args[0])

        
    def __getitem__(self, idx):
        """ 
        
        Define my own __getitem__ attribute to accept slices:
        
            if slice is really just an index, default to standard __getitem__
            
            if slice is "single dimensional", nuild a returnable deque
            by passing slicing parameters into islice
            
            if slice is "multi-dimensional" the "for" through the
            sliced deque to get only the column of interest and return
            
        """
        if isinstance(idx, int): # only need single index, revert to stdlib
            return super(CustomDeque, self).__getitem__(idx)
        elif isinstance(idx, slice): # use islice and return as deque
            return deque(islice(self, idx.start, idx.stop, idx.step))
        elif isinstance(idx, tuple): # there's more than just a slice
            return deque([row[idx[1]] \
                              for row \
                              in islice(self,
                                        idx[0].start,
                                        idx[0].stop,
                                        idx[0].step)])
            

custom_stack = CustomDeque([], maxlen=20) # create a custom deque
standard_stack = deque([], maxlen=20) # create a standard deque


for i in range(100):
    # comb over random data and FILO identical data to each deque
    row = [i, random(), random(), random()]
    custom_stack.append(row)
    standard_stack.append(row)

test_cnt = 1000000

# print average execution times for various slicing operations
print('\n')
print('standard index:',
      timeit("standard_stack[6]",
             globals=globals(),
             number=test_cnt) / test_cnt)
print('custom index:',
      timeit("custom_stack[6]",
             globals=globals(),
             number=test_cnt) / test_cnt)

print('\n')
print('standard slice:',
      timeit("deque(islice(standard_stack, 3, 8))",
             globals=globals(),
             number=test_cnt) / test_cnt)
print('custom slice:',
      timeit("custom_stack[3:8]",
             globals=globals(),
             number=test_cnt) / test_cnt)

print('\n')
print('standard 2D slice:',
      timeit("deque([row[2] for row in islice(standard_stack, 3, 8)])",
             globals=globals(),
             number=test_cnt) / test_cnt)
print('custom 2D slice:',
      timeit("custom_stack[3:8, 2]",
             globals=globals(),
             number=test_cnt) / test_cnt)

This outputs the following timing results:

standard index: 3.208229999290779e-08
custom index: 3.5772309999447314e-07

standard slice: 2.502960000419989e-07
custom slice: 6.012820000178181e-07

standard 2D slice: 5.054112999932841e-07
custom 2D slice: 1.1654833999928087e-06

My custom deque implementation is an order of magnitude slower than if I pepper my code with a bunch of deque([row[idx] for row in islice(my_deque, start, stop)]) statements, which would quickly degrade readability and maintainability. Which brings me back to my concern, I have always had a nagging suspicion that I am deploying a sub-optimal implementation. The benefit of putting something in the standard library is consensus and significant peer review. I can then implement my slicing and rest assured that it is probably the best possible way of doing it with the present language, all the while producing readable and maintainable code.

For context, I am integrating Python with data acquisition, and then making automated decisions based on data analysis (e.g. data smoothing and discrete calculus to find data peaks, etc). These automated decisions culminate in actuation of devices in the physical world (e.g. relays, solenoids, etc). My deques are FILO’d with the data window under analysis at any given time. The “two dimensions” comes from multiple data acquisition channels. I’d like for all this to happen as close to real time as reasonably achievable. I am also fairly new to programming, which is why I would like to leave the optimization to all the smart and talented folks in core development (I aspire to be one of these folks one day).

I think I am caught in a trade-off with lists or deques. If I use lists, I have more efficient random access, but then I would need to define my own pushing and popping routines. If I use deques, I have less efficient random access, but I have optimized my queue functionally.

An obvious question would be: why don’t I use a “faster” language? The answer is that, for various reasons, my human-machine-interface (HMI) is a common web browser. I serve the machine controls and data insights to a browser via JavaScript, HTML5, and Python flask. I have yet to find a language better suited for melding web development and back end data science under one roof (though again, I am relatively inexperience, if you know of one, let me know).

mjpieters · December 6, 2019, 7:35pm

Your interpretation of what these time trials mean is flawed, I’m afraid.

Compared to your code operating on the standard deque object, you are effectively measuring the cost of:

calling a Python function
looking up the global objects isinstance and int
executing isinstance()

and then, focusing on the single item case:

looking up the super and CustomDeque globals
creating a super() object
looking up the __getitem__ attribute on the super() object

These are all fixed, and relatively small costs. These numbers are not going to grow as your deque contents grow, for example.

Your tests are not an order of a magnitude slower here, they range in the tens-to-low-hundreds of nanoseconds slower, a price you’d pay for any minimally complex Python method. Your standard indexing timings come down to 32 nanoseconds versus 350 nanoseconds, and the ‘standard slice’ is 25 versus 60 nanoseconds. Unless you are executing these operations in a tight loop, hundred-thousands of times, you are not even going to see a real difference.

You can micro-optimize some of the code. For example, if you only support int and slicing you could use is on the result of type(); slice can’t be subclassed anyway and unless you need to support named tuples I’d be very surprised if your code would ever encounter a tuple subclass:

    type_ = type(idx)
    if type_ is int:
        return super().__getitem__(idx)   # Python 3 lets you use super() without arguments
    elif type_ is slice:
        return deque(islice(self, idx.start, idx.stop, idx.step))
    elif type_ is tuple:
         return deque([...])

Other options are to not support cooperative subclassing (so use return deque.__getitem__(self, idx) instead of using super()), you could cache globals as locals (using ‘private’ argument names, so def __getitem__(self, idx, _type=type, _int=int, _super=super, _slice=slice, _tuple=tuple, _deque=deque), locals are faster to resolve). For the tuple case you could avoid repeatedly using idx[0] and idx[1] by caching the outcome of those expressions. You could use try: ... except around the common single-instance case and only test for alternative idx types in the exception handler.

Last but not least, you could use cython to implement the method, and so get native code speeds.

You also say:

Then you are using the wrong programming language if costs like these are going to bother you. Python is not going to scale for tasks that require ‘close to real time’ responses, anyway.

jacob · December 6, 2019, 8:23pm

Excellent feedback, @mjpieters. This conversation has opened a lot of avenues for me to explore! Thanks for your time.

I based my claim on the logarithmic definition. A change from 10 nanoseconds to 100 nanoseconds is a factor of 10, or an order of magnitude (i.e. 10e-09 to 10e-08 is one order of magnitude increase or 10%). Though I agree with you, my code leaves a lot to be desired. This is precisely my point. Why leave all this customization and optimization up to each individual programmer when it could be done in the standard library with peer review and consensus?

pf_moore · December 6, 2019, 8:41pm

The usual answer to this question is that standard library code has to be optimal for all use cases, and so makes compromises that might be detrimental for your particular use. On the other hand, if you write your own code, you can optimise and customise it in precisely the way that works best for your code, getting the best possible result.

It’s certainly true that programmer time is often more important than such optimisations, so a standard “best for everyone” solution is often preferable to a custom solution simply because grabbing a stdlib function takes essentially no development time. But it is a trade-off, and the best decision isn’t always in favour of adding a stdlib function.

jacob · December 6, 2019, 9:16pm

@pf_moore, thanks for the input. I agree. My question then would be do you think enough people routinely do multi-dimensional slicing to warrant a change? What I am describing here is really two changes:

Support multi-dimensional slicing notation of lists within the stdlib.
Generalize list slicing notation to deques (I realize this is probably a harder sell).

thomas · December 6, 2019, 9:35pm

The thing to realise here is that there are no multi-dimensional lists in the standard library. Lists only have one dimension. You can have nested lists, but it doesn’t make sense for operators or methods on the outer list to act on inner lists as well – because they don’t have to be lists, and they don’t all have to be the same type, and they could all behave differently even when they are list subclasses. If you really wanted this, you’d have to write a function for it, not a method. Or, use an actual multi-dimensional data structure, like numpy arrays.

jacob · December 6, 2019, 9:42pm

This makes a lot of sense @thomas. I now see why this is not a good idea within the stdlib. Thanks everyone for all the great feedback! I have learned a lot.

jacob · December 6, 2019, 11:07pm

@mjpieters, FYI: I tried various versions of my CustomDeque class while incorporating your optimization suggestions. My custom __getitem__ method is on the same order of magnitude as the single line islice statement for timing. Of course traditional deque index retrieval is still, and always will be, faster than my custom method. I incorporated all your suggestions aside from the try: common case; except TypeError: test for various odd idx types, which ended up slower with all the the other optimizations present.

Thanks again! HUGE help and advancement in my learning!

storchaka · December 12, 2019, 9:02am

For slicing support in deque see https://bugs.python.org/issue17394. There is a ready patch, but it adds so much complexity, that it was decided that the feature is not worth it.