Optimizing a list comprehension

My givens are:

  • a list of N rna sequences (in string form) W = [W0, W1, W2, …, W(n-1)]

  • a list of M queries, Q = [(i, j), (i1, j1), (i2, j2), …, (iN, jN)] (each 0 <= i & each j < N = len(W))

  • if W_i & W_j have different lengths, assume W_i < W_j

Prompt: write a function less_than(W, Q) that returns whether W_i is less than W_j in alphabetical order, for each query

currently, my code is:

def less_than(W, Q):
     return [W[i] < W[j] for i, j in Q]

the function calls I have tried with my jupyter notebook so far include:

less_than(
    ["A", "AA", "AC", "UUG", "AUG", "CGT"],
    [(0, 1), (1, 2), (2, 0), (3,4), (5,1), (4, 5), (2,4)],
)

this returned:

[True, True, False, False, False, True, True]

with a time of 946 ns ± 14.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

I have tried searching up ways to further optimize the code, such as seeing if maps or filter would optimize time complexity, but since I still need iteration and alphabetical comparison, I am not sure how I can make the code more efficient.

Hi Mai, and welcome!

We’re happy to help, but:

Please show us the code as text, together with any errors you got. Not as a screenshot or photo. We don’t use Photoshop to edit our code.

https://meta.stackoverflow.com/questions/285551/why-not-upload-images-of-code-errors-when-asking-a-question

It will also help to show so sample data, say, with N and M around 6 or 8.

So I don’t have any code errors right now, since the code works but I want to optimize it, but I will upload the code from my jupyter notebook. I’m not sure how to show the results from the jupyter notebook run without a screenshot.

I just updated it with some sample data and the time given for the run by using %%timeit

I’m not a Jupyter expert, but I would expect that selecting the text with your mouse and copying will work.

If it doesn’t, that would surprise and horrify me. Copy and paste is basic functionality, and it would be shocking if Jupyter didn’t support it.

Googling for more information:

https://www.startpage.com/sp/search?query=copy+text+from+jupyter+notebook

suggests that maybe it is harder than it should be.

Alternatively, you could try attaching your notebook, although I don’t know if that is permitted by the Discuss software.

Your code:

def less_than(W, Q):
     return [W[i] < W[j] for i, j in Q]

seems pretty optimal to me. What makes you think it isn’t?

So I thought it was optimal as well, but I had someone else review the code and they said the time complexity for the code was O(n^2), making it suboptimal, and that I should try to rework it. However, using research, trying stuff out, and google, using < or other operators is the only way to lexicographically compare strings in Python, and using for i, j in Q as part of a list comprehension seems the easiest way of iterating through Q without having to use a for loop or complicating matters by using something like the enumerate() method. Hence, why I am at a loss as to how I can go about further optimizing the code