It seems Python has a near-universal reputation for being slow because it is not a compiled language (I might already be saying something incorrect…). I’m a scientist, so the times I care about performance are when I’m doing heavy numerical work. However, with some care, I can usually shape my calculations to make efficient use of numpy and vectorization, which makes them pretty quick. If I’m really clever I can sometimes get further speedups with multiprocessing. Or maybe the libraries I use already do some multiprocessing under the hood.
So, if you are using numpy effectively, is Python appreciably slower than other languages you might choose for numerical calculations? If the answer is no, there might still be an argument like: “yes, Python can be as fast as Rust or some flavor of C if you’re really careful, but in those other languages it is easy/natural to make your code fast”.
Another note: I’m specifically asking about numerical tasks. It might be that Python is slow for non-numerical tasks and there isn’t a “numpy equivalent” to speed it up. That would also be interesting to know. In my work it’s pretty rare that I’m doing something non-numerical in Python where performance matters.
If you’re really pressed for time, write the equivalent code in C by creating a C extension that does what the Numpy library functions are currently doing for you. Use a timer to compare the latency of the two, performing the same operation many times over (books generally run the same function in a loop 500k–1M times). That way you’ll have a practical measurement as opposed to remaining in a theoretical space.
Something like this:
from time import time

def timer(func):
    def wrapper():
        t1 = time()
        func()
        print('Runtime is: ', (time() - t1))
    return wrapper

@timer
def your_function_to_run():
    # Code to run here
    # I just entered a misc loop as a placeholder for now
    for i in range(1, 500000):
        t = i ** 2

your_function_to_run()
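To make the comparison concrete, here is a minimal, self-contained sketch of the same timing idea applied to a pure-Python loop versus its numpy equivalent. The function names and sizes are my own choices (the 500k squaring loop is borrowed from the placeholder above), and I use time.perf_counter, which is better suited to benchmarking than time.time:

```python
import time
import numpy as np

N = 500_000

def python_squares():
    # Pure-Python loop: square every integer below N
    return [i ** 2 for i in range(N)]

def numpy_squares():
    # Vectorized equivalent: one call into numpy's compiled code
    # (dtype pinned to int64 so large squares don't overflow on
    # platforms where the default integer is 32-bit)
    return np.arange(N, dtype=np.int64) ** 2

def time_it(func, repeats=10):
    # perf_counter is a monotonic, high-resolution timer
    start = time.perf_counter()
    for _ in range(repeats):
        result = func()
    elapsed = (time.perf_counter() - start) / repeats
    return elapsed, result

t_py, r_py = time_it(python_squares)
t_np, r_np = time_it(numpy_squares)
print(f"pure Python: {t_py:.5f} s/run, numpy: {t_np:.5f} s/run")

# Both versions should produce exactly the same values
assert np.array_equal(np.asarray(r_py), r_np)
```

On a typical machine the numpy version comes out far ahead, but the exact ratio depends on your hardware and array sizes, which is exactly why measuring beats guessing.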
Once you have a practical measurement, you’ll know whether Numpy is comparable to C or not, and whether it’s worth the effort of creating C extensions instead of using the Numpy library functions.
Thanks for this response. Yes I could benchmark code and probably should. But I am more wondering in a vague “theoretical” sense. If someone says to me: “Python is so slow, you should switch to X” can I correctly respond by saying: “Actually if you use numpy appropriately you can get practically as good performance in python as in X, at least for the performance critical numerical stuff”.
Is the answer really:
“there’s no general statement that can be said about this? It has to be considered on a case-by-case basis”?
Or can we generally say
“Yeah, on most problems you can probably get as good performance with numpy as you can with any other language”
or
“No, python is slow, period. If performance matters you should definitely move to another language.”
Well, from a practical standpoint, if you can get your hands on actual data to back up the assertion (for or against; actual measurements), why not? Why stay in a theoretical space?
On a related note, it is not only the language you’re using for your calculations (Python, C, etc.) that matters, but also the speed of the system you’re running on. If user A has a system that runs at, say, 5 GHz and uses C, and user B has a (hypothetical) 15 GHz system but uses Python, who’s faster? All else being equal, taking measurements should remove all doubt.
As a generalization, yes, a Python program will execute more slowly than an equivalent program written in a language that compiles to native code, like C or Rust.
But.
Native extension modules like Numpy are exactly that - compiled from C to native code. (Some popular native extensions for data processing are being written in Rust now too, I think?) And depending on your domain of expertise, it may be unlikely that you could write something like Numpy yourself in any language and have it be as fast as Numpy is - Numpy has been around long enough, and been used by enough different people, to discover and fix lots of problems and add optimizations bit by bit.
So another thing you could say - as a generalization - is that the parts of your program running within the native module are as fast as those parts could be in any other language. So if your program spends most of its time in Numpy, converting the remainder away from Python doesn’t carry much weight. The Python parts of your program might be slower, but if it was easier for you to write in Python than in C++, then you might spend more time rewriting your program than would be saved by making it faster.
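That “the remainder doesn’t carry much weight” point is essentially Amdahl’s law, and it’s easy to put numbers on. A quick back-of-the-envelope sketch (the 90/10 split and 100x figure below are assumptions for illustration, not measurements):

```python
# If a fraction f of the runtime is already inside fast native code
# (numpy), then speeding up only the remaining pure-Python part by a
# factor s gives an overall speedup of 1 / (f + (1 - f) / s).
def overall_speedup(native_fraction, python_speedup):
    return 1.0 / (native_fraction + (1.0 - native_fraction) / python_speedup)

# Assumed example: 90% of the time is spent in numpy, and a heroic
# rewrite makes the Python part 100x faster -- the whole program
# still gets only about a 1.11x overall speedup.
print(round(overall_speedup(0.90, 100), 3))
```

In other words, the more of your runtime already lives inside the native module, the less a rewrite of the surrounding Python can possibly buy you.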
There is a maxim called the “80/20 rule” that gets applied to a lot of areas of software development. (There are also plenty of variations and it’s not only used in software, so this is just the version I’ve heard.) The idea is that the first 20% of effort (or time, or money) yields 80% of the value, and that it takes the remaining 80% of your effort to get the last 20% of the value. There are a lot of areas in software engineering with diminishing returns as effort increases, and performance optimization is definitely one of them.
I think of Python in a similar way. It tends to be much easier to get something working than in other, more complicated languages, and the performance is usually good enough. Not 100% optimal of course, but good enough that I have to think hard about whether it is worth my time to make it faster. I would need to spend proportionally more effort to get proportionally less value, and all the while I could’ve been doing something else! I think it’s important to consider how difficult and time-consuming it is for you to write, debug, and update your program, not just how quickly it runs, as long as it’s fast enough that it doesn’t cause a problem for you.
The most common basic replacements I have used successfully:
for loop → numpy array operations
C: for (int i = 0; i < N; i++) a[i] = expr(b[i])
numpy: a = expr(b)
iteration variable → numpy.mgrid
C: for (int i = 0; i < N; i++) a[i] = expr(i)
numpy: a = expr(np.mgrid[0:N])
if statement in loop → masking
C: for (...) if (cond) a[i] = ...
numpy: a[cond] = ...
Surprisingly often that seems to be all that’s required, and (combinations of) these can turn e.g. 100x slower Python for loops into performance close to C (or maybe 10% to 50% slower).
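As a self-contained illustration of the three recipes together (the array size and expressions are my own, chosen just so the loop and vectorized forms can be checked against each other):

```python
import numpy as np

N = 1000
b = np.linspace(0.0, 10.0, N)

# Recipe 1: for loop -> whole-array operation
a_loop = np.empty(N)
for i in range(N):
    a_loop[i] = 3.0 * b[i] + 1.0
a_vec = 3.0 * b + 1.0            # one vectorized expression

# Recipe 2: iteration variable -> np.mgrid
c_loop = np.empty(N)
for i in range(N):
    c_loop[i] = i ** 2
c_vec = np.mgrid[0:N] ** 2       # mgrid[0:N] is just the integers 0..N-1

# Recipe 3: if statement in a loop -> boolean masking
d_loop = a_loop.copy()
for i in range(N):
    if b[i] > 5.0:
        d_loop[i] = 0.0
d_vec = a_vec.copy()
d_vec[b > 5.0] = 0.0             # assign only where the condition holds

assert np.allclose(a_loop, a_vec)
assert np.allclose(c_loop, c_vec)
assert np.allclose(d_loop, d_vec)
```

(For the one-dimensional case, np.arange(N) does the same job as np.mgrid[0:N]; mgrid earns its keep when you need several coordinate grids at once.)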
I have used these at some points:
Summation / Accumulation → numpy.cumsum
C: for (...) sums[i] = a[i] + sums[i-1]
numpy: sums = np.cumsum(a)
Differences → numpy.diff
C: for (...) diffs[i] = a[i] - a[i-1]
numpy: diffs = np.diff(a)
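A minimal sketch checking both of those recipes against their loop forms (the input data is made up):

```python
import numpy as np

a = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0])

# Running sum, loop style: sums[i] = a[i] + sums[i-1]
sums_loop = np.empty_like(a)
running = 0.0
for i in range(len(a)):
    running += a[i]
    sums_loop[i] = running
sums = np.cumsum(a)              # same length as a

# Adjacent differences, loop style: diffs[i] = a[i] - a[i-1]
diffs_loop = np.array([a[i] - a[i - 1] for i in range(1, len(a))])
diffs = np.diff(a)               # note: one element shorter than a

assert np.allclose(sums, sums_loop)
assert np.allclose(diffs, diffs_loop)
```

One thing worth keeping in mind: np.diff returns an array one element shorter than its input, so the off-by-one bookkeeping the C loop does by hand shows up as a length difference instead.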
Do you know more such “general recipes” that are often useful? Other clever techniques worth knowing? Do you consider them generic techniques or just lists of special cases? Do you memorize them, look them up in the numpy docs when needed, or consult resources like StackOverflow and LLMs?