Run Python natively on the GPU

There should be a feature that allows Python to use the GPU natively to run code faster; it would also allow the use of graphics APIs and more.

Oh my sweet summer child…

This idea has a lot of kinks and I can explain them to you.
First, do you want a more in depth description, or a short response to this?

First, do you want a more in depth description, or a short response to this?

In depth please! I’ll put the kettle on.

The Hierarchy of Computing

At the most fundamental level, we have hardware. On the opposite end of the spectrum are high-level, object-oriented programming languages like Python. These languages are easier to understand but rely on lower-level languages to function, much like a stack of books where each layer supports the one above it.

Applying the Analogy to Programming

Integrated Development Environments (IDEs) such as PyCharm come with an interpreter, whose purpose is to execute the code. Python code can be packaged into standalone executables, but those executables still bundle an interpreter and rely on a command-line interface to run properly. These supporting pieces are known as dependencies.

Dependencies are essential because high-level languages like Python cannot execute code or render a user interface without external modules, often written in lower-level languages like C.
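You can see this C layer from inside Python itself. A small stdlib-only sketch: many "Python" standard library functions are, in CPython, compiled C functions exposed to the interpreter.

```python
# Stdlib-only sketch: peeking at the C layer beneath Python's standard library.
import math

# math.sqrt is not written in Python -- in CPython it is a C function
# exposed to the interpreter as a "builtin".
print(type(math.sqrt))        # <class 'builtin_function_or_method'>
print(math.sqrt.__module__)   # 'math' -- a C extension module in CPython
```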

To illustrate, when a Python script runs, it goes through several layers:

  1. The Python script is written and run.
  2. The interpreter (written in C, in CPython’s case) compiles the script to bytecode and executes it.
  3. The interpreter writes “hello world” to standard output.
  4. That write is a system call handled by the kernel.
  5. The kernel hands the text to the terminal, and the display server works out which pixels to draw.
  6. The display server forwards its drawing commands to the GPU.
  7. The GPU sends the finished frame through the HDMI port to the display, which draws the pixels.
  8. The “hello world” program completes successfully and exits with code 0.
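You can even poke at this layering from Python. As a hedged sketch (it assumes a POSIX system such as Linux or macOS; the names would differ on Windows), the stdlib ctypes module lets a script call into the same C library the interpreter itself sits on top of:

```python
# Sketch, POSIX-only: calling the C library directly from Python via ctypes.
import ctypes

libc = ctypes.CDLL(None)           # the C library already loaded in this process
n = libc.printf(b"hello world\n")  # the same C printf CPython is built on
print("printf reported", n, "characters")
```

Note that even this "direct" call only drops one layer; the write still goes through the kernel and display stack described above.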

Python cannot directly interact with the GPU because it operates at a higher level in the computing hierarchy. Achieving direct interaction would mean dropping down to lower-level languages like C, bypassing the kernel and display server, running with elevated permissions, and more.

Conclusion

Direct interaction between Python and the GPU is not feasible due to the hierarchical nature of computing. For graphical applications in Python, consider using libraries such as Pygame.

Thank you for your attention. If any part of this explanation is unclear or overly technical, please let me know.

Fair. I would just point out the unstated assumptions in OP’s request: all workloads (including non-parallelisable ones), all Python code, “the GPU” implying support for every GPU on the market, and no stated OS or platform, implying support for all of them. Not to mention that “and more” means scope creep is baked in from the very first sentence of the feature request.

Despite all that, there are Python libraries that can run certain types of code faster on certain GPUs, on certain machines and OSs, e.g.:

Thank you for the clarification.

I must have missed when the user added “and more” to the end of the request.
However, part of my response was meant only as an example; the main point is “No, Python cannot run on the GPU.”

Despite all that, there are Python libraries that can run certain types of code faster on certain GPUs, on certain machines and OSs, e.g.:

I think ultimately, in conjunction with your statement, the user should consider Python libraries as a solution to their idea.

Problem solved.

The GPU is great at parallel computing.
Python is not parallel in nature.

It is therefore unlikely that code would “run faster” if the Python interpreter itself ran on the GPU.
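To make “parallel in nature” concrete, here is a plain-Python contrast (illustrative only, no GPU involved) between a workload a GPU could parallelise and one it fundamentally cannot:

```python
# Elementwise operations are "embarrassingly parallel" -- every output
# element is independent, so thousands of GPU threads could each do one.
data = list(range(8))
squares = [x * x for x in data]   # parallelisable: no element depends on another

# A running sum is inherently sequential -- step i needs the result of
# step i-1, so extra cores cannot help no matter where the code runs.
acc = 0
running = []
for x in data:
    acc += x                      # each iteration depends on the previous one
    running.append(acc)

print(squares)   # [0, 1, 4, 9, 16, 25, 36, 49]
print(running)   # [0, 1, 3, 6, 10, 15, 21, 28]
```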

Running code on a GPU is becoming easier.
Someone ported the classic DOOM game to run on the GPU: https://www.phoronix.com/news/DOOM-ROCm-LLVM-Port

That’s fair.

If you REALLY want to run Python code faster, use libraries like NumPy, or an alternative interpreter like PyPy.
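A minimal sketch of the NumPy route (it assumes NumPy is installed): the speedup comes from replacing the Python-level loop with a single call into compiled code, no GPU required.

```python
# Sum of squares two ways: a pure-Python loop vs. one vectorised NumPy call.
import numpy as np

n = 1_000_000
slow = sum(x * x for x in range(n))      # interpreter executes each step

a = np.arange(n, dtype=np.int64)
fast = int((a * a).sum())                # one call into NumPy's compiled C loop

assert slow == fast                      # same answer, very different speed
print(fast)
```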

The answer to everything Python is libraries.

Heheh, yeah, there’s a lot of unstated assumptions here that others have addressed. But general meta-advice here is that if you ever find yourself asking, “They should just do this…” on a topic you don’t know a lot about then there’s probably several good reasons why people aren’t already doing that. It’s better to phrase your question as, “Can you explain why they don’t do this…?”

(It’s kind of like asking “After airline crashes, they always find the black box intact, so why don’t they just make the entire airplane out of the stuff they make the black box?” If it were that easy, they’d already be doing it, so it’s better to ask why that doesn’t work rather than suggest they do it.)

I’ve benefited from GPU acceleration through using PyTorch. It uses CUDA under the hood if you have an NVIDIA GPU. Python is no less prepared for GPUs than other general-purpose programming languages. For example, while Go more naturally uses multiple CPU cores for some applications, facilitated by channels and goroutines, Go code is no closer to using GPU instructions or to vectorising and staging data than Python is. Go must use bindings to CUDA or OpenCL to use a GPU, just like Python does.

:wave: I work on the CUDA Python program at NVIDIA

Here is an open resource to learn the details of getting closer to the goal of putting Python on a GPU. As most have said, general Python code (like general C or Fortran) is going to be slower on a GPU than a CPU.

Libraries like PyTorch, accelerated pandas, and NetworkX all need only small code changes to run natively on the GPU. Get started there; then you can do more with CuPy as a NumPy replacement, and as you find you need more general code, Numba-CUDA can help compile user-defined functions and types directly for the GPU.

Can NetworkX do all the calculations on a GPU?

No, not all at this time. There are 60 algorithms listed in the product release. Searching ‘cugraph’ in the NetworkX documentation leads to details on which are supported. The benchmarking code also shows how different use cases currently perform.