TL;DR - Imagine you write your code in Python, and in CI an LLM tool rewrites it to Go and re-exposes the original Python API as a Python package, released with a Go equivalent of maturin.
- You get the speed of Go, but you develop your code in Python.
- The users of your package still use it as Python package.
- The LLM automatically does the conversion.
- The LLM also writes tests to ensure that the conversion is correct.
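From the package author's side, the pipeline in the TL;DR could look something like this. A minimal sketch - the module and function names (`mypkg`, `word_count`) are illustrative, not part of any real tool:

```python
# mypkg/text.py - the code you actually write and maintain, in plain Python.
def word_count(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())

# After CI transpiles this to Go and repackages it (gopy playing the role
# that maturin plays for Rust), downstream users keep the exact same import:
#
#   from mypkg import word_count
#
print(word_count("speed of Go, ergonomics of Python"))  # prints 6
```

The point being: nothing changes for users of the package - only the wheel they install is backed by compiled Go instead of interpreted Python.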
I’d like to explore this in maybe 3-6 months’ time, once I’m done with the work around my other project, django-components. So I just wanted to post this here and let it ferment.
If you find this interesting and would like to join me, leave me a message. Or even if you think this could never work. Also, ideas on how this could be funded are welcome.
Background
I’m a freelance developer (here’s more on me) and for the past 1.5 years I’ve also been a co-maintainer / developer of django-components - a frontend framework for Django.
Over the last couple of months I’ve been exploring how I could get funding for our project, which forced me to examine how our project, Django, and Python web development fit into the overall web development scene.
There’s this saying that “On the Internet, the winner takes all”. The JavaScript web dev ecosystem (React, Vue, etc.) is larger, more mature, and IMO it overall feels much faster to go from zero to completed project. So a frontend framework for Python, and specifically Django, feels like a niche that can have a hard time getting funded. But I’d like to continue working on django-components, and get it to v1…
So I had to ask myself - what could be a unique take that would make django-components competitive (or at least interesting) against other established frontend frameworks?
At the same time, I’m someone who always wants my packages / code to be faster and better. This even led me to offload some parts of django-components to Rust with maturin. And frankly, the ease with which one can build Rust-Python hybrid packages with maturin / PyO3 is incredible.
(There may be similar tools like maturin in other programming languages, I haven’t explored that.)
Modern JS frontend frameworks often rely on a purely client-side solution - the frontend code is pre-compiled into static HTML / JS / CSS assets. With non-JS web frameworks, on the other hand, the UI has to be rendered / prepared on the server.
But there is also server-side rendering (SSR) for JS frontend frameworks like NextJS. And maybe, if we compiled our Python code to Rust, and packaged it with maturin, it could be faster than NextJS?
With the current advances in LLMs, the conversion from Python to Rust could possibly be automated. Imagine you write the code in Python, and in CI a tool rewrites it to Rust and re-exposes the Python API as a Python package released with maturin. LLMs are NOT great at generating new knowledge, but they are good at transforming the same information into different formats, so this could work.
While the idea sounds good, I learnt that in reality SSR is only a fraction of the infra cost for large websites. I also benchmarked various scenarios, and having a Go/Rust server with AlpineJS for client-side interactivity was only about ~20-30% faster than Nuxt (Vue) and ~50% faster than Next (React). And the setup with client-side Vue or React + a REST server was surprisingly fast, even though they have to send subsequent requests to the server to fetch page data (here, data hydration with Alpine was the limiting step for the Go+Alpine setup).
(I tested “time to interactivity” - how long it takes to load the page, wait for 1000 items to load, and wait until the framework (Vue/React/Alpine) responds to me clicking on the last item.)
Overall, this approach was interesting, but it wasn’t yielding a performance boost that would justify migrating from Vue or React to Python for web development.
However, that made me think - what if, instead of focusing only on web development, we try to apply this to the Python ecosystem overall?
Proposal
Taking this idea more seriously, there are a few changes and thoughts:
- Instead of Rust, transpile Python to Go. Go and Rust are similar perf-wise, but Go is closer to Python in syntax and features - Go has garbage collection like Python does, whereas Rust’s borrow checker and lifetimes would be a pain to make work well.
- For exposing Go to Python, there’s gopy. It’s not as polished as maturin, but gets the job done for a proof of concept.
- Transpiling an organisation’s Python business logic doesn’t make much sense if that logic simply calls into slow Python libraries (e.g. Django). Rather, it’s the open source packages that should be transpiled to Go first. Only after all of a project’s dependencies have Go equivalents would it make sense for orgs to convert their own code too.
- If a package (e.g. Django) is transpiled to Go, how would the organisation’s business code access the transpiled package? We’d need a package registry like PyPI that stores the Go-transpiled copies of the packages that are otherwise available on PyPI.
- Versioning - this also means that each version published on PyPI would need a corresponding Go copy.
- Each Python project converted to Go would need to store that Go code in (publicly available?) source control (e.g. GitHub). That way, when Python package A depends on Python package B, A’s Go equivalent could point directly to B’s Go equivalent.
- This would need to be an online service / SaaS, since it would rely on LLMs for the transpilation, a database for storing the Py→Go relationships and entries, and hosting for the converted packages so they can be downloaded.
- The Py→Go conversion would be part of the package’s build step. E.g. by collecting the source distribution, we get a directory with the files that we know need to be transpiled.
- Testing - to ensure that the Go code is correct, the LLM would be prompted to 1) write tests for the initial Python code (if missing), and then 2) rewrite the same tests in Go.
- Cost - hard to say. In my proof of concept, even for a minimal project (2-3 files, ~3 small functions), the whole flow took about 3-4 minutes of LLM compute (I don’t know how many tokens that was). But obviously I didn’t try to optimize anything.
- Edge cases - packages that rely on magic behaviour (e.g. dunder methods) could be harder to transpile.
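As a concrete sketch of the build-step bullet: given an unpacked source distribution, collect the Python files that would be handed to the LLM for transpilation. This is only an illustration of the idea - the function name and the skip rules are my own assumptions, not part of any existing tool:

```python
from pathlib import Path


def collect_transpile_targets(sdist_root: str) -> list[str]:
    """Return the .py files in an unpacked sdist that should be transpiled,
    skipping packaging scripts and test suites."""
    skip_names = {"setup.py", "conftest.py"}
    return sorted(
        str(p)
        for p in Path(sdist_root).rglob("*.py")
        if p.name not in skip_names and "tests" not in p.parts
    )
```

A real version would also need to consult the package metadata (extension modules, data files, etc.), but walking the sdist gives a well-defined input set for the conversion.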
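And a sketch of the testing bullet: the same test cases get asserted against both the Python original and the Go build exposed via gopy. Since the compiled module is hypothetical, the Go side is simulated here by a stand-in Python function - in CI it would instead be something like `from mypkg_go import slugify` (name assumed):

```python
def slugify_py(title: str) -> str:
    """Reference implementation - the Python code the author wrote."""
    return "-".join(title.lower().split())


def slugify_go(title: str) -> str:
    """Stand-in for the gopy-exposed Go version; simulated here so the
    sketch is self-contained."""
    return "-".join(title.lower().split())


# The LLM-written test cases, run against BOTH implementations.
CASES = ["Hello World", "  Django   Components ", "already-slugged"]

for case in CASES:
    assert slugify_py(case) == slugify_go(case), case
```

Parity tests like this don’t prove the conversion is correct, but they catch the obvious divergences cheaply, and the suite grows with every bug found.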
The above is not an exhaustive list - I’m sure there are a lot more nuances. So I’m curious to hear others’ thoughts.
Why go through the hassle of transpiling from Python to Go? I still need to finish my proof of concept, but I hope/expect that this could lead to at least a 10x perf boost, despite the need to convert between Go and Python objects at the interface of the Go-compiled packages. And I expect that’s much better than what we can expect from gradual improvements to Python runtimes in the next few years (though I might be wrong).
Also to be clear, I’m not advocating against working on CPython and similar. I just think that in the age of LLMs, this might be an effective way of making Python code more performant.