It would be better for Python to support AOT compile officially

I am very happy to hear that Python now supports a JIT (just-in-time) compilation feature!

Although it makes Python programs run much faster than before, some problems remain.

One is that we still need Python to be installed on end users’ computers. Another is that the program will still be slow at startup.

Supporting AOT compilation is the key to solving both problems above.

Moving this to Help, as it is not formulated as a proposal that will actually get constructive work towards the idea in motion. Additionally, there are already projects that compile Python to executables.


I both fail to grasp your point and fail to conceive of any way forward that doesn’t require huge refactoring of the entire CPython codebase, and numerous breaking changes for millions of users.

AOT is fantastic for those that need it, those who opt-in (e.g. Numba users, or those considering Mojo).

I’m not sold on the benefits of breaking such a well-established language, even one not as established as Python, e.g. for all those millions of users who only need to Automate The Boring Stuff.

Whereas on the other hand, instead of breaking an entire dynamic language, those of you who want an AOT static binary or ‘executable’ can simply pick any of a number of alternative tools.

That said, I’m all ears. But you will need to make an exceptional, ground-breaking case.

1 Like

Even though Python doesn’t need to be a high-speed language for those who use it for automation purposes, a significant reason Python is used today is computational science and data processing. In these fields, high performance is extremely important.

While there are alternatives like Julia that offer high-speed performance, Julia has far fewer users compared to Python. This also means there are fewer libraries available for Julia. Converting all of Python’s numerous libraries to Julia would take more time and resources than making Python as fast as Julia.

I understand that some believe it’s very difficult to support AOT compilation for Python because it’s a dynamically typed language, or that supporting AOT would require mandatory type declarations, thus negating Python’s advantage of easy coding. However, Julia, which is also a dynamically typed language, supports both JIT and AOT compilation. (If you’re curious, read this article: Julia Documentation · The Julia Language)

Julia isn’t the only example. LuaJIT, an implementation of the Lua language, supports both JIT and, despite its name, AOT compilation. (Read Quick, JIT — Run! on https://staff.fnwi.uva.nl/h.vandermeer/docs/lua/luajit/luajit_intro.html.) Considering all this, it’s clear that AOT compilation support for Python is not technically impossible.

Even though Python has third-party implementations (like Numba), compatibility is usually guaranteed only for specific libraries. For instance, Numba primarily guarantees compatibility with NumPy, but not with other libraries. This is fundamentally because it’s a third-party implementation and libraries are not created to be compatible with it.

If Python, more precisely CPython, is made fast enough that third-party implementations are unnecessary, this issue could be resolved.

Python has already undergone a major revision from version 2 to 3. And those who used version 2 are now using version 3. People might need to transition to version 4 if that’s what it takes, and they would likely do so. Even those using Python for automation purposes wouldn’t prefer a slower version.

1 Like

This thread is essentially a duplicate of this one: AOT instead of JIT

And you are addressing none of the counterpoints given to you before. Especially see this post in the other thread:

Go and write a Python compiler then, and let us know how you get on.

Take a look at mypyc.

1 Like

I’m the author of this thread: AOT instead of JIT
Here are some points to consider based on the experience with the last discussion:

  • AOT vs JIT is not purely a technical decision. It also has a lot to do with what users perceive as programming style and deployment practices.

  • The AOT world may not be familiar to many Python users, especially those active in this community, since many have backgrounds in statistics and other scientific fields, not software engineering or computer systems.

  • Many users are unaware that modern AOT technologies have advanced significantly, to the point where they can provide the same advantages as dynamic typing with a REPL (read-eval-print loop).

  • There’s nothing wrong with CPython for its intended historical purposes. It’s a simple VM for translating Python scripts, and recent improvements have made it even better at that. However, any real changes would require a completely different implementation.

  • Python is maintained by volunteers who may not have the time and resources for radical changes. You might encounter responses like “Go write one” or “Submit an official PEP” when trying to discuss the topic, which can be frustrating, but that’s how things are currently handled.

  • Many with serious backgrounds in computing use Python for lightweight tasks or as a prototyping language. Once the idea is clear enough, the production implementation gets done in other languages like C++ or Rust.

  • For heavy workloads, consider Rust. It integrates well with Python, has a high-level syntax and functions (very functional by design), and a rapidly growing ecosystem of libraries with significant investments, especially for massive data processing (Polars, Apache DataFusion, etc.).

  • There are attempts like Mojo to create a language similar to Python with interop with CPython. It incorporates features from Rust and Swift to combine usability with performance and type safety, but it’s still under development.

I hope these points are helpful regarding this ‘duplicate’ discussion.

3 Likes

George, unlike OP you’ve not only done your homework, you show considerable expertise too.

You might encounter responses like “Go write one” … which can be frustrating,

Yes. A small fraction of the amount of frustration it’s possible to experience working with modern AOT compilers. Let alone writing one.

1 Like

I would never, under any circumstances, call a discussion a “rant”, especially if it is a respectful discussion.

Unfortunately, in recent years Python has attracted many people who have little or no ability to conduct discussions well, and this is the main source of friction and frustration you might encounter.

No one called the discussion as a whole a rant. One user called their own (well received) comment a rant.

The main source of friction in Ideas is poorly thought out proposals and people who can’t accept that their ideas aren’t good/actionable/implementable/etc.

2 Likes

I think a packaging “compiler” (one that bundles the interpreter into an executable) is a different sort of story from an actual compiler.

I also think that when a JIT compiler generates machine code, it could be pickled so that the next time the “JIT-decorated” code is run, the machine code can be unpickled instead of recompiled.
But, for example, Numba provides a JIT that works with the GIL, and an AOT mode that requires the compiled functions to work without it (in the Numba way). → so it is not “so” easy to pickle machine code.
(I might be inaccurate on some points)

Could anyone kindly enlighten us about the current and planned state of picklability of the JIT machine code?

That would imply that you can unpickle machine code from a file and execute it. Is that what you want? Remember that pickle files can be hand-crafted as needed.

Yes, but I think (from the little I did read) that the JIT is enabled as an option external to the Python code, so it is not obvious how it should be done. I don’t think there is a Python object enclosing the machine code… or maybe you pickle some jitted function… I don’t think so either.
Probably the machine code would be contained in a file, just like .pyc files contain bytecode, and would not be recompiled if the file already exists for a script. I think that would be the best and simplest way. I am just fairly curious about it.
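For what it’s worth, CPython already does something very similar for bytecode: compiled modules are cached under `__pycache__` and reused when the source is unchanged. A minimal look at where that cache lives (the interpreter tag in the filename will vary):

```python
import importlib.util

# CPython maps a source file to its cached bytecode path; the .pyc is only
# regenerated when the source is newer (or its hash changes, see PEP 552).
cached = importlib.util.cache_from_source("script.py")
print(cached)  # e.g. "__pycache__/script.cpython-312.pyc" (tag varies)
```

A machine-code cache would presumably need the same kind of per-interpreter tagging, since JIT output is tied to one CPython build.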

Don’t even think about that part. Just think about this part: As a normal part of calling a Python function, you could load native executable code from a file on disk, and run it. A file which could be damaged, or maliciously formed.

Is there a particular risk with regard to Python users doing that, as opposed to running any binary, or indeed a compiled Python library?

1 Like

From what I recall, for .pyc files the .py source is hashed to validate the .pyc, so any modification to the .py triggers a recompile; this acts as a security check of sorts.
It would make perfect sense for the JIT to follow the same logic.
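That hash check is real and inspectable. A short sketch of the PEP 552 mechanism: the .pyc header carries a flags word, and bit 0 marks the file as validated by a source hash rather than a timestamp:

```python
import pathlib
import py_compile
import tempfile

# Compile a throwaway module with hash-based invalidation (PEP 552).
src = pathlib.Path(tempfile.mkdtemp()) / "mod.py"
src.write_text("x = 1\n")
pyc = py_compile.compile(
    str(src), invalidation_mode=py_compile.PycInvalidationMode.CHECKED_HASH
)

# Header layout: magic (bytes 0-3), flags (bytes 4-7), then the source hash.
flags = int.from_bytes(pathlib.Path(pyc).read_bytes()[4:8], "little")
print(flags & 1)  # 1 -> this .pyc is validated against a hash of the source
```

Note this validates that the cached file matches the source; it is an integrity check, not protection against a deliberately crafted cache file.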

I don’t think the machine-code file is meant to be transferred across computers. Yet if there were a way to do so, you could not inspect its contents.

With those, it’s very clear that something is being imported from a binary file, which means it triggers audit checks and such. If JIT-compiled functions can be saved, any call to any function could run native code. I don’t think that’s a good thing.

That’s another reason to not save it to the pyc IMO.

Unless you disable ctypes, this is already true, although it requires a bit of extra effort to achieve.
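To illustrate: pure Python can already call straight into native machine code via ctypes, with no binary import involved. A small CPython-specific sketch, calling a function from the interpreter’s own C API:

```python
import ctypes

# ctypes.pythonapi exposes CPython's C API as a loaded shared library
# (CPython-specific); calling into it executes native machine code.
ctypes.pythonapi.Py_GetVersion.restype = ctypes.c_char_p
version = ctypes.pythonapi.Py_GetVersion().decode()
print(version.split()[0])  # e.g. "3.12.4"
```

The same technique reaches any function in any loadable shared library, so native-code execution is already one attribute lookup away.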

The point is that unpickling can run arbitrary Python code, and arbitrary Python code is going to find a way to run arbitrary machine code.
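A minimal demonstration of that point, using a benign callable (`eval` here is a stand-in; a hostile pickle could name `os.system` just as easily):

```python
import pickle

class Payload:
    def __reduce__(self):
        # pickle records "call eval('40 + 2')" as the reconstruction recipe,
        # so loading this pickle executes code of the pickler's choosing.
        return (eval, ("40 + 2",))

result = pickle.loads(pickle.dumps(Payload()))
print(result)  # 42
```

This is why the pickle documentation warns against loading data from untrusted sources, independent of any machine-code caching question.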

So I don’t think saving machine code increases this risk vector. But I don’t believe there is any value to it currently, and I softly doubt there will be for quite some time.

Ah OK. Cheers. That’s a concern for sharing AOT compiled Python functions, similar to how Numba does it currently. The Python run-time is quite a privileged environment.

I was thinking of more traditionally compiled Python code, resulting in actual binaries. I’ve been meaning to have a play with Codon.

My concern was actually primarily about the possibility of JIT-compiling functions/classes/modules/whatever… then storing the machine code locally to avoid the JIT compilation overhead at subsequent import/run/instantiation/whatever…