Python source code to dynamic-link library

I’m a Python developer. I’m trying to wrap the entire Python source code into a dynamic-link C library, so that I can run Python scripts for any project on any Linux server without actually installing a Python interpreter. The idea, if it is possible, is to use gcc to build and run a .c file converted from a Python start-engine script, while linking against the dynamic library built from the Python source code, which would serve as the Python interpreter. Is this idea feasible?

Can someone please help me with this question? Thank you so much!

What is it you are trying to achieve? As far as I know, any Linux distro comes with a Python interpreter. And if it doesn’t, it is usually only two commands away.

In the previous paragraph you say you want to run without an interpreter, and here you say you want to create an interpreter. How is that different from just using a pre-packaged one? Also, this means you will have one interpreter per program, which will eat a lot of disk space and may lead to incompatibilities.

If you are looking for a compiled version of Python programs, look at PyPy. I have not used it myself, but it has made a good name for itself.

Whether your idea is feasible, I don’t know.

Hi Menno,

Thank you very much for your reply. I think I may have to clarify what I want to achieve.

Think about it: you have just developed a Python project consisting of a bunch of .py scripts. You can run it on your own Linux server, which, of course, has a Python environment.

But then you have to deploy this Python project onto another Linux server, which may have no Python environment, or a Python environment with a version mismatch, or a Python environment missing some third-party packages your project relies on (numpy, pandas, tornado, tensorflow…). Setting up the complete environment you need can be very troublesome. Even once you fix one Linux server, someday you may get another one to deploy to, and you have to perform the same procedure again, again and again…

I wasn’t saying I want to create an interpreter. My goal is to wrap the Python interpreter and all frequently used packages (numpy, pandas, tensorflow…) into one big dynamic library (.so) on Linux, built on my own Linux server, which of course has a Python interpreter environment. I’ll also convert the .py scripts of my project to .c or .so using the cython and gcc commands on my own Linux server.

That way, once this big dynamic library is generated, I can copy the file to any other Linux server. Then, after I copy my project to that server (already as .c and .so files), I can run it by linking against the big dynamic library above. I don’t have to use a Python interpreter, and I don’t have to install any third-party packages like numpy or tensorflow.
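To make it concrete, the start-engine .c file would be little more than an embedding skeleton linked against that big .so. A minimal sketch of what I have in mind (the entry script name main.py and the build command are assumptions on my part):

    /* launcher.c - sketch of the start-engine idea: embed the interpreter
     * from the shared library instead of invoking a system python binary.
     * Assumed build command:
     *   gcc launcher.c $(python3-config --cflags --ldflags --embed)
     */
    #include <Python.h>
    #include <stdio.h>

    int main(void)
    {
        Py_Initialize();                      /* start the embedded interpreter */

        FILE *fp = fopen("main.py", "r");     /* hypothetical entry script */
        if (fp == NULL) {
            fprintf(stderr, "cannot open main.py\n");
            Py_Finalize();
            return 1;
        }
        PyRun_SimpleFile(fp, "main.py");      /* execute the script as __main__ */
        fclose(fp);

        return Py_FinalizeEx() < 0 ? 1 : 0;   /* shut the interpreter down */
    }

Is this possible?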

I now understand what you are trying to do, thank you.

You underestimate what you have to do to keep the software up to date

If you are making this big shared object with the library, you are responsible for keeping it up to date. If a security hole is discovered in the standard library, you need to deliver one or more new versions for all installed instances.

If there is a security hole in the C compiler you used, you are responsible for creating and distributing versions of your software compiled with the fixed compiler.

People will want to use the features of new versions of Python. For example, Python 3.10 added structural pattern matching, which is very useful. People (including you) will want to use that in their software. So you have to create new packages at least every two years, on top of the security fixes.

You have to write one or two compilers

To go from Python to C or to a .so, you have to get a compiler. You also have to create the shared object with the standard library, which is bilingual (Python and C extensions). There are compilers that translate Python to C (Cython, which you mentioned, is one), but these are not widely used.
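And even then, the compiler does not remove the dependency. Roughly speaking (this is an illustration, not actual Cython output), a compiled expression like a + b still ends up as a call into the interpreter’s C API, so every program keeps needing the big shared object:

    /* Illustration only, not real generated code: "compiled" Python
     * still drives the interpreter through its C API. */
    #include <Python.h>

    PyObject *add_objects(PyObject *a, PyObject *b)
    {
        return PyNumber_Add(a, b);   /* the same dispatch the interpreter does for a + b */
    }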

Shared objects are version dependent

There is a chance that shared objects needed to make your programs run will change. For example, a while ago Apple deprecated TLS 1.1 in the version of OpenSSL they deliver. The software you make will have to be adjusted for these things, whereas when you use the “packaged stuff”, other people (the distribution, the Python core developers) make sure things keep working or show meaningful error messages.
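And the coupling to a specific version is tight: the soname of the interpreter library encodes its version, so a server that only has a different version cannot load it at all (the library name below is an assumption; link with -ldl on older glibc):

    /* Sketch: dlopen() pins an exact soname; a machine with only, say,
     * libpython3.11 cannot satisfy this request. */
    #include <dlfcn.h>
    #include <stdio.h>

    int main(void)
    {
        void *handle = dlopen("libpython3.10.so.1.0", RTLD_NOW);
        if (handle == NULL) {
            fprintf(stderr, "cannot load interpreter: %s\n", dlerror());
            return 1;
        }
        dlclose(handle);
        return 0;
    }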

You overestimate what you will gain

And what will you gain? You copy a few gigabytes, and the first thing you will get is complaints that your version does not match the version installed elsewhere, so a trained TensorFlow model is not usable outside of your sandbox. I see no advantage over a well-made requirements file and/or a virtual environment.

It may be possible, but not preferred

It may be possible, but I would not prefer it over maintained, well-known software that I can get community support for.