How can I generate my own .pyc?

Hi, everyone!
I want to run my own demo language on the Python VM, so I think I can translate a demo-language program to a .pyc file and then execute it with python xxx.pyc. But I don’t know how to generate my own .pyc. Can you help me, or is there any information I can refer to?
Thanks for your help!

Hi, I think you can compile your Python code with Cython (though that produces C extension modules rather than .pyc files).

Use py_compile, or compileall if you have more of them.
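For reference, a minimal sketch of both modules. The file name hello.py and the temporary directory are placeholders chosen for illustration, not anything from this thread:

```python
import compileall
import pathlib
import py_compile
import tempfile

# Work in a throwaway directory so the sketch leaves nothing behind.
workdir = pathlib.Path(tempfile.mkdtemp())
source = workdir / "hello.py"          # hypothetical example file
source.write_text("print('hello from hello.py')\n")

# Compile a single file. cfile picks the output path; without it the
# result goes into a __pycache__ directory next to the source.
pyc_path = py_compile.compile(
    str(source), cfile=str(workdir / "hello.pyc"), doraise=True
)

# Compile every .py file under a directory tree.
ok = compileall.compile_dir(str(workdir), quiet=1)
```

Note that both only accept Python source, so they do not help directly with a different input language.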

Sorry, I mean that I want to create a language that runs on the Python VM. Just as Scala compiles its programs to Java bytecode, I want to create a tool that translates my new language to Python bytecode, but I don’t know how to do it. py_compile cannot compile my new language.
Thank you!

Look at what Coconut does:

or these:

http://www.dalkescientific.com/writings/diary/archive/2007/06/01/lolpython.html

Thank you!
Coconut is useful. I think I can learn the bytecode rules to create files for my own language.

I recently learnt of RPython. I think it’s PyPy’s project to help anyone build a language interpreter using Python. It’s worth checking out if you really wanna pursue this journey.

AFAIK, the best way is to generate Python source code, and then use py_compile on it. It seems that’s what Coconut & Like Python do.
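A rough sketch of that approach, assuming a made-up toy language with only two statements (let and show). The translate function and the toy syntax are invented for illustration and are not how Coconut itself works internally:

```python
import pathlib
import py_compile
import tempfile


def translate(demo_source: str) -> str:
    """Translate the hypothetical toy language into Python source."""
    out = []
    for line in demo_source.splitlines():
        line = line.strip()
        if not line:
            continue
        keyword, _, rest = line.partition(" ")
        if keyword == "let":       # let x = 1 + 2   ->  x = 1 + 2
            out.append(rest)
        elif keyword == "show":    # show x * 10     ->  print(x * 10)
            out.append(f"print({rest})")
        else:
            raise SyntaxError(f"unknown statement: {line!r}")
    return "\n".join(out) + "\n"


demo_program = "let x = 1 + 2\nshow x * 10\n"
python_source = translate(demo_program)

# Write the generated Python source and let py_compile produce the .pyc.
workdir = pathlib.Path(tempfile.mkdtemp())
py_file = workdir / "demo.py"
py_file.write_text(python_source)
pyc_file = py_compile.compile(
    str(py_file), cfile=str(workdir / "demo.pyc"), doraise=True
)
```

The resulting demo.pyc can then be run with python demo.pyc, and the translator never has to know anything about the bytecode format.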

Compiling directly to bytecode (.pyc) would be hard. Unlike Java, CPython’s bytecode is not stable: it can change in every minor release (3.8, 3.9, 3.10, …). Sometimes the changes are small, but sometimes the format is changed entirely (like in 3.6).
So if you want to generate CPython bytecode (.pyc), expect that you’ll need to target a specific version, or generate multiple versions of bytecode – and add new code every once in a while to keep up.
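To see which bytecode version a given interpreter expects, one option is to inspect its magic number. This is only a small sketch; the printed values depend on the Python version you run it with:

```python
import importlib.util
import sys

# The magic number identifies the bytecode format of this interpreter.
# It is 4 bytes: a 2-byte little-endian number followed by b"\r\n".
magic = importlib.util.MAGIC_NUMBER
magic_int = int.from_bytes(magic[:2], "little")

print(sys.version_info[:2], magic_int, magic[2:])
```

A .pyc whose magic number does not match the running interpreter is rejected, which is why generated bytecode has to target a specific version.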

Maybe related:

Thank you!
I just want to try this in 3.7, and I found a library, “bytecode”, that can help me generate a .pyc. It can build a code object from bytecode, and I just need to add the pyc file’s header (like magic, moddate…); then I get a .pyc that runs. It’s an experiment, not for a real application.

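The pyc header is undocumented, and can change in future versions of Python. But currently (since PEP 552 in Python 3.7), the most common format is these four 4-byte little-endian values:

- the magic number (importlib.util.MAGIC_NUMBER)
- flags (zero for a timestamp-based pyc)
- the modification time of the source file
- the size of the source file
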
If there’s no source file to check against, the last two can be anything.
You can convert an int to bytes with number.to_bytes(4, 'little').
The code, serialized with marshal, starts right after the header.
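Putting the pieces together, here is a minimal sketch that assumes Python 3.7 or later and the timestamp-based header layout from PEP 552 (magic number, flags, source mtime, source size). The built-in compile() stands in for whatever generates your code object, such as the “bytecode” library mentioned above; the file name demo.pyc is a placeholder:

```python
import importlib.util
import marshal
import pathlib
import subprocess
import sys
import tempfile

# Any code object will do; compile() is only a stand-in for a custom
# code generator here.
code = compile("print('hello from a hand-built pyc')", "<demo>", "exec")

header = b"".join([
    importlib.util.MAGIC_NUMBER,   # magic number of the running interpreter
    (0).to_bytes(4, "little"),     # flags: 0 means timestamp-based pyc
    (0).to_bytes(4, "little"),     # source mtime (no source file, so arbitrary)
    (0).to_bytes(4, "little"),     # source size (no source file, so arbitrary)
])

# The marshalled code object follows directly after the 16-byte header.
pyc_path = pathlib.Path(tempfile.mkdtemp()) / "demo.pyc"
pyc_path.write_bytes(header + marshal.dumps(code))

# Run the file with the same interpreter that produced it.
result = subprocess.run(
    [sys.executable, str(pyc_path)], capture_output=True, text=True
)
print(result.stdout)
```

Because both the magic number and the marshal format are tied to the interpreter version, the file should be run with the same Python that wrote it.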