CPython behavior regarding the __pycache__ folder on a read-only volume

akshatmittal1992 · August 4, 2023, 1:37pm

Hey Team,
I am using Python on a read-only volume, so the .pyc files are not written to the __pycache__ folder on disk. I wanted to know the behavior in this scenario, as well as in the scenario where we pass in PYTHONDONTWRITEBYTECODE=1. Are these two different, or do they behave the same way?
Does it store the .pyc file in memory? If yes, when does it flush it out?
Does CPython compile the source and generate the bytecode every time a package is imported?
Does it delete it and regenerate it every time?
What is the performance impact of not saving to the __pycache__ directory? Is it quantified anywhere, so that I can take a look? That would help us make a decision.

Thanks,

Rosuav · August 4, 2023, 1:50pm

Let’s take a step back here.

Whenever you do anything with Python source code, it’s first compiled into bytecode. You can play around with that interactively:

>>> compile("""print("Hello, world!")""", "-", "exec").co_code
b'\x97\x00\x02\x00e\x00d\x00\xab\x01\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00y\x01'

Okay, that’s not very readable, is it? Let’s ask Python to take that bytecode and disassemble it.

>>> import dis
>>> dis.dis(compile("""print("Hello, world!")""", "-", "exec"))
  0           0 RESUME                   0

  1           2 PUSH_NULL
              4 LOAD_NAME                0 (print)
              6 LOAD_CONST               0 ('Hello, world!')
              8 CALL                     1
             18 POP_TOP
             20 RETURN_CONST             1 (None)
>>>

(You can also call dis.dis() with the source code itself, and it’ll compile and then disassemble.) This shows what Python is actually doing. It’ll vary a bit from one version to another, but broadly speaking, you should be able to see that it’s looking up “print”, loading the string literal, and calling that.

The reason for the .pyc files is that this can be a bit of work. Not a HUGE amount of work, but it’s some. So once it’s been done once, Python dumps that out into a file, making it quicker next time. The file itself isn’t particularly significant, it’s just a cache of what the interpreter has built.
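
To make the "it's just a cache" point concrete, here is a minimal sketch that byte-compiles a throwaway module, reads the resulting .pyc back, and runs it. The module name demo_mod and the helper roundtrip_pyc are made up for illustration, and the 16-byte header skip assumes CPython 3.7 or later.

```python
import marshal
import py_compile
import tempfile
from pathlib import Path


def roundtrip_pyc(source: str) -> tuple[str, dict]:
    """Byte-compile `source`, then load and run the cached code object.

    Returns the .pyc file name and the namespace produced by running it.
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "demo_mod.py"  # hypothetical module name
        src.write_text(source)
        # py_compile writes the cache file and returns its path.
        pyc_path = Path(py_compile.compile(str(src), doraise=True))
        raw = pyc_path.read_bytes()

    # Assumption: CPython 3.7+, where the .pyc header is 16 bytes.
    # Everything after the header is a marshalled code object.
    code = marshal.loads(raw[16:])
    namespace: dict = {}
    exec(code, namespace)
    return pyc_path.name, namespace
```

Calling roundtrip_pyc("ANSWER = 6 * 7\n") gives back a file name ending in .pyc and a namespace containing ANSWER, without the source being compiled a second time.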

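And that brings us to the read-only file system problem. Well, actually, not much of a problem! What you were wondering is correct: the behaviour is basically the same as with PYTHONDONTWRITEBYTECODE=1. There’s no in-memory .pyc file, but there is the in-memory compiled code: each module is compiled once when a process first imports it, kept for the life of that process, and compiled again by the next process that imports it.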
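
If you want to see that for yourself, a rough sketch along these lines imports a temporary module with bytecode writing switched off (sys.dont_write_bytecode is the in-process equivalent of the environment variable) and checks whether a __pycache__ folder appeared. The helper import_without_cache and the module name ro_demo_mod are invented for this example.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_without_cache(source: str, name: str = "ro_demo_mod"):
    """Import a temporary module while bytecode writing is disabled.

    Returns the module and whether a __pycache__ directory was created.
    """
    old_flag = sys.dont_write_bytecode
    sys.dont_write_bytecode = True  # same effect as PYTHONDONTWRITEBYTECODE=1
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, f"{name}.py").write_text(source)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module(name)
            cache_created = Path(tmp, "__pycache__").exists()
        finally:
            # Undo every global change made above.
            sys.path.remove(tmp)
            sys.modules.pop(name, None)
            sys.dont_write_bytecode = old_flag
    return module, cache_created
```

The module that comes back still works after the temporary directory is gone, because its compiled code objects live in memory; nothing was ever cached on disk.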

As for the performance impact of not saving to the __pycache__ directory: great question, and very hard to figure out. It’ll slow down module imports, but that’s all. So for a long-running program (e.g. a web app), there won’t be much impact, since the imports all happen once and then that’s it; but for a quick script, where you’re dominated by startup and shutdown time, having those .pyc files can significantly reduce the overhead. You would have to measure for yourself.
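
One rough way to start measuring, as a sketch only: time compiling a piece of source from scratch against unmarshalling an already compiled copy, which is approximately the work a .pyc saves. The helper name time_compile_vs_load is made up here, and this ignores file I/O and the rest of the import machinery, so treat the numbers as a lower bound on the difference rather than a real import benchmark.

```python
import marshal
import time


def time_compile_vs_load(source: str, repeats: int = 200) -> tuple[float, float]:
    """Return (seconds spent compiling, seconds spent unmarshalling)."""
    # What a .pyc would hold, minus its header: a marshalled code object.
    cached = marshal.dumps(compile(source, "<bench>", "exec"))

    start = time.perf_counter()
    for _ in range(repeats):
        compile(source, "<bench>", "exec")  # the "no cache" path
    compile_seconds = time.perf_counter() - start

    start = time.perf_counter()
    for _ in range(repeats):
        marshal.loads(cached)  # the "cache hit" path
    load_seconds = time.perf_counter() - start

    return compile_seconds, load_seconds
```

For whole-program numbers, running your real entry point with python -X importtime, once with a populated __pycache__ and once with PYTHONDONTWRITEBYTECODE=1 on a clean tree, will tell you more than any micro-benchmark.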

Hope that’s enough info to make a reasoned decision!
