I have about 300 independently running Python scripts, some resident and some scheduled. I want to implement a cache of shared imports. In my tests, a script that imports pymongo uses about 6 MB more memory. With 100 scripts importing pymongo and running at the same time, that adds up to 600 MB. I would like an import cache that the scripts load from first, so that they share one copy of pymongo and save memory, but I don't know how to achieve it.
Do the different scripts 'talk' to one another or share memory/data? You may be able to use something like threading.Thread to combine some of them and have them share the same memory space.
(In other words, run some scripts in different threads of the same starting process.)
Though note that the Global Interpreter Lock (GIL) may lower the degree of parallelism of multiple threads at once.
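A minimal sketch of the single-process idea, assuming each script can be wrapped in a function. The names script_a, script_b, and run_all are hypothetical, and json stands in for a heavier dependency such as pymongo. Both threads receive the same module object from sys.modules, so the import cost is paid once per process:

```python
import sys
import threading


def script_a(results):
    # Stand-in for the body of one script.
    import json
    results["a"] = id(json)


def script_b(results):
    # Stand-in for another script that needs the same dependency.
    import json
    results["b"] = id(json)


def run_all(entry_points):
    """Run each entry point in its own thread and collect what they report."""
    results = {}
    threads = [threading.Thread(target=fn, args=(results,)) for fn in entry_points]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


results = run_all([script_a, script_b])
# Both threads saw the single module object cached in sys.modules.
same_module = results["a"] == results["b"] == id(sys.modules["json"])
print(same_module)
```

The trade-off is that the scripts then share one interpreter, so a crash or a blocking call in one of them can affect the others.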
The different scripts do not 'talk' to one another or share memory/data.
I have considered the approach you mentioned, but it would require too many changes on my side, so I have to give it up.
Python processes can share .dll or .so memory but cannot share the memory used for python code. Only by using a single process can you avoid this issue.
Even forking from a parent will not work as there are ref counts on code objects that force copy-on-write in the child processes.
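To illustrate the ref-count point, here is a small CPython-specific sketch (the function name handler is made up). Merely taking new references to a code object changes the count stored in the object itself, which is a write to the memory page that holds it, and after a fork such a write is what triggers copy-on-write:

```python
import sys


def handler():
    return "ok"


code = handler.__code__
before = sys.getrefcount(code)
aliases = [code for _ in range(10)]  # ten new references, no copies of the object
after = sys.getrefcount(code)
added = after - before
print(added)
```

Nothing about the code object's contents changed, yet its header was written to ten times.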
For a 600 MiB cost I would just pay it, assuming you are not on a small embedded system.
For the commercial Python app I work on, the lack of sharing costs us hundreds of GiB per server…
Is there no other solution, such as shared memory? My device has very little memory and my resources are limited, so I am trying to optimize; I cannot add more memory.
Python is not designed to share memory of its python code as you appear to need.
If you are using Linux, you may find that you have overestimated the memory needed.
The code of the Python interpreter itself is loaded into memory only once.
You pay only for loading the Python code and the objects that code needs.
Have you accounted for shared code, as counted by PSS?
The SHR column shows a significantly smaller memory footprint:
  PID USER  PR NI   VIRT   RES  SHR S %CPU %MEM   TIME+ COMMAND
13691 root  20  0 254412 20300 5684 S  0.0  0.1 0:00.53 python test.py
13482 root  20  0 254408 20296 5684 S  0.0  0.1 0:00.53 python test.py
test.py:
import sys
sys.modules
from memory_profiler import profile

@profile
def enter():
    # !/usr/bin/env python
    # encoding: utf-8
    """
    @desc: asset ingestion
    """
    import os
    import sys
    import gevent
    import pymongo
    print(pymongo.__file__)
    print(pymongo.__version__)
    import time
    time.sleep(600)

if __name__ == '__main__':
    enter()
profile:
Line #    Mem usage    Increment   Line Contents
================================================
    35     12.7 MiB     12.7 MiB   @profile
    36                             def enter():
    43     12.7 MiB      0.0 MiB       import os
    44     12.7 MiB      0.0 MiB       import sys
    45     13.9 MiB      1.2 MiB       import gevent
    46     19.8 MiB      5.9 MiB       import pymongo
    47     19.8 MiB      0.0 MiB       print(pymongo.__file__)
    48     19.8 MiB      0.0 MiB       print(pymongo.__version__)
    49     19.8 MiB      0.0 MiB       import time
    50     19.8 MiB      0.0 MiB       time.sleep(600)
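For a dependency-free cross-check of a measurement like this, one option is the standard-library tracemalloc module (Python 3.4+, so not the 2.7 interpreter used here). This is only a sketch: measure_import is a made-up helper, xml.dom.minidom stands in for pymongo, and the figure covers allocations made through Python's allocator rather than the process RSS that memory_profiler reports, so the two numbers are not expected to match.

```python
import importlib
import tracemalloc


def measure_import(module_name):
    """Return bytes of Python-level allocations made while importing module_name."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    importlib.import_module(module_name)
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


allocated = measure_import("xml.dom.minidom")  # stand-in for pymongo
print("%.1f KiB" % (allocated / 1024.0))
```

Because shared libraries mapped by C extensions are not counted, this number is closer to the per-process part that cannot be shared between interpreters.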
You can find PSS in the /proc/<pid>/smaps file.
The "proportional set size" (PSS) of a process is the count of pages it has in memory, where each page is divided by the number of processes sharing it. So if a process has 1000 pages all to itself, and 1000 shared with one other process, its PSS will be 1500.
SHR in the example above will be 2000.
If you add all the SHR together you overestimate the memory you are using.
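The arithmetic behind that definition can be sketched as follows. The helper names pss_pages and resident_pages are made up for illustration; the numbers are the ones from the example (1000 private pages, 1000 pages shared by two processes).

```python
def pss_pages(private_pages, shared_groups):
    """shared_groups is a list of (page_count, number_of_processes_sharing)."""
    total = float(private_pages)
    for pages, sharers in shared_groups:
        total += pages / float(sharers)
    return total


def resident_pages(private_pages, shared_groups):
    """Count every resident page in full, shared or not."""
    return private_pages + sum(pages for pages, _ in shared_groups)


groups = [(1000, 2)]
one_pss = pss_pages(1000, groups)
one_resident = resident_pages(1000, groups)

# Two such processes: 2 x 1000 private pages plus one shared set of 1000.
pages_really_in_memory = 2 * 1000 + 1000
summed_pss = 2 * one_pss
summed_resident = 2 * one_resident
print(one_pss, one_resident, summed_pss, summed_resident)
```

Summing PSS over both processes gives the 3000 pages that are actually in memory, while summing the full resident counts gives 4000.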
Also compare to the information that the free command reports.
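A minimal sketch for getting per-process totals to compare, assuming the usual "Field: value kB" layout of smaps. smaps_totals is a made-up helper and SAMPLE is invented data used only to show the format; on Linux the same function can be pointed at /proc/<pid>/smaps for each script and the Pss totals added up.

```python
import os
import re

# Matches only the plain "Rss:" and "Pss:" lines, not variants such as "Pss_Anon:".
FIELD_RE = re.compile(r"^(Rss|Pss):\s+(\d+)\s+kB$", re.MULTILINE)


def smaps_totals(smaps_text):
    """Sum the Rss and Pss fields (in kB) over every mapping in smaps_text."""
    totals = {"Rss": 0, "Pss": 0}
    for field, value in FIELD_RE.findall(smaps_text):
        totals[field] += int(value)
    return totals


# Two invented mappings in the smaps field format, for illustration only.
SAMPLE = """\
7f0000000000-7f0000100000 r-xp 00000000 08:03 1 /usr/lib64/libexample.so
Size:               1024 kB
Rss:                 600 kB
Pss:                 300 kB
00946000-00e09000 rw-p 00000000 00:00 0 [heap]
Size:               4876 kB
Rss:                4700 kB
Pss:                4700 kB
"""

sample_totals = smaps_totals(SAMPLE)
print(sample_totals)

# On Linux the same function works on a live process, here the current one.
if os.path.exists("/proc/self/smaps"):
    with open("/proc/self/smaps") as fh:
        print(smaps_totals(fh.read()))
```

Keying on field names rather than line positions keeps the parser working when the kernel adds or omits fields.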
Yes, I tried smaps and there is no shared memory; you can try it yourself:
~ $ ps auxf | grep test.py
root 5903 1.0 0.1 254408 20288 pts/0 S+ 09:46 0:00 | \_ python test.py
root 5336 0.9 0.1 254408 20296 pts/6 S+ 09:46 0:00 | \_ python test.py
root 14094 0.0 0.0 112808 964 pts/2 S+ 09:47 0:00 \_ grep --color test.py
~ $ sfd_mem_smaps.py 5336
[273] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e5728000-7f63e5928000 ---p 00008000 08:03 657817 /usr/lib64/python2.7/lib-dynload/arraymodule.so
[272] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e59f2000-7f63e5bf1000 ---p 00007000 08:02 281327 /usr/share/Python-2.7/lib/site-packages/pymongo/_cmessage.so
[271] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e5c77000-7f63e5e76000 ---p 00004000 08:03 657433 /usr/lib64/python2.7/lib-dynload/zlibmodule.so
[270] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e5f0f000-7f63e610e000 ---p 00096000 08:03 657213 /usr/lib64/python2.7/lib-dynload/unicodedata.so
[269] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e61b0000-7f63e63af000 ---p 0000e000 08:03 779196 /usr/lib64/python2.7/site-packages/bson/_cbson.so
[268] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e6434000-7f63e6633000 ---p 00004000 08:03 607024 /usr/lib64/libuuid.so.1.3.0
[267] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e6638000-7f63e6837000 ---p 00003000 08:03 657684 /usr/lib64/python2.7/lib-dynload/_randommodule.so
[266] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e683d000-7f63e6a3c000 ---p 00004000 08:03 657496 /usr/lib64/python2.7/lib-dynload/_hashlib.so
[265] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e6a46000-7f63e6c45000 ---p 00007000 08:03 657122 /usr/lib64/python2.7/lib-dynload/math.so
[264] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e6ca4000-7f63e6ea3000 ---p 0001c000 08:03 657054 /usr/lib64/python2.7/lib-dynload/_io.so
[263] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e6eb2000-7f63e70b1000 ---p 00004000 08:03 657646 /usr/lib64/python2.7/lib-dynload/_localemodule.so
[262] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e70bd000-7f63e72bd000 ---p 0000a000 08:02 319827 /usr/share/Python-2.7/lib/site-packages/gevent/__ident.so
[261] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e72fd000-7f63e74fc000 ---p 0003e000 08:02 319813 /usr/share/Python-2.7/lib/site-packages/gevent/_greenlet.so
[260] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e7522000-7f63e7722000 ---p 0001d000 08:02 319828 /usr/share/Python-2.7/lib/site-packages/gevent/__hub_primitives.so
[259] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e7739000-7f63e7938000 ---p 00013000 08:02 319835 /usr/share/Python-2.7/lib/site-packages/gevent/__waiter.so
[258] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e7947000-7f63e7b46000 ---p 0000c000 08:02 319756 /usr/share/Python-2.7/lib/site-packages/gevent/__greenlet_primitives.so
[257] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e7b4e000-7f63e7d4e000 ---p 00006000 08:02 280989 /usr/share/Python-2.7/lib/site-packages/greenlet.so
[256] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e7d5b000-7f63e7f5a000 ---p 0000c000 08:02 319737 /usr/share/Python-2.7/lib/site-packages/gevent/__hub_local.so
[255] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e7fa7000-7f63e81a7000 ---p 0004b000 08:02 319797 /usr/share/Python-2.7/lib/site-packages/gevent/libev/corecext.so
[254] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e81c1000-7f63e83c1000 ---p 00012000 08:03 657820 /usr/lib64/python2.7/lib-dynload/cPickle.so
[253] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e83c9000-7f63e85c8000 ---p 00006000 08:03 657060 /usr/lib64/python2.7/lib-dynload/_multiprocessing.so
[252] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e85cd000-7f63e87cc000 ---p 00003000 08:02 319961 /usr/share/Python-2.7/lib/site-packages/psutil/_psutil_posix.so
[251] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e87d3000-7f63e89d2000 ---p 00005000 08:02 319963 /usr/share/Python-2.7/lib/site-packages/psutil/_psutil_linux.so
[250] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e89d6000-7f63e8bd5000 ---p 00002000 08:03 657088 /usr/lib64/python2.7/lib-dynload/grpmodule.so
[249] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e8c37000-7f63e8e37000 ---p 00060000 08:03 606957 /usr/lib64/libpcre.so.1.2.0
[248] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e8e5d000-7f63e905c000 ---p 00024000 08:03 606966 /usr/lib64/libselinux.so.1
[247] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e9076000-7f63e9276000 ---p 00016000 08:03 611360 /usr/lib64/libresolv-2.17.so
[246] 8.0 (B) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e9278000-7f63e927a000 rw-p 00000000 00:00 0
[245] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e927d000-7f63e947c000 ---p 00003000 08:03 607714 /usr/lib64/libkeyutils.so.1.5
[244] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e948b000-7f63e968b000 ---p 0000d000 08:03 608070 /usr/lib64/libkrb5support.so.0.1
[243] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e96a2000-7f63e98a1000 ---p 00015000 08:03 611140 /usr/lib64/libz.so.1.2.7
[242] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63e98bc000-7f63e9abb000 ---p 00019000 08:03 608058 /usr/lib64/libk5crypto.so.3.1
[241] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63f01a5000-7f63f03a4000 ---p 00003000 08:03 606989 /usr/lib64/libcom_err.so.2.1
[240] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63f047f000-7f63f067e000 ---p 000d9000 08:03 608068 /usr/lib64/libkrb5.so.3.3
[239] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63f06d9000-7f63f08d9000 ---p 0004a000 08:03 608054 /usr/lib64/libgssapi_krb5.so.2.2
[238] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63f0b15000-7f63f0d14000 ---p 00239000 08:03 612610 /usr/lib64/libcrypto.so.1.0.2t
[237] 2.0 (M) 0.0 (B) 0.0 (B) 0.0(B) 0.0(B) 0.0 (B) :7f63f0da8000-7f63f0fa8000 ---p 00067000 08:03 611348 /usr/lib64/libssl.so.1.0.2t
...
[ 16] 620.0(B) 412.0(B) 412.0(B) 0.0(B) 0.0(B) 0.0 (B) :7f63f45f5000-7f63f4690000 r-xp 000e3000 08:03 612969 /usr/lib64/libpython2.7.so.1.0
[ 15] 512.0(B) 512.0(B) 0.0 (B) 0.0(B) 0.0(B) 512.0(B) :7f63e5bf3000-7f63e5c73000 rw-p 00000000 00:00 0
[ 14] 512.0(B) 512.0(B) 0.0 (B) 0.0(B) 0.0(B) 512.0(B) :7f63e6122000-7f63e61a2000 rw-p 00000000 00:00 0
[ 13] 512.0(B) 512.0(B) 0.0 (B) 0.0(B) 0.0(B) 512.0(B) :7f63e63b0000-7f63e6430000 rw-p 00000000 00:00 0
[ 12] 512.0(B) 512.0(B) 0.0 (B) 0.0(B) 0.0(B) 512.0(B) :7f63f2c3c000-7f63f2cbc000 rw-p 00000000 00:00 0
[ 11] 1.8 (M) 660.0(B) 660.0(B) 0.0(B) 0.0(B) 0.0 (B) :7f63f381f000-7f63f39e2000 r-xp 00000000 08:03 606792 /usr/lib64/libc-2.17.so
[ 10] 772.0(B) 668.0(B) 668.0(B) 0.0(B) 0.0(B) 0.0 (B) :7f63f2cbc000-7f63f2d7d000 r-xp 00000000 08:03 696416 /usr/lib64/python2.7/pytransform_bootstrap/pytransform/_pytransform.so
[ 9] 2.2 (M) 708.0(B) 708.0(B) 0.0(B) 0.0(B) 0.0 (B) :7f63f08dc000-7f63f0b15000 r-xp 00000000 08:03 612610 /usr/lib64/libcrypto.so.1.0.2t
[ 8] 768.0(B) 768.0(B) 0.0 (B) 0.0(B) 0.0(B) 768.0(B) :7f63e592b000-7f63e59eb000 rw-p 00000000 00:00 0
[ 7] 768.0(B) 768.0(B) 0.0 (B) 0.0(B) 0.0(B) 768.0(B) :7f63f493b000-7f63f49fb000 rw-p 00000000 00:00 0
[ 6] 788.0(B) 788.0(B) 0.0 (B) 0.0(B) 0.0(B) 788.0(B) :7f63f4a2c000-7f63f4af1000 rw-p 00000000 00:00 0
[ 5] 904.0(B) 796.0(B) 796.0(B) 0.0(B) 0.0(B) 0.0 (B) :7f63f4512000-7f63f45f4000 r-xp 00000000 08:03 612969 /usr/lib64/libpython2.7.so.1.0
[ 4] 1.0 (M) 1.0 (M) 0.0 (B) 0.0(B) 0.0(B) 1.0 (M) :7f63f2935000-7f63f2a36000 rw-p 00000000 00:00 0
[ 3] 1.5 (M) 1.5 (M) 0.0 (B) 0.0(B) 0.0(B) 1.5 (M) :7f63f0022000-7f63f01a2000 rw-p 00000000 00:00 0
[ 2] 4.8 (M) 4.6 (M) 0.0 (B) 0.0(B) 0.0(B) 4.6 (M) :00946000-00e09000 rw-p 00000000 00:00 0 [heap]
[ 1] Vss Rss Shared_Clean Shared_Dirty Private_Clean Private_Dirty :addr
Two possibilities come to mind:
- Python is a language. CPython is the most commonly used implementation of that language. There are others. MicroPython is an implementation of Python that is intended for resource-constrained environments.
- You could try using shared memory explicitly. In other words, you could use something like multiprocessing.shared_memory (see "Shared memory for direct access across processes" in the Python 3.12.1 documentation; the module is available from Python 3.8).
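A minimal sketch of the second option, assuming Python 3.8 or later. It shows one block being created, written, and then attached to by name; in practice the attach step would happen in a different process. Note that this shares raw bytes such as data buffers, not imported modules or their code objects.

```python
from multiprocessing import shared_memory

payload = b"shared bytes"

# The owner creates the block and copies data into it.
owner = shared_memory.SharedMemory(create=True, size=len(payload))
try:
    owner.buf[:len(payload)] = payload

    # A reader attaches to the same block by name (normally from another process).
    reader = shared_memory.SharedMemory(name=owner.name)
    try:
        received = bytes(reader.buf[:len(payload)])
    finally:
        reader.close()
finally:
    owner.close()
    owner.unlink()  # release the block once nobody needs it

print(received)
```

The block name in owner.name is what another process would need in order to attach.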
HTH.
Thank you very much for your suggestion. I will consider it and think about how to apply it on my side.
I use this on a daily basis on CentOS with Python, and there is always shared memory for the Python C code and any C extensions I use.
Can you post your pss script code for review?
Here is the example code I tested
I mean the code of sfd_mem_smaps.py that you claim shows no shared memory use.
/proc/<pid>/smaps shows no shared memory. My script reads smaps:
#!/usr/bin/python
# coding=utf-8
import sys


def format_unit(x):
    # smaps reports sizes in kB, so the "B" label below actually denotes kB,
    # "M" denotes MB and "G" denotes GB. The labels are kept as posted so that
    # they match the output shown earlier.
    x = float(x)
    if x > 1024 * 1024:
        u = "G"
        x /= 1024 * 1024
    elif x > 1024:
        u = "M"
        x /= 1024
    else:
        u = "B"
    return {
        "val": "%.1f" % x,
        "unit": u
    }


if __name__ == '__main__':
    if len(sys.argv) < 2 or "-h" in sys.argv or "--help" in sys.argv:
        print("Usage:")
        print(" %s pid [sort_by] [-h|--help]" % sys.argv[0])
        sys.exit(1)
    pid = sys.argv[1]
    with open("/proc/%s/smaps" % pid) as fh:
        lines = fh.readlines()
    # Expected order of the lines in each smaps entry (fixed layout assumed).
    table = [
        "addr",
        "Vss",
        "Rss",
        "Pss",
        "Shared_Clean",
        "Shared_Dirty",
        "Private_Clean",
        "Private_Dirty",
        "Referenced",
        "Anonymous",
        "AnonHugePages",
        "Swap",
        "KernelPageSize",
        "MMUPageSize",
        "Locked",
        "ProtectionKey",
        "VmFlags"
    ]
    sort_by = "Rss"
    if len(sys.argv) > 2:
        if sys.argv[2] in table:
            sort_by = sys.argv[2]
    top_n = -1
    if len(sys.argv) > 3:
        top_n = int(sys.argv[3])
    sort_by = table.index(sort_by)
    # pop ProtectionKey
    ProtectionKey_i = len(table) - 2
    if "ProtectionKey" not in lines[ProtectionKey_i]:
        table.pop(ProtectionKey_i)
    # Split the file into one record per mapping.
    info = list()
    i = 0
    while i < len(lines):
        one = list()
        # addr
        one.append(lines[i])
        k = 1
        while k < len(table) - 1:
            val = lines[i + k].split()[1]
            one.append(val)
            k += 1
        # VmFlags
        one.append(lines[i + k])
        info.append(one)
        i += k + 1
    info.sort(key=lambda s: float(s[sort_by]))
    # Format the values and record the widest value per column.
    out_len = dict()
    out = list()
    for one in info:
        out_one = {
            "addr": one[0].replace("\n", ""),
            "Vss": format_unit(one[1]),
            "Rss": format_unit(one[2]),
            "Shared_Clean": format_unit(one[4]),
            "Shared_Dirty": format_unit(one[5]),
            "Private_Clean": format_unit(one[6]),
            "Private_Dirty": format_unit(one[7]),
            "Swap": format_unit(one[11]),
        }
        out.append(out_one)
        for k in out_one:
            if k not in out_len:
                out_len[k] = 0
            if k == "addr":
                continue
            max_len = len(out_one[k]["val"])
            if out_len[k] < max_len:
                out_len[k] = max_len
    title = {
        "addr": "addr",
        "Vss": "Vss",
        "Rss": "Rss",
        "Shared_Clean": "Shared_Clean",
        "Shared_Dirty": "Shared_Dirty",
        "Private_Clean": "Private_Clean",
        "Private_Dirty": "Private_Dirty",
        "Swap": "Swap"
    }
    # Pad every cell so the columns line up with the title row.
    unit_len = len("(G)")
    for out_one in out:
        for k in out_one:
            if k == "addr":
                continue
            width = out_len[k]
            width_title = len(title[k]) - unit_len
            out_one[k] = "%-*s(%s) " % (width, out_one[k]["val"], out_one[k]["unit"])
            if width_title > width:
                out_one[k] = out_one[k] + ' ' * (width_title - width)
    for k in title:
        width_title = len(title[k])
        width = out_len[k] + unit_len
        if width > width_title:
            width_title = width
        title[k] = "%-*s " % (width_title, title[k])
    # The title row is appended last, so it prints at the bottom as entry [1].
    out.append(title)
    if top_n > 0:
        out = out[len(out)-top_n:]
    i = len(out)
    i_width = len(str(len(out)))
    for out_one in out:
        i_str = "[%*d]" % (i_width, i)
        print("%s %s %s %s %s %s %s :%s" %
              (i_str,
               out_one["Vss"], out_one["Rss"],
               out_one["Shared_Clean"], out_one["Shared_Dirty"],
               out_one["Private_Clean"], out_one["Private_Dirty"],
               out_one["addr"]))
        i -= 1