How to optimize memory usage across multiple Python scripts

I have about 300 independently running Python scripts; some are resident (long-running) and some are scheduled. I would like to implement a cache of shared imports. In my tests, a script that imports pymongo uses about 6 MiB more memory. With 100 scripts that all import pymongo running at the same time, that adds up to about 600 MiB. I would like an import cache that lets the scripts load pymongo from a shared copy first and so save memory, but I don't know how to implement it.

Do the different scripts ‘talk’ to one another or share memory/data? You may be able to use something like threading.Thread to combine some and have them share the same memory space.

(So in other words, run some scripts in different threads from the same starting process).

Though note that the Global Interpreter Lock (GIL) may lower the degree of parallelism of multiple threads at once.
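
A minimal sketch of that idea, assuming each script can be reduced to a callable entry point. The names resident_job, scheduled_job and run_all are hypothetical stand-ins for the real scripts and launcher. Because all threads live in one process, any module imported by one job is loaded only once and reused by the others.

```python
import threading


def resident_job(stop_event, results):
    # Stand-in for a long-running ("resident") script: do one unit of
    # work, then wait briefly until asked to stop.
    while True:
        results.append("resident tick")
        if stop_event.wait(0.01):
            break


def scheduled_job(results):
    # Stand-in for a script that runs once and exits.
    results.append("scheduled run")


def run_all():
    # Run both jobs as threads of a single process so that imported
    # modules exist only once in memory.
    results = []
    stop = threading.Event()
    resident = threading.Thread(target=resident_job, args=(stop, results))
    scheduled = threading.Thread(target=scheduled_job, args=(results,))
    resident.start()
    scheduled.start()
    scheduled.join()   # wait for the one-shot job
    stop.set()         # then ask the resident job to finish
    resident.join()
    return results
```

The trade-off is isolation: a crash or a blocking call in one job now affects every job in the process.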

The different scripts do not talk to one another or share memory/data.
I have considered the plan you mentioned, but it would require too many changes on my side, so I have had to give it up.

Python processes can share .dll or .so memory but cannot share the memory used for python code. Only by using a single process can you avoid this issue.

Even forking from a parent will not work as there are ref counts on code objects that force copy-on-write in the child processes.
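
To illustrate the reference-count point: CPython keeps each object's reference count inside the object itself, so merely taking a new reference writes to the memory page that holds the object. After a fork, that write is what makes the kernel copy the page. The sketch below only shows the count changing; refcount_delta is a hypothetical helper name.

```python
import sys


def refcount_delta():
    # An ordinary heap object whose reference count lives in its header.
    target = ["payload"]
    before = sys.getrefcount(target)
    holder = [target]  # taking one more reference updates the header
    after = sys.getrefcount(target)
    del holder
    return after - before
```

Nothing in target's contents was modified, yet its memory was written to, which is enough to defeat copy-on-write sharing for that page.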

For a 600 MiB cost I would just pay it, assuming you are not on a small embedded system.

For the commercial Python app I work on, the lack of sharing costs us hundreds of GiB per server…

Is there no other solution, such as shared memory? My device has very little memory and my resources are very limited, so I am trying to optimize; I cannot add more memory.

Python is not designed to share the memory of Python code between processes, which is what you appear to need.

If you are using Linux, you may find that you have overestimated the memory needed.
The code of the Python interpreter itself is only loaded into memory once.
Per process, you pay only for loading the Python code and the objects that code needs.

Have you accounted for shared code, as measured by PSS?

The SHR value is significantly smaller than the memory footprint:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
13691 root      20   0  254412  20300   5684 S   0.0  0.1   0:00.53 python test.py
13482 root      20   0  254408  20296   5684 S   0.0  0.1   0:00.53 python test.py

test.py

import sys
sys.modules
from memory_profiler import profile


@profile
def enter():
    # !/usr/bin/env python
    # encoding: utf-8
    """
    @desc: asset ingestion (store assets in the database)
    """

    import os
    import sys
    import gevent
    import pymongo
    print(pymongo.__file__)
    print(pymongo.__version__)
    import time
    time.sleep(600)


if __name__ == '__main__':
    enter()

profile:

Line #    Mem usage    Increment   Line Contents
================================================
    35     12.7 MiB     12.7 MiB   @profile
    36                             def enter():
    43     12.7 MiB      0.0 MiB       import os
    44     12.7 MiB      0.0 MiB       import sys
    45     13.9 MiB      1.2 MiB       import gevent
    46     19.8 MiB      5.9 MiB       import pymongo
    47     19.8 MiB      0.0 MiB       print(pymongo.__file__)
    48     19.8 MiB      0.0 MiB       print(pymongo.__version__)
    49     19.8 MiB      0.0 MiB       import time
    50     19.8 MiB      0.0 MiB       time.sleep(600)

You can find PSS in the /proc/<pid>/smaps file.

The “proportional set size” (PSS) of a process is the count of pages it has in memory, where each page is divided by the number of processes sharing it. So if a process has 1000 pages all to itself, and 1000 shared with one other process, its PSS will be 1500

RSS (RES in top) in the example above would be 2000, because it counts every shared page in full.

If you add the RES or SHR values of all processes together, you overestimate the memory you are using.
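
A minimal sketch of totalling these figures from smaps, assuming the usual "Field: N kB" line layout. The function names and the SAMPLE text are made up for illustration; SAMPLE mirrors the 1000 private + 1000 shared example above, using kB instead of pages.

```python
import re
from collections import Counter

# Matches lines such as "Pss:                 500 kB".
# Mapping header lines and the VmFlags line do not match and are skipped.
_FIELD = re.compile(r"^(\w+):\s+(\d+)\s+kB\s*$")


def summarize_smaps(text):
    """Sum every kB field over all mappings in smaps-formatted text."""
    totals = Counter()
    for line in text.splitlines():
        match = _FIELD.match(line)
        if match:
            totals[match.group(1)] += int(match.group(2))
    return totals


def summarize_pid(pid):
    """Linux only: read and summarize /proc/<pid>/smaps."""
    with open("/proc/%s/smaps" % pid) as fh:
        return summarize_smaps(fh.read())


# Illustrative input, not real output: one mapping shared with one other
# process and one private mapping.
SAMPLE = """\
7f0000000000-7f00000fa000 r-xp 00000000 08:03 1 /usr/lib/libexample.so
Size:               1000 kB
Rss:                1000 kB
Pss:                 500 kB
Shared_Clean:       1000 kB
Private_Dirty:         0 kB
VmFlags: rd ex mr mw me
00946000-00a40000 rw-p 00000000 00:00 0 [heap]
Size:               1000 kB
Rss:                1000 kB
Pss:                1000 kB
Shared_Clean:          0 kB
Private_Dirty:      1000 kB
VmFlags: rd wr mr mw me
"""
```

Summing the Pss totals of all the script processes gives a figure that can be compared with what free reports, whereas summing Rss counts each shared page once per process.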

Also compare to the information that the free command reports.

Yes, I tried smaps and it shows no shared memory. You can try it yourself:

~ $ ps auxf | grep test.py
root      5903  1.0  0.1 254408 20288 pts/0    S+   09:46   0:00  |       \_ python test.py
root      5336  0.9  0.1 254408 20296 pts/6    S+   09:46   0:00  |       \_ python test.py
root     14094  0.0  0.0 112808   964 pts/2    S+   09:47   0:00          \_ grep --color test.py
~ $ sfd_mem_smaps.py 5336
[273] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e5728000-7f63e5928000 ---p 00008000 08:03 657817                     /usr/lib64/python2.7/lib-dynload/arraymodule.so
[272] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e59f2000-7f63e5bf1000 ---p 00007000 08:02 281327                     /usr/share/Python-2.7/lib/site-packages/pymongo/_cmessage.so
[271] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e5c77000-7f63e5e76000 ---p 00004000 08:03 657433                     /usr/lib64/python2.7/lib-dynload/zlibmodule.so
[270] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e5f0f000-7f63e610e000 ---p 00096000 08:03 657213                     /usr/lib64/python2.7/lib-dynload/unicodedata.so
[269] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e61b0000-7f63e63af000 ---p 0000e000 08:03 779196                     /usr/lib64/python2.7/site-packages/bson/_cbson.so
[268] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e6434000-7f63e6633000 ---p 00004000 08:03 607024                     /usr/lib64/libuuid.so.1.3.0
[267] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e6638000-7f63e6837000 ---p 00003000 08:03 657684                     /usr/lib64/python2.7/lib-dynload/_randommodule.so
[266] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e683d000-7f63e6a3c000 ---p 00004000 08:03 657496                     /usr/lib64/python2.7/lib-dynload/_hashlib.so
[265] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e6a46000-7f63e6c45000 ---p 00007000 08:03 657122                     /usr/lib64/python2.7/lib-dynload/math.so
[264] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e6ca4000-7f63e6ea3000 ---p 0001c000 08:03 657054                     /usr/lib64/python2.7/lib-dynload/_io.so
[263] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e6eb2000-7f63e70b1000 ---p 00004000 08:03 657646                     /usr/lib64/python2.7/lib-dynload/_localemodule.so
[262] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e70bd000-7f63e72bd000 ---p 0000a000 08:02 319827                     /usr/share/Python-2.7/lib/site-packages/gevent/__ident.so
[261] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e72fd000-7f63e74fc000 ---p 0003e000 08:02 319813                     /usr/share/Python-2.7/lib/site-packages/gevent/_greenlet.so
[260] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e7522000-7f63e7722000 ---p 0001d000 08:02 319828                     /usr/share/Python-2.7/lib/site-packages/gevent/__hub_primitives.so
[259] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e7739000-7f63e7938000 ---p 00013000 08:02 319835                     /usr/share/Python-2.7/lib/site-packages/gevent/__waiter.so
[258] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e7947000-7f63e7b46000 ---p 0000c000 08:02 319756                     /usr/share/Python-2.7/lib/site-packages/gevent/__greenlet_primitives.so
[257] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e7b4e000-7f63e7d4e000 ---p 00006000 08:02 280989                     /usr/share/Python-2.7/lib/site-packages/greenlet.so
[256] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e7d5b000-7f63e7f5a000 ---p 0000c000 08:02 319737                     /usr/share/Python-2.7/lib/site-packages/gevent/__hub_local.so
[255] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e7fa7000-7f63e81a7000 ---p 0004b000 08:02 319797                     /usr/share/Python-2.7/lib/site-packages/gevent/libev/corecext.so
[254] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e81c1000-7f63e83c1000 ---p 00012000 08:03 657820                     /usr/lib64/python2.7/lib-dynload/cPickle.so
[253] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e83c9000-7f63e85c8000 ---p 00006000 08:03 657060                     /usr/lib64/python2.7/lib-dynload/_multiprocessing.so
[252] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e85cd000-7f63e87cc000 ---p 00003000 08:02 319961                     /usr/share/Python-2.7/lib/site-packages/psutil/_psutil_posix.so
[251] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e87d3000-7f63e89d2000 ---p 00005000 08:02 319963                     /usr/share/Python-2.7/lib/site-packages/psutil/_psutil_linux.so
[250] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e89d6000-7f63e8bd5000 ---p 00002000 08:03 657088                     /usr/lib64/python2.7/lib-dynload/grpmodule.so
[249] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e8c37000-7f63e8e37000 ---p 00060000 08:03 606957                     /usr/lib64/libpcre.so.1.2.0
[248] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e8e5d000-7f63e905c000 ---p 00024000 08:03 606966                     /usr/lib64/libselinux.so.1
[247] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e9076000-7f63e9276000 ---p 00016000 08:03 611360                     /usr/lib64/libresolv-2.17.so
[246] 8.0  (B)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e9278000-7f63e927a000 rw-p 00000000 00:00 0 
[245] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e927d000-7f63e947c000 ---p 00003000 08:03 607714                     /usr/lib64/libkeyutils.so.1.5
[244] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e948b000-7f63e968b000 ---p 0000d000 08:03 608070                     /usr/lib64/libkrb5support.so.0.1
[243] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e96a2000-7f63e98a1000 ---p 00015000 08:03 611140                     /usr/lib64/libz.so.1.2.7
[242] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63e98bc000-7f63e9abb000 ---p 00019000 08:03 608058                     /usr/lib64/libk5crypto.so.3.1
[241] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63f01a5000-7f63f03a4000 ---p 00003000 08:03 606989                     /usr/lib64/libcom_err.so.2.1
[240] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63f047f000-7f63f067e000 ---p 000d9000 08:03 608068                     /usr/lib64/libkrb5.so.3.3
[239] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63f06d9000-7f63f08d9000 ---p 0004a000 08:03 608054                     /usr/lib64/libgssapi_krb5.so.2.2
[238] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63f0b15000-7f63f0d14000 ---p 00239000 08:03 612610                     /usr/lib64/libcrypto.so.1.0.2t
[237] 2.0  (M)   0.0  (B)   0.0  (B)       0.0(B)         0.0(B)          0.0  (B)        :7f63f0da8000-7f63f0fa8000 ---p 00067000 08:03 611348                     /usr/lib64/libssl.so.1.0.2t
...
[ 16] 620.0(B)   412.0(B)   412.0(B)       0.0(B)         0.0(B)          0.0  (B)        :7f63f45f5000-7f63f4690000 r-xp 000e3000 08:03 612969                     /usr/lib64/libpython2.7.so.1.0
[ 15] 512.0(B)   512.0(B)   0.0  (B)       0.0(B)         0.0(B)          512.0(B)        :7f63e5bf3000-7f63e5c73000 rw-p 00000000 00:00 0 
[ 14] 512.0(B)   512.0(B)   0.0  (B)       0.0(B)         0.0(B)          512.0(B)        :7f63e6122000-7f63e61a2000 rw-p 00000000 00:00 0 
[ 13] 512.0(B)   512.0(B)   0.0  (B)       0.0(B)         0.0(B)          512.0(B)        :7f63e63b0000-7f63e6430000 rw-p 00000000 00:00 0 
[ 12] 512.0(B)   512.0(B)   0.0  (B)       0.0(B)         0.0(B)          512.0(B)        :7f63f2c3c000-7f63f2cbc000 rw-p 00000000 00:00 0 
[ 11] 1.8  (M)   660.0(B)   660.0(B)       0.0(B)         0.0(B)          0.0  (B)        :7f63f381f000-7f63f39e2000 r-xp 00000000 08:03 606792                     /usr/lib64/libc-2.17.so
[ 10] 772.0(B)   668.0(B)   668.0(B)       0.0(B)         0.0(B)          0.0  (B)        :7f63f2cbc000-7f63f2d7d000 r-xp 00000000 08:03 696416                     /usr/lib64/python2.7/pytransform_bootstrap/pytransform/_pytransform.so
[  9] 2.2  (M)   708.0(B)   708.0(B)       0.0(B)         0.0(B)          0.0  (B)        :7f63f08dc000-7f63f0b15000 r-xp 00000000 08:03 612610                     /usr/lib64/libcrypto.so.1.0.2t
[  8] 768.0(B)   768.0(B)   0.0  (B)       0.0(B)         0.0(B)          768.0(B)        :7f63e592b000-7f63e59eb000 rw-p 00000000 00:00 0 
[  7] 768.0(B)   768.0(B)   0.0  (B)       0.0(B)         0.0(B)          768.0(B)        :7f63f493b000-7f63f49fb000 rw-p 00000000 00:00 0 
[  6] 788.0(B)   788.0(B)   0.0  (B)       0.0(B)         0.0(B)          788.0(B)        :7f63f4a2c000-7f63f4af1000 rw-p 00000000 00:00 0 
[  5] 904.0(B)   796.0(B)   796.0(B)       0.0(B)         0.0(B)          0.0  (B)        :7f63f4512000-7f63f45f4000 r-xp 00000000 08:03 612969                     /usr/lib64/libpython2.7.so.1.0
[  4] 1.0  (M)   1.0  (M)   0.0  (B)       0.0(B)         0.0(B)          1.0  (M)        :7f63f2935000-7f63f2a36000 rw-p 00000000 00:00 0 
[  3] 1.5  (M)   1.5  (M)   0.0  (B)       0.0(B)         0.0(B)          1.5  (M)        :7f63f0022000-7f63f01a2000 rw-p 00000000 00:00 0 
[  2] 4.8  (M)   4.6  (M)   0.0  (B)       0.0(B)         0.0(B)          4.6  (M)        :00946000-00e09000 rw-p 00000000 00:00 0                                  [heap]
[  1] Vss        Rss        Shared_Clean   Shared_Dirty   Private_Clean   Private_Dirty   :addr    

Two possibilities come to mind:

  1. Python is a language. CPython is the most commonly used implementation of that language. There are others. Micropython is an implementation of Python that is intended for constrained resource environments.
  2. You could try using shared memory explicitly. In other words, you could use something like multiprocessing.shared_memory ("Shared memory for direct access across processes" in the Python 3.12.1 documentation).
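
A minimal sketch of the second suggestion, assuming Python 3.8 or later (the smaps output earlier in the thread shows Python 2.7, where this module does not exist). Note that multiprocessing.shared_memory shares raw data buffers, not imported modules or code objects, so it would help with large shared data rather than with the cost of import pymongo. The helper name share_bytes is hypothetical, and both ends run in one process here only to keep the example short.

```python
from multiprocessing import shared_memory


def share_bytes(payload):
    # Owner side: create a named block and copy the data into it.
    owner = shared_memory.SharedMemory(create=True, size=len(payload))
    try:
        owner.buf[:len(payload)] = payload
        # Reader side: normally another process, attaching by name.
        reader = shared_memory.SharedMemory(name=owner.name)
        try:
            return bytes(reader.buf[:len(payload)])
        finally:
            reader.close()
    finally:
        owner.close()
        owner.unlink()  # release the block once nobody needs it
```

In a real setup the owner would pass owner.name to the other processes, for example on the command line or through a small file.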

HTH.

Thank you very much for your suggestion. I will consider it and think about how to apply it on my side.

I use this on a daily basis on CentOS with Python, and there is always shared memory for the Python C code and any C extensions I use.

Can you post your pss script code for review?

Here is the example code I tested

I mean the code of sfd_mem_smaps.py that you claim shows no shared memory use.

/proc/<pid>/smaps shows no shared memory. My script reads smaps:

#!/usr/bin/python
# coding=utf-8

import sys


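def format_unit(x):
    # smaps reports sizes in kB, so x is a kB count, not a byte count.
    x = float(x)
    if x > 1024 * 1024:
        u = "G"
        x /= 1024 * 1024
    elif x > 1024:
        u = "M"
        x /= 1024
    else:
        # NOTE: values printed with "B" are really kB, because the input
        # is already in kB. The label is kept so the output shown matches.
        u = "B"

    return {
        "val": "%.1f" % x,
        "unit": u
    }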


if __name__ == '__main__':
    if len(sys.argv) < 2 or "-h" in sys.argv or "--help" in sys.argv:
        print("Usage:")
        print("    %s  pid [sort_by] [-h|--help]" % sys.argv[0])
        sys.exit(1)

    pid = sys.argv[1]

    with open("/proc/%s/smaps" % pid) as fh:
        lines = fh.readlines()
    table = [
        "addr",
        "Vss",
        "Rss",
        "Pss",
        "Shared_Clean",
        "Shared_Dirty",
        "Private_Clean",
        "Private_Dirty",
        "Referenced",
        "Anonymous",
        "AnonHugePages",
        "Swap",
        "KernelPageSize",
        "MMUPageSize",
        "Locked",
        "ProtectionKey",
        "VmFlags"
    ]

    sort_by = "Rss"
    if len(sys.argv) > 2:
        if sys.argv[2] in table:
            sort_by = sys.argv[2]

    top_n = -1
    if len(sys.argv) > 3:
        top_n = int(sys.argv[3])

    sort_by = table.index(sort_by)

    # pop ProtectionKey
    ProtectionKey_i = len(table) - 2
    if "ProtectionKey" not in lines[ProtectionKey_i]:
        table.pop(ProtectionKey_i)

    info = list()
    i = 0
    while i < len(lines):
        one = list()
        # addr
        one.append(lines[i])
        k = 1
        while k < len(table) - 1:
            val = lines[i + k].split()[1]
            one.append(val)
            k += 1
        # VmFlags
        one.append(lines[i + k])
        info.append(one)
        i += k + 1

    info.sort(key=lambda s: float(s[sort_by]))

    out_len = dict()
    out = list()
    for one in info:
        out_one = {
            "addr": one[0].replace("\n", ""),
            "Vss": format_unit(one[1]),
            "Rss": format_unit(one[2]),
            "Shared_Clean": format_unit(one[4]),
            "Shared_Dirty": format_unit(one[5]),
            "Private_Clean": format_unit(one[6]),
            "Private_Dirty": format_unit(one[7]),
            "Swap": format_unit(one[11]),
        }
        out.append(out_one)

        for k in out_one:
            if k not in out_len:
                out_len[k] = 0

            if k == "addr":
                continue

            max_len = len(out_one[k]["val"])
            if out_len[k] < max_len:
                out_len[k] = max_len

    title = {
        "addr": "addr",
        "Vss": "Vss",
        "Rss": "Rss",
        "Shared_Clean": "Shared_Clean",
        "Shared_Dirty": "Shared_Dirty",
        "Private_Clean": "Private_Clean",
        "Private_Dirty": "Private_Dirty",
        "Swap": "Swap"
    }

    unit_len = len("(G)")
    for out_one in out:
        for k in out_one:
            if k == "addr":
                continue
            width = out_len[k]
            width_title = len(title[k]) - unit_len

            out_one[k] = "%-*s(%s)  " % (width, out_one[k]["val"], out_one[k]["unit"])
            if width_title > width: 
                out_one[k] = out_one[k] + ' ' * (width_title - width)

    for k in title:
        width_title = len(title[k])
        width = out_len[k] + unit_len
        if width > width_title:
            width_title = width
        title[k] = "%-*s  " % (width_title, title[k])

    out.append(title)

    if top_n > 0:
        out = out[len(out)-top_n:]

    i = len(out)
    i_width = len(str(len(out)))
    for out_one in out:
        i_str = "[%*d]" % (i_width, i)
        print("%s %s %s %s %s %s %s :%s" %
              (i_str,
               out_one["Vss"], out_one["Rss"],
               out_one["Shared_Clean"], out_one["Shared_Dirty"],
               out_one["Private_Clean"], out_one["Private_Dirty"],
               out_one["addr"]))
        i -= 1