Python interpreter restart (Py_Finalize) within a script seems to be leaking memory

In our product we use the Python interpreter. I am trying to solve a memory leak in our product.

Python version: 3.9.10

To simulate the problem, I have a script which calls Py_Initialize to start the interpreter. It then does some variable initialization, calculations, package imports, etc., and finally calls Py_Finalize. My observation is that restarting the interpreter leaks memory, and the more frequent the restarts, the larger the leak.

If I run a GC collect instead, the memory situation seems to be better. As I understand it, Py_Finalize internally runs a GC collect as well, so I am unable to understand this behaviour.

Memory growth logs from the script, with the interpreter restarted every 120 seconds:

The memory being used before this program starts is :
VmSize=21756kB, VmRSS=584kB

before starting python interpreter VmSize=21756kB, VmRSS=828kB
after starting python interpreter VmSize=127444kB, VmRSS=5380kB

before stopping python interpreter VmSize=179612kB, VmRSS=8992kB
after stopping python interpreter VmSize=179612kB, VmRSS=9128kB

before starting python interpreter VmSize=179612kB, VmRSS=9160kB
after starting python interpreter VmSize=179612kB, VmRSS=9164kB

before stopping python interpreter VmSize=180008kB, VmRSS=9424kB
after stopping python interpreter VmSize=180008kB, VmRSS=9548kB

before starting python interpreter VmSize=180008kB, VmRSS=9548kB
after starting python interpreter VmSize=180008kB, VmRSS=9508kB

before stopping python interpreter VmSize=180396kB, VmRSS=9768kB
after stopping python interpreter VmSize=180396kB, VmRSS=9912kB

before starting python interpreter VmSize=180396kB, VmRSS=9912kB
after starting python interpreter VmSize=180540kB, VmRSS=9912kB

before stopping python interpreter VmSize=180796kB, VmRSS=10016kB
after stopping python interpreter VmSize=180796kB, VmRSS=10264kB

before starting python interpreter VmSize=180796kB, VmRSS=10264kB
after starting python interpreter VmSize=180940kB, VmRSS=10264kB

before stopping python interpreter VmSize=181196kB, VmRSS=10388kB
after stopping python interpreter VmSize=181196kB, VmRSS=10612kB

before starting python interpreter VmSize=181196kB, VmRSS=10612kB
after starting python interpreter VmSize=181344kB, VmRSS=10612kB

before stopping python interpreter VmSize=181600kB, VmRSS=10756kB
after stopping python interpreter VmSize=181600kB, VmRSS=10992kB

before starting python interpreter VmSize=181600kB, VmRSS=10992kB
after starting python interpreter VmSize=181744kB, VmRSS=10992kB

before stopping python interpreter VmSize=182000kB, VmRSS=11140kB
after stopping python interpreter VmSize=182000kB, VmRSS=11364kB

before starting python interpreter VmSize=182000kB, VmRSS=11364kB
after starting python interpreter VmSize=182144kB, VmRSS=11364kB

before stopping python interpreter VmSize=182400kB, VmRSS=11508kB
after stopping python interpreter VmSize=182400kB, VmRSS=11740kB

before starting python interpreter VmSize=182400kB, VmRSS=11740kB
after starting python interpreter VmSize=182544kB, VmRSS=11740kB

before stopping python interpreter VmSize=182800kB, VmRSS=11884kB
after stopping python interpreter VmSize=182800kB, VmRSS=12120kB

before starting python interpreter VmSize=182800kB, VmRSS=12120kB
after starting python interpreter VmSize=182944kB, VmRSS=12120kB

before stopping python interpreter VmSize=182944kB, VmRSS=12264kB
after stopping python interpreter VmSize=182944kB, VmRSS=12480kB

before starting python interpreter VmSize=182944kB, VmRSS=12480kB
after starting python interpreter VmSize=183088kB, VmRSS=12480kB

before stopping python interpreter VmSize=183344kB, VmRSS=12624kB
after stopping python interpreter VmSize=183344kB, VmRSS=12856kB

before starting python interpreter VmSize=183344kB, VmRSS=12856kB
after starting python interpreter VmSize=183344kB, VmRSS=12856kB

before stopping python interpreter VmSize=183600kB, VmRSS=12856kB
after stopping python interpreter VmSize=183600kB, VmRSS=13104kB

before starting python interpreter VmSize=183600kB, VmRSS=13104kB
after starting python interpreter VmSize=183600kB, VmRSS=13104kB

before stopping python interpreter VmSize=184000kB, VmRSS=13208kB
after stopping python interpreter VmSize=184000kB, VmRSS=13464kB

before starting python interpreter VmSize=184000kB, VmRSS=13464kB
after starting python interpreter VmSize=184148kB, VmRSS=13464kB

before stopping python interpreter VmSize=184404kB, VmRSS=13620kB
after stopping python interpreter VmSize=184404kB, VmRSS=13840kB

before starting python interpreter VmSize=184404kB, VmRSS=13840kB
after starting python interpreter VmSize=184548kB, VmRSS=13840kB

before stopping python interpreter VmSize=184804kB, VmRSS=13984kB
after stopping python interpreter VmSize=184804kB, VmRSS=14232kB

before starting python interpreter VmSize=184804kB, VmRSS=14232kB
after starting python interpreter VmSize=184948kB, VmRSS=14232kB

before stopping python interpreter VmSize=185204kB, VmRSS=14376kB
after stopping python interpreter VmSize=185204kB, VmRSS=14588kB

before starting python interpreter VmSize=185204kB, VmRSS=14588kB
after starting python interpreter VmSize=185348kB, VmRSS=14588kB

before stopping python interpreter VmSize=185604kB, VmRSS=14732kB
after stopping python interpreter VmSize=185604kB, VmRSS=14976kB

before starting python interpreter VmSize=185604kB, VmRSS=14976kB
after starting python interpreter VmSize=185748kB, VmRSS=14976kB

before stopping python interpreter VmSize=186004kB, VmRSS=15120kB
after stopping python interpreter VmSize=186004kB, VmRSS=15344kB

before starting python interpreter VmSize=186004kB, VmRSS=15344kB
after starting python interpreter VmSize=186152kB, VmRSS=15344kB

before stopping python interpreter VmSize=186408kB, VmRSS=15488kB
after stopping python interpreter VmSize=186408kB, VmRSS=15724kB

before starting python interpreter VmSize=186408kB, VmRSS=15724kB
after starting python interpreter VmSize=186408kB, VmRSS=15724kB

before stopping python interpreter VmSize=186408kB, VmRSS=15724kB
after stopping python interpreter VmSize=186408kB, VmRSS=15956kB

before starting python interpreter VmSize=186408kB, VmRSS=15956kB
after starting python interpreter VmSize=186408kB, VmRSS=15956kB

before stopping python interpreter VmSize=186796kB, VmRSS=16324kB
after stopping python interpreter VmSize=186796kB, VmRSS=16328kB

before starting python interpreter VmSize=186796kB, VmRSS=16328kB
after starting python interpreter VmSize=186796kB, VmRSS=16328kB

before stopping python interpreter VmSize=187188kB, VmRSS=16448kB
after stopping python interpreter VmSize=187188kB, VmRSS=16688kB

before starting python interpreter VmSize=187188kB, VmRSS=16688kB
after starting python interpreter VmSize=187324kB, VmRSS=16688kB

before stopping python interpreter VmSize=187580kB, VmRSS=16824kB
after stopping python interpreter VmSize=187580kB, VmRSS=17044kB

before starting python interpreter VmSize=187580kB, VmRSS=17044kB
after starting python interpreter VmSize=187724kB, VmRSS=17044kB

before stopping python interpreter VmSize=187980kB, VmRSS=17188kB
after stopping python interpreter VmSize=187980kB, VmRSS=17424kB

before starting python interpreter VmSize=187980kB, VmRSS=17424kB
after starting python interpreter VmSize=188124kB, VmRSS=17424kB

before stopping python interpreter VmSize=188380kB, VmRSS=17568kB
after stopping python interpreter VmSize=188380kB, VmRSS=17796kB

before starting python interpreter VmSize=188380kB, VmRSS=17796kB
after starting python interpreter VmSize=188524kB, VmRSS=17796kB

before stopping python interpreter VmSize=188780kB, VmRSS=17940kB
after stopping python interpreter VmSize=188780kB, VmRSS=18176kB

before starting python interpreter VmSize=188780kB, VmRSS=18176kB
after starting python interpreter VmSize=188924kB, VmRSS=18176kB

before stopping python interpreter VmSize=189180kB, VmRSS=18320kB
after stopping python interpreter VmSize=189180kB, VmRSS=18552kB

before starting python interpreter VmSize=189180kB, VmRSS=18552kB
after starting python interpreter VmSize=189324kB, VmRSS=18552kB

before stopping python interpreter VmSize=189580kB, VmRSS=18696kB
after stopping python interpreter VmSize=189580kB, VmRSS=18936kB

before starting python interpreter VmSize=189580kB, VmRSS=18936kB
after starting python interpreter VmSize=189580kB, VmRSS=18936kB

before stopping python interpreter VmSize=189836kB, VmRSS=18936kB
after stopping python interpreter VmSize=189836kB, VmRSS=19180kB

before starting python interpreter VmSize=189836kB, VmRSS=19180kB
after starting python interpreter VmSize=189836kB, VmRSS=19180kB

before stopping python interpreter VmSize=189992kB, VmRSS=19284kB
after stopping python interpreter VmSize=189992kB, VmRSS=19540kB

The memory being used now at the end of the process is:
VmSize=189992kB, VmRSS=19540kB
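
To put a rough number on the growth in the log above, the RSS after the first and after the last Py_Finalize can be compared. The helper below is only a sketch for that arithmetic; its name and signature are made up for illustration and are not part of the reproduction script.

```c
#include <assert.h>
#include <stdio.h>

/* Average RSS growth per restart in kB, given the RSS reported after the
 * first and after the last Py_Finalize(), and how many times Py_Finalize()
 * was called in total. (Hypothetical helper, for illustration only.) */
double rss_growth_per_restart(long first_kb, long last_kb, int finalize_calls)
{
    assert(finalize_calls > 1);
    /* There are (finalize_calls - 1) restarts between the two samples. */
    return (double)(last_kb - first_kb) / (finalize_calls - 1);
}
```

With the values from the log, rss_growth_per_restart(9128, 19540, 30) gives roughly 359 kB per restart (10412 kB over 29 restarts).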

Script used to reproduce this issue:

#include <Python.h>
#include <stdio.h>
#include <unistd.h>

/* Print the virtual size and resident set size of this process in kB. */
static void print_memory(void)
{
    long vmsize, vmrss;
    int rc;
    FILE *fp = fopen("/proc/self/statm", "r");

    if (!fp)
        return;

    /* Only fetch the first two entries (both are counted in pages). */
    rc = fscanf(fp, "%ld%ld", &vmsize, &vmrss);
    fclose(fp);
    if (rc != 2)
        return;

    /* Convert pages to kB (assumes 4 kB pages). */
    vmsize *= 4;
    vmrss *= 4;
    printf("VmSize=%ldkB, VmRSS=%ldkB\n", vmsize, vmrss);
}

static void start_interpreter(void)
{
    printf("before starting python interpreter\t");
    print_memory();
    Py_Initialize();
    printf("after starting python interpreter\t");
    print_memory();
}

static void close_interpreter(void)
{
    printf("before stopping python interpreter\t");
    print_memory();
    Py_Finalize();
    printf("after stopping python interpreter\t");
    print_memory();
}

int main(void)
{
    int i;

    printf("The memory being used before this program starts is : \n");
    print_memory();
    printf("\n\n\n");

    for (i = 0; i < 30; i++) {
        start_interpreter();

        /* Run the same small snippet three times. */
        PyRun_SimpleString("from time import time,ctime\n"
                           "print('Today is', ctime(time()))\n");
        PyRun_SimpleString("from time import time,ctime\n"
                           "print('Today is', ctime(time()))\n");
        PyRun_SimpleString("from time import time,ctime\n"
                           "print('Today is', ctime(time()))\n");

        /* Some variable initialization and calculations. */
        PyRun_SimpleString("import json as js\n");
        PyRun_SimpleString("a=90\n");
        PyRun_SimpleString("b=70\n");
        PyRun_SimpleString("c=32\n");
        PyRun_SimpleString("c=a+b/5+c*10+a\n");
        PyRun_SimpleString("d=a/2+a*5+b*a/5+c*10+a%4\n");
        PyRun_SimpleString("a=b+a/2+b%2*b+c-19+a/5+b*43\n");
        PyRun_SimpleString("b=70\n");

        /* PyRun_SimpleString("import matplotlib.pyplot as plt\n"); */

        /* Import a few more standard library modules. */
        PyRun_SimpleString("import cmath\n");
        printf("\n");
        printf("\n");
        PyRun_SimpleString("import hashlib\n");
        PyRun_SimpleString("import readline\n");
        PyRun_SimpleString("import subprocess\n");
        PyRun_SimpleString("import string\n");

        close_interpreter();
        printf("\n");

        /* Restart the Python interpreter every 120 seconds. */
        sleep(120);
    }

    printf("\n\n\n");
    printf("The memory being used now at the end of the process is: \n");
    print_memory();

    return 0;
}
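
One caveat about print_memory() in the script above: it multiplies the page counts from /proc/self/statm by 4, which is only right on systems with 4 kB pages. A possible variant, sketched below, asks the system for the page size instead. The function name read_memory_kb and its return convention are assumptions for illustration, and it is Linux-only like the original.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical variant of print_memory(): fills the outputs in kB and
 * returns 0, or returns -1 if /proc/self/statm cannot be read. */
int read_memory_kb(long *vmsize_kb, long *vmrss_kb)
{
    long pages_total, pages_resident;
    long page_kb = sysconf(_SC_PAGESIZE) / 1024;  /* page size in kB */
    FILE *fp;
    int rc;

    assert(vmsize_kb != NULL && vmrss_kb != NULL);

    fp = fopen("/proc/self/statm", "r");
    if (!fp)
        return -1;

    /* The first two fields are total program size and resident set size,
     * both counted in pages. */
    rc = fscanf(fp, "%ld%ld", &pages_total, &pages_resident);
    fclose(fp);
    if (rc != 2 || page_kb <= 0)
        return -1;

    *vmsize_kb = pages_total * page_kb;
    *vmrss_kb = pages_resident * page_kb;
    return 0;
}
```

Returning the values instead of printing them also makes it easier to compute the difference between two samples in code.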

Below is the output of the same script when we run a GC collect instead of restarting the interpreter:

The memory being used before this program starts is :
VmSize=21756kB, VmRSS=584kB

before python GC VmSize=179612kB, VmRSS=8992kB
after python GC VmSize=179612kB, VmRSS=9128kB

before python GC VmSize=179612kB, VmRSS=9128kB
after python GC VmSize=179612kB, VmRSS=9132kB

before python GC VmSize=179612kB, VmRSS=9132kB
after python GC VmSize=179612kB, VmRSS=9132kB

before python GC VmSize=179612kB, VmRSS=9132kB
after python GC VmSize=179612kB, VmRSS=9132kB

before python GC VmSize=179612kB, VmRSS=9132kB
after python GC VmSize=179612kB, VmRSS=9132kB

before python GC VmSize=179612kB, VmRSS=9132kB
after python GC VmSize=179612kB, VmRSS=9132kB

before python GC VmSize=179612kB, VmRSS=9132kB
after python GC VmSize=179612kB, VmRSS=9132kB

before python GC VmSize=179612kB, VmRSS=9132kB
after python GC VmSize=179612kB, VmRSS=9132kB

before python GC VmSize=179612kB, VmRSS=9132kB
after python GC VmSize=179612kB, VmRSS=9132kB

before python GC VmSize=179612kB, VmRSS=9132kB
after python GC VmSize=179612kB, VmRSS=9132kB

before python GC VmSize=179612kB, VmRSS=9132kB
after python GC VmSize=179612kB, VmRSS=9132kB

before python GC VmSize=179612kB, VmRSS=9132kB
after python GC VmSize=179612kB, VmRSS=9132kB

before python GC VmSize=179612kB, VmRSS=9132kB
after python GC VmSize=179612kB, VmRSS=9132kB

before python GC VmSize=179612kB, VmRSS=9132kB
after python GC VmSize=179612kB, VmRSS=9132kB

before python GC VmSize=179612kB, VmRSS=9132kB
after python GC VmSize=179612kB, VmRSS=9132kB

before python GC VmSize=179612kB, VmRSS=9132kB
after python GC VmSize=179612kB, VmRSS=9132kB

before python GC VmSize=179612kB, VmRSS=9132kB
after python GC VmSize=179612kB, VmRSS=9132kB

before python GC VmSize=179612kB, VmRSS=9132kB
after python GC VmSize=179612kB, VmRSS=9132kB

before python GC VmSize=179612kB, VmRSS=9132kB
after python GC VmSize=179612kB, VmRSS=9132kB

before python GC VmSize=179612kB, VmRSS=9132kB
after python GC VmSize=179612kB, VmRSS=9132kB

before python GC VmSize=179612kB, VmRSS=9132kB
after python GC VmSize=179612kB, VmRSS=9132kB

before python GC VmSize=179612kB, VmRSS=9132kB
after python GC VmSize=179612kB, VmRSS=9132kB

before python GC VmSize=179612kB, VmRSS=9132kB
after python GC VmSize=179612kB, VmRSS=9132kB

before python GC VmSize=179612kB, VmRSS=9132kB
after python GC VmSize=179612kB, VmRSS=9132kB

before python GC VmSize=179612kB, VmRSS=9132kB
after python GC VmSize=179612kB, VmRSS=9132kB

before python GC VmSize=179612kB, VmRSS=9132kB
after python GC VmSize=179612kB, VmRSS=9132kB

before python GC VmSize=179612kB, VmRSS=9132kB
after python GC VmSize=179612kB, VmRSS=9136kB

before python GC VmSize=179612kB, VmRSS=9136kB
after python GC VmSize=179612kB, VmRSS=9136kB

before python GC VmSize=179612kB, VmRSS=9136kB
after python GC VmSize=179612kB, VmRSS=9136kB

before python GC VmSize=179612kB, VmRSS=9136kB
after python GC VmSize=179612kB, VmRSS=9136kB

The memory being used now at the END OF PROCESS is:
VmSize=179612kB, VmRSS=9136kB

Can some expert here throw some light on how this works?

At the very least, we’ll need the exact version of Python you’re using to investigate it. Initialization has been changing a lot over the last few releases.

It would also be nice if you could edit your post to put the output inside [details] tags (or use the “Hide Details” menu item in the Discourse editor, in the menu that appears when you click on the cog button) so that it’s possible to read your post without needing to scroll so much.

Thanks for the reply, I have updated the post.
Python version is 3.9.10.

tl;dr this is a known issue, but improving, and :crossed_fingers: should be mostly resolved before 3.12 releases (Oct 2023).

This problem has been on our radar for more than a decade (e.g. “Py_Finalize() doesn’t clear all Python objects at exit”, python/cpython issue #44470). The problem is that CPython has a lot of global runtime state, much of it spread out in global variables across the code base (and not just objects). As a result, cleaning up all the allocations during Py_Finalize() isn’t trivial, and it’s likely that some things get missed.

Our approach to global state has shifted in the last 5+ years and we’ve been working on it. This includes consolidating that state into a single data structure, adding mechanisms to track allocations better, and taking advantage of those improvements to properly free memory during runtime/interpreter finalization. Notably, since around 2017/2018, a bunch of contributors have been steadily improving the situation, led in part by @vstinner. (Thanks Victor!)

Consequently, as of 3.11 (in beta now, releasing in Oct 2022), things are much better and we don’t leak at all in some situations. I’m hopeful we’ll be able to get almost completely leak-free in the next year, AKA 3.12. (Post-finalization leaks are a problem for my per-interpreter GIL project, AKA PEP 684, so I’m motivated to see this through.)

Summary: we’re working on it, but it will likely be a problem for you until 3.12 releases. 3.11 is fairly close to leak-free and might be close enough to work for you once it is released. In the meantime, 3.10 (released last October) already leaks less than 3.9 and is probably worth checking.
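
As a toy picture of the point above about scattered global state versus state consolidated into a single structure, here is a small sketch. Every name in it is invented for illustration. It is not CPython's real _PyRuntimeState or allocator, just a model of why a finalize function can miss a file-level global on each init/finalize cycle.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: none of these names exist in CPython. */
static int live_blocks = 0;              /* allocations not yet freed */

static void *toy_alloc(size_t n)
{
    void *p = malloc(n);
    if (p != NULL)
        live_blocks++;
    return p;
}

static void toy_free(void *p)
{
    if (p != NULL) {
        free(p);
        live_blocks--;
    }
}

/* Style 1: state in a file-level global that finalize does not know about. */
static void *scattered_cache = NULL;

/* Style 2: state owned by one struct, so finalize can walk every field. */
typedef struct {
    void *interned;
    void *type_cache;
} toy_runtime;

static void toy_initialize(toy_runtime *rt)
{
    scattered_cache = toy_alloc(64);     /* old pointer is overwritten */
    rt->interned = toy_alloc(64);
    rt->type_cache = toy_alloc(64);
}

static void toy_finalize(toy_runtime *rt)
{
    toy_free(rt->interned);
    toy_free(rt->type_cache);
    rt->interned = rt->type_cache = NULL;
    /* scattered_cache is not freed here, which models the leak. */
}

/* Returns how many blocks are still allocated after `restarts` cycles. */
int toy_leaked_blocks(int restarts)
{
    toy_runtime rt = {NULL, NULL};
    int i;

    assert(restarts >= 0);
    live_blocks = 0;
    for (i = 0; i < restarts; i++) {
        toy_initialize(&rt);
        toy_finalize(&rt);
    }
    return live_blocks;
}
```

In this model each cycle leaves exactly one block behind, the one held by the scattered global, while everything reachable from the struct is released.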


For reference, here’s a rough timeline to give you a sense of the story (and keep in mind that almost all CPython core contributors are working entirely on their own time):

  • 2007 - GH-44470: one of several reports about Py_Finalize() leaking
  • 2007 (3.0) - PEP 3121: extension module finalization
  • 2010 (3.2) - PEP 384: stable ABI and heap types
  • 2015 (3.5) - PEP 489: multi-phase init for extension modules
  • 2017 (3.7) - beginning of globals consolidation into _PyRuntimeState (to help with PEP 684)
  • 2018 (3.8) - internal-only implementation of PEP 432
  • 2019 (3.8) - PEP 587 runtime config
  • ~2019 (3.9) - start of effort to port stdlib extension modules to heap types and multi-phase init
  • 2020 (3.10) - more extensions ported, more globals consolidated, more runtime state finalized
  • 2021 (3.11) - more extensions ported, more globals consolidated, more runtime state finalized
  • Jan 2022 (3.11) - “Python no longer leaks memory at exit” (with caveats)
  • 2022 (3.12) - continued work on interpreter isolation and interpreter/runtime finalization

(Also see PEP 573, PEP 630, and others.)

Actually, the first heap types were introduced in Python 3.2, IIRC. A couple more were introduced before 3.8 or 3.9 as well. But, yeah, around 2019, things started moving faster.

Thanks for the response, @eric.snow. It’s good to hear about the progress on this issue.

I will test it in Python 3.10 and 3.11 (once available). :crossed_fingers: for 3.12 :slight_smile:

You can already test with 3.11; the first 3.11 beta is already out :slight_smile: Trying out the beta and release candidates is a great way to help Python improve!

Python 3.12 also leaks memory when it is finalized. Among these leaks, the biggest one (about 4 MB of pages) comes from not releasing the arena pool. See
AlexSoft73/cpython

For testing I modified 2 files:

thanks,
Alexei