Good way to find memory used by a Python program?

  1. My config: Python 3.9, PyCharm CE 2023.3.2, on Windows 10 Pro. I’m using Qt Designer and pyuic6 6.4.2. I’m new to Python and have not yet completed a 70-hour tutorial on Python. (I’m using Python 3.9 for the long tutorial I’m doing. I will upgrade Python later.)
  2. Online/Production config: My eventual config will use Microsoft Azure to make a web app. The virtual machine that will run this will likely have limited memory of 1GB.

My dev machine is a Windows PC with 32GB of RAM. The problem is that one of my tasks includes reading a 12GB CSV file (edit: it is actually 12MB), and I’m concerned I will run out of memory on the Azure production machine with 1GB of memory. I don’t know whether it falls back to paged memory when RAM is used up.

  • How can I find out how much memory my running Python program uses after reading the CSV file?
  • Is Windows Task Manager good enough for this estimate of memory used? Python runs in a cmd.exe box, and I can see the amount of memory used by this cmd.exe box in Windows Task Manager.

Thank you.

Your program doesn’t really use the memory; the Python process (which implements the Python virtual machine, i.e. a bytecode interpreter) does.

It’s not an estimate; it’s authoritative - depending on your definition of “used”. I’m not sure anything more precise is possible. Keep in mind that the operating system will not necessarily reclaim memory just because a program deallocates it. For all the operating system knows, the same process might want to allocate new memory a few microseconds from now. It can’t predict the future, so why should it reclaim the memory if other processes aren’t asking for any?

You can determine the memory usage of individual objects in your code using sys.getsizeof. However, this only accounts for the object itself - not for any attributes accessed with . nor items/elements accessed with []. And because of how dynamic Python is, it really couldn’t do those things - those objects could be shared and there’d be no good way to decide how/where to count them, and the attribute/item/element access process can be overridden anyway.
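To make the difference concrete, here is a minimal sketch comparing sys.getsizeof on a nested list with a rough recursive total. The helper deep_getsizeof is hypothetical (it is not in the standard library), it only walks built-in containers, and it counts each shared object once, which is just one of several defensible choices.

```python
import sys

def deep_getsizeof(obj, seen=None):
    """Rough recursive size in bytes. Hypothetical helper, not a stdlib function."""
    if seen is None:
        seen = set()
    if id(obj) in seen:          # count a shared object only once
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)    # the object itself, without its contents
    if isinstance(obj, dict):
        size += sum(deep_getsizeof(k, seen) + deep_getsizeof(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_getsizeof(item, seen) for item in obj)
    return size

# 100 inner lists, each holding one distinct string of 100 to 200 characters.
rows = [[str(i) * 100] for i in range(100)]

shallow = sys.getsizeof(rows)    # only the outer list and its 100 references
deep = deep_getsizeof(rows)      # outer list + inner lists + strings
print(f"shallow: {shallow} bytes, deep: {deep} bytes")
```

The exact numbers vary by Python version and platform, but the shallow figure stays small no matter how large the strings are.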

You should also see python.exe listed separately in the Task Manager, as long as Python is running. The memory used by cmd.exe really is the memory used by cmd.exe - i.e. by the actual program implementing the terminal window, not by anything that you launch from the command line.

If this is the actual thing you need to know, the best way to find out is to try it.
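One way to try it from inside Python is the standard library's tracemalloc module. The sketch below is only an illustration: the sample data is made up, measure_csv_load is a hypothetical helper name, and it uses the csv module rather than pandas. tracemalloc reports memory allocated through Python's allocator, so the figure can differ from what Task Manager shows for python.exe and may miss memory that extension modules allocate on their own.

```python
import csv
import io
import tracemalloc

def measure_csv_load(text):
    """Load CSV text into a list of rows; return (rows, current_bytes, peak_bytes)."""
    tracemalloc.start()
    rows = list(csv.reader(io.StringIO(text)))
    # Read the counters while `rows` is still alive.
    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return rows, current, peak

# Made-up stand-in data: 10,000 lines with three numeric columns.
sample = "".join(f"{i},{i * 2},{i * 3}\n" for i in range(10_000))

rows, current, peak = measure_csv_load(sample)
print(f"{len(rows)} rows, current={current / 1e6:.1f} MB, peak={peak / 1e6:.1f} MB")
```

To measure a real file, the same start / get_traced_memory / stop pattern can be wrapped around whatever call actually loads it.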

I tried it. My CSV file has 102,700 lines and Python read it in less than a second. I didn’t even notice a delay when it did pandas.read_csv(…).

Memory use seemed to go from 50% to 51% and the cmd window disappeared while my program was running in the debugger.

It’s very fast!

Thanks for all your help!

EDIT: Sorry it’s a 12MB CSV file.

Are you sure? That would imply an average line length of over 125,000 characters (well, bytes; but I’m assuming UTF-8 with mostly English text). Maybe it’s actually a 12MB CSV file, and there was never a significant problem (for reasonably modern computers) in the first place? :)

Sorry it was a 12MB CSV file. I have such a headache today.