Large RAM spike when caching large wheels

While debugging some long-standing build issues on Read the Docs with projects that depend on TensorFlow, PyTorch and other packages that ship “fat” wheels, I found it suspicious that pip install would get “killed due to excessive memory consumption” right after downloading the wheel, given that the build environment usually has 3 GB of RAM.

As an experiment, I ran pip install torch in a python:3.8 Docker container (pip 20.0.2) and measured the peak memory usage of the process. It turns out that torch-1.7.1-cp38-cp38-manylinux1_x86_64.whl weighs 776.8 MB, and that memory usage peaked right after the download finished (before the actual installation started), at 2323880 kB, ~2.32 GB:

root@a1cca811e35f:/# grep VmPeak /proc/16/status
VmPeak:  2323880 kB

In another experiment, I added the --no-cache-dir option and the peak memory was only 87616 kB, ~87.6 MB.
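For anyone who wants to repeat the two measurements above, here is a minimal sketch of pulling the VmPeak figure out of /proc/<pid>/status from Python. The helper names are made up for illustration, and the sample text only mimics the layout of the status file. Note that VmPeak reports peak virtual memory size; the VmHWM field in the same file reports peak resident memory, which may be the more relevant number when a build is killed for memory use.

```python
import re


def parse_vm_peak_kb(status_text):
    """Return the VmPeak value in kB from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmPeak:\s+(\d+)\s+kB$", status_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


def read_vm_peak_kb(pid):
    """Read VmPeak for a live process (Linux only, hypothetical helper)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_vm_peak_kb(f.read())


# Sample text mimicking the status file, using the value reported above.
sample = "Name:\tpip\nVmPeak:\t 2323880 kB\nVmSize:\t  100000 kB\n"
peak_kb = parse_vm_peak_kb(sample)
print(f"{peak_kb} kB = {peak_kb / 1e6:.2f} GB")
```

On a real system, read_vm_peak_kb(16) would correspond to the grep shown earlier, assuming PID 16 is still the pip process.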

Finally, I ran pip install torch again and interrupted the process when the memory usage spiked, to try to get a traceback. This is what I got:

  https://files.pythonhosted.org:443 "GET /packages/1d/a9/f349273a0327fdf20a73188c9c3aa7dbce68f86fad422eadd366fd2ed7a0/torch-1.7.1-cp38-cp38-manylinux1_x86_64.whl HTTP/1.1" 200 776818711
  Downloading torch-1.7.1-cp38-cp38-manylinux1_x86_64.whl (776.8 MB)
     |████████████████████████████████| 776.8 MB 61.7 MB/s eta 0:00:01  Ignoring unknown cache-control directive: immutable
  Updating cache with response from "https://files.pythonhosted.org/packages/1d/a9/f349273a0327fdf20a73188c9c3aa7dbce68f86fad422eadd366fd2ed7a0/torch-1.7.1-cp38-cp38-manylinux1_x86_64.whl"
  Caching due to etag
^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C^C
Cleaning up...                                                                                                         
Removed build tracker: '/tmp/pip-req-tracker-28aao4eh'                                                                                                                                                                                        
ERROR: Operation cancelled by user                                                                                                                                                                                                            
Exception information:                                                                                                                                                                                                                        
Traceback (most recent call last):                                                                                                                                                                                                            
  File "/usr/local/lib/python3.8/site-packages/pip/_internal/cli/base_command.py", line 186, in _main
    status = self.run(options, args)                                                                                                                                                                                                          
  File "/usr/local/lib/python3.8/site-packages/pip/_internal/commands/install.py", line 331, in run                                                                                                                                           
    resolver.resolve(requirement_set)                                                                                  
  File "/usr/local/lib/python3.8/site-packages/pip/_internal/legacy_resolve.py", line 177, in resolve                                                                                                                                         
    discovered_reqs.extend(self._resolve_one(requirement_set, req))  
  File "/usr/local/lib/python3.8/site-packages/pip/_internal/legacy_resolve.py", line 333, in _resolve_one                                                                                                                                    
    abstract_dist = self._get_abstract_dist_for(req_to_install)                                                                                                                                                                               
  File "/usr/local/lib/python3.8/site-packages/pip/_internal/legacy_resolve.py", line 282, in _get_abstract_dist_for                                                                                                                          
    abstract_dist = self.preparer.prepare_linked_requirement(req)                                                     
  File "/usr/local/lib/python3.8/site-packages/pip/_internal/operations/prepare.py", line 480, in prepare_linked_requirement
    local_path = unpack_url(                                                                                           
  File "/usr/local/lib/python3.8/site-packages/pip/_internal/operations/prepare.py", line 282, in unpack_url
    return unpack_http_url(
  File "/usr/local/lib/python3.8/site-packages/pip/_internal/operations/prepare.py", line 158, in unpack_http_url
    from_path, content_type = _download_http_url(
  File "/usr/local/lib/python3.8/site-packages/pip/_internal/operations/prepare.py", line 303, in _download_http_url
    for chunk in download.chunks:
  File "/usr/local/lib/python3.8/site-packages/pip/_internal/utils/ui.py", line 160, in iter
    for x in it:
  File "/usr/local/lib/python3.8/site-packages/pip/_internal/network/utils.py", line 15, in response_chunks
    for chunk in response.raw.stream(
  File "/usr/local/lib/python3.8/site-packages/pip/_vendor/urllib3/response.py", line 564, in stream
    data = self.read(amt=amt, decode_content=decode_content)
  File "/usr/local/lib/python3.8/site-packages/pip/_vendor/urllib3/response.py", line 507, in read
    data = self._fp.read(amt) if not fp_closed else b""
  File "/usr/local/lib/python3.8/site-packages/pip/_vendor/cachecontrol/filewrapper.py", line 65, in read
    self._close()
  File "/usr/local/lib/python3.8/site-packages/pip/_vendor/cachecontrol/filewrapper.py", line 52, in _close
    self.__callback(self.__buf.getvalue())
  File "/usr/local/lib/python3.8/site-packages/pip/_vendor/cachecontrol/controller.py", line 308, in cache_response
    self.cache.set(
  File "/usr/local/lib/python3.8/site-packages/pip/_internal/network/cache.py", line 73, in set
    f.write(value)
  File "/usr/local/lib/python3.8/contextlib.py", line 120, in __exit__
    next(self.gen)
  File "/usr/local/lib/python3.8/site-packages/pip/_internal/utils/filesystem.py", line 103, in adjacent_tmp_file
    os.fsync(result.file.fileno())
  File "/usr/local/lib/python3.8/site-packages/pip/_internal/utils/ui.py", line 119, in handle_sigint
    self.original_handler(signum, frame)
KeyboardInterrupt
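
The frames near the bottom of the traceback (self.__callback(self.__buf.getvalue()) followed by f.write(value)) suggest that the whole response body is held in memory before it is written to the cache. The sketch below is not pip's or CacheControl's actual code; it is a simplified illustration, with hypothetical function names and a small fake download, of how buffering a body and then writing it compares with streaming it to disk chunk by chunk.

```python
import io
import os
import tempfile
import tracemalloc

CHUNK = 64 * 1024          # size of each streamed piece
TOTAL = 8 * 1024 * 1024    # 8 MiB stand-in for a large wheel


def fake_download():
    """Yield TOTAL bytes in CHUNK-sized pieces, like a streamed HTTP body."""
    for _ in range(TOTAL // CHUNK):
        yield bytes(CHUNK)


def cache_by_buffering(chunks, path):
    """Accumulate the whole body in memory, then write it in one call."""
    buf = io.BytesIO()
    for chunk in chunks:
        buf.write(chunk)
    value = buf.getvalue()  # the full body now lives in memory
    with open(path, "wb") as f:
        f.write(value)


def cache_by_streaming(chunks, path):
    """Write each chunk as it arrives, so only one chunk is held at a time."""
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)


def peak_bytes(cache_func):
    """Return the peak Python-level allocation while cache_func runs."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "body.bin")
        tracemalloc.start()
        cache_func(fake_download(), path)
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        assert os.path.getsize(path) == TOTAL  # both strategies write the same file
    return peak


buffered_peak = peak_bytes(cache_by_buffering)
streamed_peak = peak_bytes(cache_by_streaming)
print(f"buffered peak: {buffered_peak / 2**20:.1f} MiB")
print(f"streamed peak: {streamed_peak / 2**20:.1f} MiB")
```

With buffering, the peak grows with the size of the download; with streaming, it stays close to the chunk size. That would be consistent with the difference seen between the default run and the --no-cache-dir run, although the exact multiple observed in pip depends on details this sketch does not model.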

Is this something worth reporting to pip?

Yes please! Please do file an issue for this at github.com/pypa/pip/issues. I don’t think there’s anything like this already, but it wouldn’t hurt to search for duplicates before filing. 🙂

Thanks! ^.^


Done: “Large RAM spike when caching large wheels”, Issue #9678 in pypa/pip on GitHub. Thanks @pradyunsg!