Conflicting binary extensions in different packages

Hello,

I have a project that uses binary extensions. I use auditwheel and delocate to prepare my wheels on Linux and macOS, bundling the libraries that my binary extension relies on. One of them is libomp. As far as I understand, this is how wheels with binary extensions are supposed to be packaged.

I recently started using torch as one of my dependencies on macOS x86. Some torch versions bundle libomp as well, and others bundle libiomp5. Neither case works, because there is a conflict when the OpenMP runtime is loaded twice (which happens if I import both packages).

I have a bunch of questions regarding this issue:

  • Isn’t it a best practice for Python wheels to be self-contained (containing all the libraries they depend on)?
  • If libomp were a system library (I mean located in /lib or something) and Python wheels weren’t self-contained, would it be loaded twice as well?
  • What happens if two libraries export the same symbol? Would it conflict?

I guess there might be some nasty fixes to make both binary extensions depend on a single library, but is there a better way to handle this, without one Python package having to be aware of other installed packages?

Any document or related topics would help me understand how I should go about this.

Thanks

The “better way” is to make each package be aware of the dependency at compile time, when hopefully they are source-compatible enough to be able to share the same one.

The problem you’re running into is the fundamental issue with distributing binaries (on Windows it would be called “DLL Hell”). There are a variety of hacks and tricks you can use to make things work regardless, but the only correct way is to compile everything in a consistent environment so they share dependencies rather than trying to bring their own copies.

It depends. If it were a system library, and both packages depended on it being available as a system library, then they would both be in the “consistent environment” I mention above. So your instincts here are correct (assuming I’m reading your instinct correctly), and this would be a better way forward.

Unfortunately, due to the variety of (particularly Linux) systems out there, virtually nobody can really rely on system libraries. Historically, packages have simply failed until you install the correct system library in the correct (for that package) location, or recompile with your own location. Now we’re in a slightly different place where packages bring their own copy and use it, which is fine until you reach a situation like yours.

This page is a pretty good writeup of the background and has more references. No solutions yet, unfortunately, other than building all the packages from source in a consistent environment (or using a package repository that has done this, like conda-forge or Anaconda).

https://pypackaging-native.github.io/key-issues/abi/

Thanks for the detailed answer, Steve! This clears up all my questions.

I thought that delocate was supposed to fix this so that it worked. I’ve just checked in a wheel that I built with delocate (cibuildwheel) and I see that the .dylibs folder contains bundled copies of the .dylib files but I was surprised to see that they still have their original names:

  inflating: flint/.dylibs/libflint-17.dylib  
  inflating: flint/.dylibs/libgmp.10.dylib  
  inflating: flint/.dylibs/libmpfr.6.dylib  
  inflating: flint/.dylibs/libarb-2.14.0.dylib 

In the manylinux and Windows wheels I made, these names are all mangled:

  inflating: python_flint.libs/libgmp-ee046dcd.so.10.4.1  
  inflating: python_flint.libs/libarb-82fe75b3.so.2.14.0  
  inflating: python_flint.libs/libmpfr-90ec1309.so.6.1.0  
  inflating: python_flint.libs/libflint-916b483e.so.17.0.0 

Likewise my Windows wheel was repaired by repairwheel and has:

  inflating: python_flint.libs/libarb-2-3e2ccca6121dcab32a5334c50f2ffdcd.dll  
  inflating: python_flint.libs/libflint-17-5a92d97e8523adb508a027ff23c24d68.dll  
  inflating: python_flint.libs/libgcc_s_seh-1-86cbedd0a1a581170a625264eaeff8cc.dll  
  inflating: python_flint.libs/libgmp-10-de56fc2826f65d04df9076813db46939.dll  
  inflating: python_flint.libs/libmpfr-6-580b96fe433a883757467b1aa5c37b38.dll  
  inflating: python_flint.libs/libwinpthread-1.dll

(Not sure why libwinpthread isn’t mangled. Need to fix that.)
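For anyone who wants to make the same comparison without unzipping by hand, here is a minimal sketch that lists the shared libraries bundled in a wheel. It assumes the repair tools put their copies in a "<pkg>.libs/" or ".dylibs/" folder, as in the listings above; the function name is made up for illustration.

```python
import zipfile
from pathlib import PurePosixPath

# File-name fragments that mark a shared library on Linux, macOS and Windows.
SHARED_LIB_MARKERS = (".so", ".dylib", ".dll")


def bundled_libraries(wheel_path):
    """Return archive paths of shared libraries found in vendored-library folders."""
    found = []
    with zipfile.ZipFile(wheel_path) as wheel:
        for name in wheel.namelist():
            parts = PurePosixPath(name).parts
            if not parts:
                continue
            # Assumption: bundled copies live in "<pkg>.libs/" or ".dylibs/".
            in_vendor_dir = any(
                p.endswith(".libs") or p == ".dylibs" for p in parts[:-1]
            )
            if in_vendor_dir and any(m in parts[-1] for m in SHARED_LIB_MARKERS):
                found.append(name)
    return sorted(found)
```

Calling bundled_libraries() on each platform's wheel gives the names side by side, which makes it easy to see which ones were mangled.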

My understanding is that this mangling of .dll/.so names ensures that there will not be a conflict between different wheels bundling the same shared libraries. Does that not work on OSX?

It’s not a naming issue here. Both libraries load just fine. Each package includes all the libraries it needs and uses its own copies, not the ones from other packages.

The issue is that each of the two packages bundles an OpenMP library, and this library is built in a way that it can’t be loaded into memory twice. So if you import both packages, which loads the two OpenMP libraries from the two different packages, you get:

OMP: Error #15: Initializing libomp.dylib, but found libiomp5.dylib already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://openmp.llvm.org/
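To see which installed packages bring their own OpenMP runtime, something like the following rough sketch can help. The file-name pattern (libomp, libiomp5, libgomp) and the function name are my assumptions, not an exhaustive or official list.

```python
import re
from pathlib import Path

# Assumed file-name prefixes of common OpenMP runtimes:
# libomp (LLVM), libiomp5 (Intel), libgomp (GNU).
OPENMP_NAME = re.compile(r"^lib(omp|iomp5|gomp)\b", re.IGNORECASE)


def find_openmp_runtimes(site_packages):
    """Map each top-level entry under site_packages to the OpenMP files it bundles."""
    root = Path(site_packages)
    result = {}
    for path in root.rglob("*"):
        if path.is_file() and OPENMP_NAME.match(path.name):
            top_level = path.relative_to(root).parts[0]
            result.setdefault(top_level, []).append(path.name)
    return result
```

For example, find_openmp_runtimes(sysconfig.get_paths()["platlib"]) scans the current environment. More than one key in the result means that importing those packages together can load more than one OpenMP runtime.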

The quick and dirty fix I found was to create a symlink from one lib to the other, so that they are treated as a single library and it is not loaded twice.

An imaginary solution for this kind of problem would be the ability to isolate loaded libraries per binary extension: my binary extension would load and use only its linked OpenMP library, while torch would use its own without interfering. Perhaps a technology of the future :slight_smile: ?

The name mangling is necessary on Windows and Linux, so the tools do it there. It’s not necessary on macOS, so the tools don’t do it there. The three platforms each have their own totally independent way of doing shared libraries – they accomplish similar things in the end, but the details are all different.
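As an illustration only, the mangled names shown earlier in the thread look like a short content hash inserted before the first extension. The sketch below reproduces that naming pattern; it is not necessarily the exact algorithm that auditwheel or repairwheel uses, and the function name and the choice of SHA-256 are assumptions.

```python
import hashlib
from pathlib import Path


def mangled_name(library_path, digest_chars=8):
    """Return the file name with a short content hash inserted before the first extension."""
    path = Path(library_path)
    # Assumption: hash the file contents, so different builds get different names.
    digest = hashlib.sha256(path.read_bytes()).hexdigest()[:digest_chars]
    base, _, ext = path.name.partition(".")
    return f"{base}-{digest}.{ext}" if ext else f"{base}-{digest}"
```

Because the hash depends on the file contents, two wheels that bundle different builds of libgmp end up with different file names, so one package's copy is not mistaken for the other's.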

Ah, this is a slightly different kind of problem than what I imagined. You would have to convince one of the two packages to change to use the other OpenMP runtime, which may not be source compatible.

Best way to do this (today and likely forever) is with separate processes. Depending on how the libraries are being used, multiprocessing is likely your best bet, but subprocess may be sufficient if it looks more like sequential tasks (i.e. you run a big job with Torch and then serialise the result to return back for later use).
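A minimal sketch of the subprocess variant, assuming the work can be expressed as "send some JSON in, get some JSON back". The worker here just sums numbers as a stand-in; a real one would import Torch (or the other package) so that its OpenMP runtime is only ever loaded in the child process. The names run_isolated and WORKER are hypothetical.

```python
import json
import subprocess
import sys


def run_isolated(source, payload):
    """Run `source` in a fresh interpreter, passing `payload` in and reading a result back as JSON."""
    completed = subprocess.run(
        [sys.executable, "-c", source],
        input=json.dumps(payload),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the child fails
    )
    return json.loads(completed.stdout)


# Hypothetical worker: stands in for code that imports the conflicting package.
WORKER = """
import json, sys
values = json.load(sys.stdin)
json.dump({"total": sum(values)}, sys.stdout)
"""
```

Usage would look like result = run_isolated(WORKER, [1, 2, 3]). The parent process never imports the conflicting package, so only one OpenMP runtime is loaded per process. For larger results, serialising to a file instead of stdout is probably a better fit.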

I figured as much after reading the page you referenced, where I found “BLAS, LAPACK and OpenMP - pypackaging-native”.

Indeed, but I said “imaginary” because I meant that this would be handled automatically, so the user wouldn’t have to write the multiprocessing logic :slight_smile: More seriously, I like the “Potential solutions or mitigations” part, and I think those are more likely to land than what I just described :slight_smile: