Wheel depending on shared library from another wheel

jeanas · May 3, 2023, 4:52pm

This is a continuation of the thread “Packaging a C extension with a dependency on another C extension”.

As explained in that thread, I’m trying to produce wheels for python-poppler-qt5, which is in this situation:

                          tons of stuff
                         (libfreetype, libjpeg,
                        libpcre2, libtiff, etc.)
                                  ^
                                  |  depends on
                                  |
                  depends on      |
Qt5 (C++) <------------------ poppler-qt5 (C++)
 ^                                ^
 |                                |
 | is Python                      | is Python
 | binding for                    | binding for
 |                                |
PyQt5 <-----------------------python-poppler-qt5
(built with sip)     reuses         (built with sip)

After versioning, I’m trying to understand how dynamic library loading is supposed to work.

My ultimate goal is to produce a wheel that the library user can just install. Potentially, the PyQt5 package might be installed in a totally different place than python-poppler-qt5. For example: the user could do pip install PyQt5 outside of a virtual environment, then create a virtual environment with --system-site-packages and install python-poppler-qt5 there. The path to the Qt5 dynamic libraries (shipped in PyQt5 wheels) therefore needs to be resolved at runtime, by the Python interpreter with its import system.

When it wants to load an extension module, CPython presumably has no trouble loading it from wherever it wants, since this is done dynamically with dlopen. But how is that supposed to work with dependencies that are loaded by the OS’s dynamic linker (edit: as dependencies of libpoppler-qt5)? My understanding is that it will look at LD_LIBRARY_PATH, the RPATH, and so on (depending on the OS), but how would it know where Python looks for modules?
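To make the RPATH part of this question concrete: on Linux, an RPATH entry may start with `$ORIGIN`, which the dynamic linker expands to the directory containing the library being loaded. The sketch below only computes what such an entry would look like for two layouts. The function name `origin_rpath`, the directory paths, and the `PyQt5/Qt5/lib` location are assumptions made for illustration, not something taken from the actual wheels.

```python
import posixpath


def origin_rpath(extension_dir: str, library_dir: str) -> str:
    """Return an $ORIGIN-relative RPATH entry leading from extension_dir to library_dir."""
    relative = posixpath.relpath(library_dir, start=extension_dir)
    return "$ORIGIN" if relative == "." else "$ORIGIN/" + relative


# Hypothetical layout 1: both packages live in the same site-packages directory.
site = "/venv/lib/python3.11/site-packages"
same_env = origin_rpath(site, site + "/PyQt5/Qt5/lib")
print(same_env)  # $ORIGIN/PyQt5/Qt5/lib

# Hypothetical layout 2: PyQt5 installed system-wide, the extension in a venv.
split_env = origin_rpath(
    "/home/user/venv/lib/python3.11/site-packages",
    "/usr/lib/python3/dist-packages/PyQt5/Qt5/lib",
)
print(split_env)  # climbs out of the venv with a chain of ".." components
```

The first entry stays valid wherever the environment is moved. The second is tied to one machine's layout, and an RPATH baked into a wheel at build time cannot know it, which is the difficulty described above.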

Keep in mind that I’m merely someone trying to make installation less painful for an app and not an expert on shared libraries at all, so this could be a dumb question. I appreciate pointers to useful resources.

(I could also ask on the PyQt mailing list; however, I’m asking here because I believe this package is not alone in this sort of situation. A similar case that I can think of is packages using the NumPy C API.)

Thanks!

steve.dower · May 3, 2023, 5:06pm

Python looks in sys.path for extension modules, and then their dependencies are loaded according to the system’s rules (which as you note, vary).

Generally, I believe most systems will look adjacent to the module, which means if dependencies are in the same directory as the extension module, they’ll be found.

If you set up something more complex, generally you’d need a .py to be imported first that can set up additional environment variables or settings before importing the extension module. For example almost the entire contents of zmq/__init__.py is doing this setup.
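As a rough illustration of that kind of setup module (this is not the zmq code), the sketch below locates a directory inside an installed package by searching sys.path without importing the package, and on Windows registers it as a DLL directory. The helper names and the `Qt5/bin` and `Qt5/lib` sub-directories are assumptions; the real layout of the PyQt5 wheels should be checked before relying on them.

```python
import importlib.util
import os
import sys
from pathlib import Path
from typing import Optional


def find_package_subdir(package: str, *parts: str) -> Optional[Path]:
    """Find a directory inside an installed top-level package, wherever sys.path has it."""
    try:
        spec = importlib.util.find_spec(package)  # does not import the package
    except (ImportError, ValueError):
        return None
    if spec is None or not spec.submodule_search_locations:
        return None
    for root in spec.submodule_search_locations:
        candidate = Path(root, *parts)
        if candidate.is_dir():
            return candidate
    return None


def prepare_qt_libraries() -> Optional[Path]:
    """Hypothetical setup step meant to run before the extension module is imported."""
    # Assumption: the Qt libraries sit under PyQt5/Qt5/bin (Windows) or PyQt5/Qt5/lib.
    subdir = ("Qt5", "bin") if sys.platform == "win32" else ("Qt5", "lib")
    qt_dir = find_package_subdir("PyQt5", *subdir)
    if qt_dir is not None and sys.platform == "win32":
        os.add_dll_directory(str(qt_dir))  # Windows only, Python 3.8+
    # Other platforms have no direct equivalent, so the directory is only returned.
    return qt_dir


prepare_qt_libraries()
# The import of the real extension module would follow here.
```

On Linux and macOS, knowing the directory is only half of the job, since something still has to make the dynamic linker use it.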

Alternatively, if you only care about the app, you should be able to move the dependencies to the main executable’s directory and they’ll be found there. This means manually fixing up the file layout, but if you’re trying to ship an app ready to go (rather than expecting package management tools to generate the layout for you), then this is pretty easy to do.

jeanas · May 3, 2023, 6:17pm

Ok… sort of a YOLO approach, but I guess issues should be rare…

(In the meantime, I looked at wheels of the qscintilla package. They appear to be doing just this.)

Interesting. I read “Works around mysterious issue where os.add_dll_directory does not resolve imports (conda-forge Python >= 3.8)”. It’s obviously a system-dependent approach, but it might work. (What would be really great is Python providing a platform-independent abstraction for this sort of thing, sort of like the existing Windows-specific os.add_dll_directory.)
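A minimal sketch of what such an abstraction might look like, under stated assumptions: on Windows it defers to os.add_dll_directory, and elsewhere it preloads the named libraries by absolute path with ctypes, on the expectation that the dynamic linker reuses an already-loaded library when a later dependency asks for it. As far as I understand, that holds on Linux with glibc; macOS may behave differently. The function name `make_libraries_findable` and its parameters are hypothetical, and nothing like it exists in the standard library.

```python
import ctypes
import os
import sys
from pathlib import Path
from typing import Iterable, List


def make_libraries_findable(directory: str, names: Iterable[str] = ()) -> List[object]:
    """Hypothetical helper: make shared libraries in `directory` resolvable later on.

    `names` lists library file names in dependency order (dependencies first).
    """
    path = Path(directory).resolve()
    if sys.platform == "win32":
        # The returned cookie has a close() method that undoes the registration.
        return [os.add_dll_directory(str(path))]
    handles: List[object] = []
    for name in names:
        candidate = path / name
        if candidate.is_file():
            # RTLD_GLOBAL exposes the symbols to libraries loaded afterwards.
            handles.append(ctypes.CDLL(str(candidate), mode=ctypes.RTLD_GLOBAL))
    return handles
```

A caller would pass the directory found at runtime together with file names such as `libQt5Core.so.5` (an assumed name), and would keep the returned handles alive for the lifetime of the process.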

Alternatively, in my understanding, Conda never scatters environments across the file system, so I guess that would be more reliable. Except that I’m keen on recommending pip install -e for contributors to install the app (both for developing and for their own use) because of its support (via the wheel format) for installing files at arbitrary locations under ~/.local on Linux, which we need in order to install the XDG desktop file, app icon and man page. Oh well.

This is what we’re doing right now. We’re getting stuck in packaging issues literally on every release, so I’d like to get rid of those fiddly manual manipulations.

Also, while I mostly care about the app (Frescobaldi), I was also hoping to help all these people along the way.

steve.dower · May 3, 2023, 7:37pm

Anaconda and conda-forge decided to disable support for os.add_dll_directory in their builds of 3.8, which is why it didn’t work. (You have to choose between searching PATH or searching the DLL search path, and they wanted to keep using PATH despite the widely-known security risks.) They’ve fixed that now, so it shouldn’t be a problem again.

Yeah, those problems are very difficult to solve. Pip [1] is not well equipped for solving this issue, so when a requirement is “must install using pip” then you’re going to have to use a lot of tricks.

An application installer is more typically going to be a platform package, such as an MSI or RPM, etc. that contains all the files. Those are much easier to make this work for users on supported platforms.

[1] And implicitly, all the packaging standards and conventions that it relies upon.

jeanas · May 3, 2023, 9:20pm

Thank you for confirming that the problem is hard.

Do you happen to know if extending the PATH variable is portable? (Security is not a concern here from my point of view.)

steve.dower · May 3, 2023, 9:45pm

It’s not. Manually fixing up the file layout is the only portable approach I’m aware of (and I’m very sure, though not certain, that it’s portable; there are no doubt edge cases I’m not aware of).

jeanas · May 3, 2023, 10:39pm

OK, thanks a lot for the help.

BrenBarn · May 3, 2023, 11:41pm

As someone who at least has written an app using PyQt and QScintilla (though maybe less complex than yours), if I were in your situation I would just stipulate conda as the install mechanism (or go a completely different route and create OS-level packages). Conda sidesteps all these issues and is set up to allow separate non-Python dependencies like this. Of course someone still has to package those dependencies, but once they do, other packages can depend on them using the normal dependency mechanism. As @steve.dower mentioned, the “pip stack” is simply not set up for this.

It’s always possible to create a separate install script as part of the package that users can run manually to populate ~/.local with icons, etc. That’s a bit of a hack, but may be a less fraught hack than the contortions that have to be navigated to get it to work with pip.
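Such a script could be fairly small. The sketch below copies a desktop file, an icon and a man page into the XDG data directory (~/.local/share unless XDG_DATA_HOME is set). The file names, the `data` source directory and the destination sub-directories are assumptions for illustration; the icon theme path in particular depends on the icon format and size.

```python
import os
import shutil
from pathlib import Path
from typing import List, Optional

# Hypothetical file names, mapped to destinations relative to the XDG data directory.
DATA_FILES = {
    "myapp.desktop": "applications",
    "myapp.svg": "icons/hicolor/scalable/apps",
    "myapp.1": "man/man1",
}


def xdg_data_home() -> Path:
    """Return $XDG_DATA_HOME, falling back to ~/.local/share."""
    return Path(os.environ.get("XDG_DATA_HOME") or Path.home() / ".local" / "share")


def install_data_files(source_dir: str, data_home: Optional[str] = None) -> List[Path]:
    """Copy the known data files from source_dir and return the installed paths."""
    root = Path(data_home) if data_home is not None else xdg_data_home()
    installed = []
    for name, subdir in DATA_FILES.items():
        source = Path(source_dir) / name
        if not source.is_file():
            continue  # skip files that this distribution does not ship
        target_dir = root / subdir
        target_dir.mkdir(parents=True, exist_ok=True)
        installed.append(Path(shutil.copy2(source, target_dir / name)))
    return installed


if __name__ == "__main__":
    # Assumes the script is run from a directory containing a "data" folder.
    for path in install_data_files(str(Path.cwd() / "data")):
        print(f"installed {path}")
```

Passing an explicit `data_home` makes it possible to try the script against a scratch directory before touching the real ~/.local.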

jeanas · May 4, 2023, 10:26am

I hear you.

I hesitate to switch to conda entirely because I am already familiar with pip, wheels, build backends, etc., while I am much less familiar with the conda toolchain, although I did learn about it a bit while trying to build these wheels (as the branch I’m working on uses conda in CI to get poppler-qt5 from conda-forge before building the wheels…).

Apart from the desktop file installation problem, are there things conda does less well than the PyPA stack? For example, are there tox-like test automation tools? Do tools like PyInstaller and py2app work well with packages installed from conda?

oscarbenjamin · May 4, 2023, 11:52am

Yes, that would be nice. While it is valid to consider alternatives such as conda or other packaging systems for now, I think this is still something that ideally should be improved. At the very least, a simple PyPI package could be made to help with this, so that projects like zmq can share a mostly working approach. The fact that different approaches are needed for pip vs conda, or CPython vs PyPy, suggests that at least part of a full solution requires some provision by the base install of Python and the packaging system being used to specify how finding shared libraries is supposed to work.