Identifying & parsing binary extension filenames

For a couple of projects I’m working on, I need to be able to identify binary extension files and extract the names of the modules they provide. importlib.machinery.EXTENSION_SUFFIXES only covers the current Python and the machine it’s running on, whereas I need to work with extension files built for any Python. There doesn’t seem to be a library for this already, so I’ve had to investigate on my own, and I’ve come to you to double-check my findings.
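To make the limitation concrete, here is a minimal sketch of the EXTENSION_SUFFIXES approach. The helper name is hypothetical; it only recognises files built for the interpreter that runs it, which is exactly the restriction described above.

```python
from importlib.machinery import EXTENSION_SUFFIXES


def module_name_for_current_python(filename):
    """Return the module name if filename is an extension for *this* Python.

    Hypothetical helper: it knows nothing about other Python versions,
    implementations, or platforms.
    """
    # Try the longest suffixes first so that a fully tagged suffix is
    # preferred over a bare ".so" or ".pyd".
    for suffix in sorted(EXTENSION_SUFFIXES, key=len, reverse=True):
        if filename.endswith(suffix) and len(filename) > len(suffix):
            return filename[: -len(suffix)]
    return None
```

The contents of EXTENSION_SUFFIXES differ between interpreters and platforms, so the same filename can be accepted on one machine and rejected on another.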

What I’ve determined so far:

  • Linux/manylinux binary extension modules have names of the form {module}.{implementation}-{abi}-{arch}-linux-gnu.so (Can the gnu part vary, or is that attached to linux?). Example: foo.cpython-38-x86_64-linux-gnu.so

  • macOS binary extension modules have names of the form {module}.{implementation}-{abi}-darwin.so

  • Windows binary extension modules have names of the form {module}.{impl_abbrev}{abi}-{win}.pyd where win is win_amd64 or win32 (or other values?). Example: foo.cp38-win_amd64.pyd

  • Certain Linux & macOS extension modules (ones built for Python 2?) have names simply of the form {module}.so.
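The four forms above can be transcribed into named-group patterns roughly as follows. This is only a sketch: the character classes, the PATTERNS table, and the classify function are my own assumptions rather than anything taken from a specification, and the Linux pattern hard-codes linux-gnu because that is the only variant observed so far.

```python
import re

# One pattern per form listed above. The character classes are guesses.
PATTERNS = {
    "linux": re.compile(
        r"(?P<module>[^.]+)\.(?P<implementation>[a-z]+)-(?P<abi>\w+)"
        r"-(?P<arch>\w+)-linux-gnu\.so\Z"
    ),
    "macos": re.compile(
        r"(?P<module>[^.]+)\.(?P<implementation>[a-z]+)-(?P<abi>\w+)"
        r"-darwin\.so\Z"
    ),
    "windows": re.compile(
        r"(?P<module>[^.]+)\.(?P<impl_abbrev>[a-z]+)(?P<abi>\d\w*)"
        r"-(?P<win>win_amd64|win32)\.pyd\Z"
    ),
    "bare": re.compile(r"(?P<module>[^.]+)\.so\Z"),
}


def classify(filename):
    """Return (form, fields) for the first matching pattern, else None."""
    for form, pattern in PATTERNS.items():
        match = pattern.match(filename)
        if match:
            return form, match.groupdict()
    return None
```

For example, classify("foo.cp38-win_amd64.pyd") yields the form "windows" with impl_abbrev "cp", abi "38", and win "win_amd64". Any filename whose platform part is not one of the values above falls through to None.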

Is there anything important I’ve missed? Would r'(?:\.[-A-Za-z0-9_]+\.(?:pyd|so)|\.so)\Z' be an appropriate regex for matching any & all binary extension module file extensions?
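A quick way to check the regex against the example names above is to wrap it in a small helper. The name split_extension is hypothetical; it uses the match position to separate the module name from the extension.

```python
import re

# The regex proposed above, unchanged.
EXT_RE = re.compile(r'(?:\.[-A-Za-z0-9_]+\.(?:pyd|so)|\.so)\Z')


def split_extension(filename):
    """Return (module_name, extension), or None if the regex does not match."""
    match = EXT_RE.search(filename)
    if match is None:
        return None
    return filename[: match.start()], match.group(0)


for name in (
    "foo.cpython-38-x86_64-linux-gnu.so",
    "foo.cp38-win_amd64.pyd",
    "foo.so",
    "foo.py",
):
    print(name, "->", split_extension(name))
```

The three extension names from the list split into "foo" plus the full extension, while foo.py is rejected.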

Further research has turned up module names of the form foo.abi3.so (for both macOS and Linux) and foo.pyd. Is there any place that all of this is documented?
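Of these two new forms, foo.abi3.so is already covered by the regex from the first post, but a bare foo.pyd is not, because the untagged alternative only allows .so. One possible adjustment, sketched below, makes the tag optional for both endings. The names ORIGINAL and ADJUSTED are mine, and the adjusted pattern is only checked against the filenames mentioned in this thread.

```python
import re

# Regex from the first post.
ORIGINAL = re.compile(r'(?:\.[-A-Za-z0-9_]+\.(?:pyd|so)|\.so)\Z')

# Assumed adjustment: an optional tag followed by either .pyd or .so.
ADJUSTED = re.compile(r'\.(?:[-A-Za-z0-9_]+\.)?(?:pyd|so)\Z')


def module_name(filename, pattern=ADJUSTED):
    """Return the part of filename before the extension, or None."""
    match = pattern.search(filename)
    return filename[: match.start()] if match else None


print(module_name("foo.pyd", ORIGINAL))   # the original regex rejects this
print(module_name("foo.pyd"))
print(module_name("foo.abi3.so"))
```

A side effect of either regex is that a name such as foo.bar.so is read as module foo with tag bar, so neither can tell a tag apart from a dotted name component.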

I think these are all implementation defined and we never actually had a spec.

@pitrou, did we have a spec last time we played with these? Or just a debate somewhere on the issue tracker?

There’s PEP 3149 for the POSIX case, though the convention outlined there doesn’t match reality: it lacks the {platform} part, e.g. x86_64-linux-gnu. The Windows case does not have a PEP or mailing-list discussion AFAIK.

The current scheme was implemented in https://bugs.python.org/issue22980 (changesets 03a144bb6ac3d7631a3bdb895e2a1f2d021fb08b, d3899c1a962f4f06f52199d1e5e4b921843e587b and 3b8124884c3655b4cf2629d741b18c1a38181805).