Embedded python module loading

I have a C++ application that runs Python scripts in an embedded Python interpreter. The script being executed uses third-party as well as built-in modules.

One third-party module uses the __file__ attribute of Python’s built-in abc module. Whenever it does, it fails with the following error: AttributeError: Module abc does not have file attribute.

The Python documentation at https://docs.python.org/3/library/importlib.html says:

__file__
The location the loader used to load the module. For example, for modules loaded from a .py file this is the filename. It is not set on all modules (e.g. built-in modules).

So it makes sense that the abc module has no __file__ attribute (since it is a built-in module). But then the third-party module could never work, since it relies on the __file__ attribute of the abc module.

I tried running the Python script from normal (non-embedded) Python, and there it runs fine. It is also able to find the __file__ attribute of the abc module. So I was wondering what is going wrong with my embedded Python, and why the abc module does have a __file__ attribute in ‘normal’ Python when the documentation says it should not have this attribute.

I’m on Ubuntu 22.04.3 and built Python 3.11.2 from source with these configure options:
--enable-optimizations --enable-shared --prefix="Absolute/Path/To/buildShared"

Here is part of my C++ code for initializing the Python interpreter:

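// Point the embedded interpreter at the custom build (relative paths, as in my setup).
Py_SetPythonHome(L"cpython-3.11.2/buildShared");
Py_SetProgramName(L"cpython-3.11.2/buildShared/bin/python3");

// Adjacent wide string literals are concatenated, so the search path can span several lines.
Py_SetPath(L"cpython-3.11.2/buildShared:cpython-3.11.2/buildShared/lib:"
           L"cpython-3.11.2/buildShared/lib/python3.11/lib-dynload:"
           L"cpython-3.11.2/buildShared/bin:"
           L"cpython-3.11.2/buildShared/lib/python3.11/site-packages:"
           L"cpython-3.11.2/buildShared/lib/python3.11");

PyConfig config;  // declared here so the snippet is self-contained
PyConfig_InitIsolatedConfig(&config);
config.isolated = 1;
config.use_environment = 0;
Py_InitializeFromConfig(&config);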

I installed all third-party modules into the site-packages folder. Embedded Python is able to locate them and sets their __file__ attribute correctly.

When I run non-embedded Python, the __file__ attribute of the abc module is:

cpython-3.11.2/buildShared/lib/python3.11/abc.py

And this is the Python test script I am using:

import abc

if hasattr(abc, '__file__'):
    print('abc module __file__ attribute: ', abc.__file__)
else:
    print('Module abc does not have __file__ attribute.')

Then the third party module needs to be fixed. What does it use abc.__file__ for? If fixing the 3rd party module isn’t an option, you could try manually setting abc.__file__ before importing that module.

I was thinking about this too. But it is still strange that when the Python script is imported in a non-embedded environment, the abc module does have a __file__ attribute and everything runs fine.

I feel like setting the __file__ attribute of abc manually does not fix the underlying problem and might cause problems later as I develop my program further.

Well, yes. abc.py is normally a file in the stdlib. I guess whatever embedded version you are using is hard-coding the stdlib, or at least some central parts required for it. The third-party library is just wrong for relying on it. That is why the central question you should be asking is: what is it using that path for?

There definitely is an abc.py in the standard library. The abc standard library module is “frozen” in the most recent versions of Python (although e.g. in 3.8 it’s a perfectly ordinary Python module), but it is not “built-in” (meaning: there is no corresponding Python code at all, not even a wrapper for the internal implementation details). sys is a good example of a built-in module. Built-in modules get special treatment; in particular, importing them bypasses sys.path using the default import scheme.
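To see which of these categories a module falls into on a given interpreter, one option is to look at the origin recorded on its import spec. The following is only a sketch: describe_module is a hypothetical helper name, and it assumes CPython's convention of using the strings "built-in" and "frozen" as the spec origin for those two kinds of module.

```python
import importlib
import sys


def describe_module(name):
    """Return (kind, file) for the named module.

    kind is "built-in", "frozen" or "file"; file is the module's
    __file__ value, or None when the attribute is not set.
    """
    module = importlib.import_module(name)
    spec = getattr(module, "__spec__", None)
    origin = getattr(spec, "origin", None)
    if origin == "built-in":
        kind = "built-in"
    elif origin == "frozen":
        kind = "frozen"
    else:
        # Anything else is treated as loaded from a path on disk.
        kind = "file"
    # __file__ is optional, so read it defensively.
    return kind, getattr(module, "__file__", None)


if __name__ == "__main__":
    for name in ("sys", "abc", "json"):
        print(name, describe_module(name))
```

Running the same script in the normal interpreter and in the embedded one should show whether abc is reported differently in the two environments.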

When I try it, abc certainly does have a __file__ attribute:

$ python3.8
Python 3.8.10 (default, Nov 22 2023, 10:22:35) 
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import abc
>>> abc
<module 'abc' from '/usr/lib/python3.8/abc.py'>
>>> abc.__file__
'/usr/lib/python3.8/abc.py'
>>> 
$ python3.11
Python 3.11.2 (main, Apr  5 2023, 03:08:14) [GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import abc
>>> abc
<module 'abc' (frozen)>
>>> abc.__file__
'/usr/local/lib/python3.11/abc.py'

The documentation doesn’t say any such thing. Standard library modules are not inherently “built-in”. Here, “built-in” means “completely implemented in C and compiled and linked into the interpreter itself”.

But your embedded Python might implement abc that way for technical reasons. I’ve only tried building Python as a normal application, without --enable-shared.

You’d need to understand the third-party library code, at least well enough to understand why it’s trying to do this. But even the workaround would be tricky. You’d need to make sure abc is always imported (and its __file__ hacked) before the third-party code.
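A rough sketch of that workaround is shown below. patch_missing_file is a hypothetical helper, not part of any library, and it assumes that sysconfig.get_path("stdlib") returns the directory that actually contains abc.py in your build; pass stdlib_dir explicitly if that is not the case. It only handles top-level, single-file modules, and it should not be pointed at genuinely built-in modules such as sys, for which no source file exists.

```python
import importlib
import os
import sysconfig


def patch_missing_file(names, stdlib_dir=None):
    """Import each named module and give it a __file__ if it lacks one.

    Returns the list of module names that were patched.
    """
    if stdlib_dir is None:
        # Assumption: sysconfig knows where the stdlib sources live.
        stdlib_dir = sysconfig.get_path("stdlib")
    patched = []
    for name in names:
        module = importlib.import_module(name)
        if getattr(module, "__file__", None) is None:
            # Guess the conventional location: <stdlib_dir>/<name>.py
            module.__file__ = os.path.join(stdlib_dir, name + ".py")
            patched.append(name)
    return patched


# Call this before the first import of the third-party package.
patch_missing_file(["abc"])
```

The call has to run before anything imports the third-party package, for example at the very top of the script that the embedding application executes.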

People writing these libraries generally assume the code will run on a “normal” platform, and don’t have the means to test on every platform out there. “abc will have a __file__ because I see abc.py in the GitHub repository for Python” is not an unreasonable assumption. Neither is “… because that’s the normal way modules work, and also I tested it on my platform”, frankly.

I tried to look into the third-party module to see why it needs the __file__ attribute, but the module is very large, and changing the code so that it no longer uses the __file__ attribute would take a lot of time.

I have now added the __file__ attribute manually. That produced more AttributeErrors for other standard modules that were missing a __file__ attribute. After manually adding those as well, the errors are gone!
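Rather than discovering the affected modules one AttributeError at a time, it may be easier to list them up front. The sketch below uses a hypothetical helper, modules_without_file, that reports every module already loaded without a __file__, together with the origin recorded on its import spec.

```python
import sys


def modules_without_file():
    """Map each loaded module that has no __file__ to its spec origin."""
    missing = {}
    # Copy the items first: sys.modules can change while we iterate.
    for name, module in list(sys.modules.items()):
        if module is None:
            continue
        if getattr(module, "__file__", None) is None:
            spec = getattr(module, "__spec__", None)
            missing[name] = getattr(spec, "origin", None)
    return missing


if __name__ == "__main__":
    for name, origin in sorted(modules_without_file().items()):
        print(f"{name}: {origin}")
```

Genuinely built-in modules such as sys will always appear in this list, so only the entries that the third-party code actually touches need attention.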

I’m glad it’s working now. Still, if any Python wizard is reading this and can explain why the __file__ attribute is not set by default in embedded Python (I assume it may be something in the Py_Initialize() function), I would be very interested.

The reason it doesn’t set the __file__ attribute is that those files don’t necessarily exist. The modules are baked into the embedded Python and are not actually accessible as source code.

As I said:

If they’re implemented as built-in, they won’t have a __file__. For example, using sys which is ordinarily built-in:

$ python
Python 3.8.10 (default, Nov 22 2023, 10:22:35) 
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys
<module 'sys' (built-in)>
>>> sys.__file__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'sys' has no attribute '__file__'

Otherwise, I don’t understand what you mean by “why”.

Thanks, I understand now that embedded Python sees these modules as built-in and therefore does not set the __file__ attribute.