ImportError when partially packaging sympy

Hi friends,

I’m working with pyspark and trying to solve symbolic equations using the sympy lib. To use sympy in a distributed spark env, I need to package the dependencies into a .zip file and submit them with my script.

I installed sympy with:

pip install --target=dependencies sympy 

and here is the structure of my dependencies folder:

├── mpmath/
├── mpmath-1.3.0.dist-info/
├── sympy/
├── sympy-1.13.3.dist-info/
├── __pycache__/
├── bin/
├── share/

Initially, I attempted to package only the sympy and mpmath directories using the following command:

zip -r ../ . sympy/* mpmath/*
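For comparison, here is a rough Python sketch of the same selective packaging. It is not the command I ran; the function name `zip_packages` and the archive name `deps.zip` are made up for illustration. The point is only that entries are written relative to the dependencies folder, so `sympy/` and `mpmath/` sit at the root of the archive together with their `__init__.py` files.

```python
import zipfile
from pathlib import Path


def zip_packages(src_dir, archive_path, packages):
    """Write only the named package directories into archive_path.

    Entries are stored relative to src_dir, so each package directory
    (and its __init__.py) ends up at the root of the archive.
    """
    src = Path(src_dir)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for pkg in packages:
            for path in sorted((src / pkg).rglob("*")):
                rel = path.relative_to(src)
                # Skip directories and compiled bytecode caches.
                if path.is_file() and "__pycache__" not in rel.parts:
                    zf.write(path, rel.as_posix())
    return archive_path
```

Assuming the layout above, this would be called as `zip_packages("dependencies", "deps.zip", ["sympy", "mpmath"])`.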

However, when running my PySpark job, I encountered the following error:

ImportError: cannot import name 'make_mpc' from 'mpmath' (unknown location)
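As far as I understand, the "(unknown location)" part usually means the `mpmath` module object that Python found has no `__file__`, which is typical when a directory without an `__init__.py` is picked up as a namespace package. A minimal sketch for checking where a module was resolved from is below; `describe_module` is a hypothetical helper name, and to be meaningful for Spark it would have to run inside a task on the executors rather than on the driver.

```python
import importlib


def describe_module(name):
    """Report where Python resolved `name` from.

    A regular package has a file ending in __init__.py; a namespace
    package (a directory without __init__.py) reports file=None.
    """
    module = importlib.import_module(name)
    spec = module.__spec__
    return {
        "file": getattr(module, "__file__", None),
        "origin": spec.origin if spec else None,
        "search_paths": list(getattr(module, "__path__", [])),
    }
```

If `describe_module("mpmath")["file"]` comes back as `None`, Python found something named `mpmath` on its path but not the package's own `__init__.py`.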

When I packaged the entire dependencies directory without excluding any files or folders, like this:

zip -r ../ .

The error disappeared, and everything worked fine.

I’m trying to understand why packaging only the sympy and mpmath directories caused the import error. The dependencies folder also contains other directories like mpmath-1.3.0.dist-info, sympy-1.13.3.dist-info, __pycache__, and share. Could the missing metadata or some other files in these folders be necessary for sympy or mpmath to work correctly?

Any insights into what might be causing the issue when only sympy and mpmath are packaged, and why including all files resolves the problem? I’m particularly interested in understanding if files in dist-info, share, or bin directories are necessary for the library to function properly.
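One way to narrow this down is to compare the contents of the two archives directly. The sketch below only checks whether each package's top-level `__init__.py` is present at the root of a zip; `missing_package_inits` is a hypothetical helper, and it says nothing about whether dist-info, share, or bin matter.

```python
import zipfile


def missing_package_inits(archive_path, packages):
    """Return the packages whose top-level __init__.py is absent from the archive."""
    with zipfile.ZipFile(archive_path) as zf:
        names = set(zf.namelist())
    return [pkg for pkg in packages if f"{pkg}/__init__.py" not in names]
```

Running it on both the partial and the full archive with `["sympy", "mpmath"]` should show whether the partial zip actually contains the package roots in the form Python expects.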

Thanks for your help!

Quick update: It’s a PySpark issue that I’ll further investigate. The package itself works well.