Setup.py: installing a requirement without its dependencies

I’m the author of a package called “my_pkg”, which depends on another package called “dep_pkg”. However, dep_pkg has its own dependency called “huge_pkg”, which is big and complicated to install. In fact, dep_pkg can run fine without huge_pkg.

How can I make it so that running “pip install my_pkg” installs dep_pkg but not huge_pkg?

The authors of dep_pkg are not willing to put huge_pkg in extras_require (because they want it installed by default), but they would agree to add an opt-out via an environment variable in setup.py:

import os

# install_requires is dep_pkg's existing dependency list
if not os.getenv('SKIP_HUGE_PKG'):
    install_requires.append('huge_pkg')

My idea was that in my_pkg’s setup.py, I could set the env var before calling setup():

import os

from setuptools import setup

os.environ['SKIP_HUGE_PKG'] = '1'
setup(
    name='my_pkg',
    install_requires=['dep_pkg'],
)

According to my testing, this sometimes works and sometimes doesn’t. It looks like pip launches each package’s setup.py in its own subprocess (via pip’s call_subprocess), so an env var set in my_pkg’s process doesn’t carry over to dep_pkg’s.

Any ideas how I can make this work? I know I could tell my users to set SKIP_HUGE_PKG in their shell before doing pip install my_pkg, but that has its own drawbacks.

Or is there a way other than env vars? For example, could I pass some flag in install_requires, similar to pip’s --no-deps, to install my requirements without their dependencies?

Thank you!


There are many things that would probably work, but no simple, straightforward technique comes to mind.

Maybe you could consider publishing your own huge_pkg on an alternative index. This huge_pkg of yours would be empty (i.e. a setup.py with an empty list of packages, packages=[]) and would have a version number that satisfies dep_pkg's dependency requirement. Your users would need to install your huge_pkg first (probably with --index-url), then your my_pkg.
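
For illustration, assuming your stub huge_pkg lives on a hypothetical index at the URL below, the installation could look like this:

pip install --index-url https://my-index.example.org/simple/ huge_pkg
pip install my_pkg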

So this is one technique I can think of right now. There are probably others…


This is essentially my use case from the “Adding a default extra_require environment” thread.

I’m not sure if that thread was really winding around to some conclusion, but it seems to be a common enough desire that some kind of solution would be worth finding.


The current best practice for this use case is to create two packages, say my-pkg and my-pkg-core. The “core” package contains the actual code and most of the metadata, and can be used if huge-pkg is not wanted, while the other contains no code and declares dependencies on my-pkg-core and huge-pkg. For each new version, you’d make one release of each package, with my-pkg depending on the exact same version of my-pkg-core. This sounds complicated, but it would make this a tooling issue (which anyone can fix) instead of a specification one, which would require a lot of deliberation and enough enthusiasm (which no one has expressed on this topic yet) to push the discussion to a conclusion.
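
To make this concrete, here is a rough sketch of what the two setup.py files could look like (names, versions, and options are made up for illustration):

# my-pkg-core/setup.py: the actual code and most of the metadata
from setuptools import setup, find_packages

setup(
    name='my-pkg-core',
    version='1.2.3',
    packages=find_packages(),
    install_requires=[
        # every dependency except huge-pkg goes here
    ],
)

# my-pkg/setup.py: no code, just the two dependencies
from setuptools import setup

setup(
    name='my-pkg',
    version='1.2.3',
    packages=[],
    install_requires=[
        'my-pkg-core==1.2.3',  # pinned to the exact same version
        'huge-pkg',
    ],
)

Users who don’t want huge-pkg install my-pkg-core directly; everyone else installs my-pkg and gets everything by default.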


Thanks all for the replies!

@sinoroc, your suggestion gave me the idea of bundling the fake package’s installer directly in my own package. I created a file ‘shims/huge_pkg/setup.py’ that just contains:

from setuptools import setup

setup(
    name='huge_pkg',
    version='3.0.1',
    packages=[],
)

Then in my package’s setup.py, I pass:

# near the top of setup.py
from pathlib import Path

# then, inside the setup() call
install_requires=[
    'huge_pkg @ ' + Path(__file__).parent.joinpath('shims/huge_pkg').as_uri(),
    'dep_pkg',
]

This appears to work on both Windows and Linux. When dep_pkg installs, it says Requirement already satisfied: huge_pkg.


Oh… Not exactly what I had in mind, but if it works for you, give it a try. Honestly this is not a technique I would recommend, at least not without testing it thoroughly first. In particular, I would be a bit worried when installing from wheels. Direct references in dependency requirements (the bit after the @) are not supported everywhere, as far as I can remember. PyPI would reject such distributions, if I am not mistaken. I guess if you distribute only via a git repository or something like that, it might work. Build isolation might also get in the way, though (relative paths would not point to the right thing anymore, for example).

Another question: does your my_pkg really require dep_pkg, or could it be made into an optional dependency?


Thank you @sinoroc for mentioning these issues! I have learned a bit more about wheels and now I do see your point. Maybe distributing my_pkg as a tar.gz would work though. Anyway I guess I need to think about it some more.

As far as I know, installers (build front-ends) tend to build a wheel of every sdist (*.tar.gz) they install, so that they can reuse that wheel for subsequent installations. So if you rely on the execution of setup.py at install time, there is a good chance it will be executed only on the first installation. Additionally, such builds usually happen in isolation nowadays, so the execution of setup.py would happen outside the target (virtual) environment. Although in some installers I think it is possible to deactivate build isolation. I am not sure of all the details, so you can try it and it might be good enough for your needs, but I am not too optimistic about this kind of thing as a long-term solution.
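
(In pip, for example, that would be the --no-build-isolation flag, e.g. pip install --no-build-isolation my_pkg; note that pip then expects the build dependencies to already be installed in the environment.)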

There really isn’t any good, comfortable solution to your issue that I can think of. All of them require some trick, one way or another.

That is really too bad. Without knowing much about the actual libraries it is hard to form an opinion, but it looks like moving this huge_pkg to an extra would be the right thing to do. Maybe they can be convinced to change their mind; @effigies pointed to a whole discussion on the topic, and @uranusjr described a potential solution.
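
For reference, the change in dep_pkg’s setup.py would be small. A sketch, with a made-up version and extra name:

from setuptools import setup

setup(
    name='dep_pkg',
    version='1.0',
    packages=['dep_pkg'],
    install_requires=[
        # everything except huge_pkg
    ],
    extras_require={
        'huge': ['huge_pkg'],
    },
)

Users who do want the big dependency would then run pip install 'dep_pkg[huge]', while a plain pip install dep_pkg would skip it.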

An OK solution to these problems is normally to introduce a third package.

Where you have “foo”, which depends on “foo-core” and “foo-huge” but is otherwise empty. Then people who don’t want “foo-huge” can just install “foo-core” directly, and everyone else gets everything by default.