How to specify a lib dependency

I’m using a pyproject.toml file to specify the dependencies. I have multiple packages within my project, and each package has its own pyproject.toml. So, something like a workspace in the Rust world.

📂 .
├── 📂 lib
│   ├── 📂 luci
│   │   ├── 📂 core
│   │   ├── pyproject.toml
│   │   ├── 📂 tests
│   │   └── 📂 tnc_connector_core.egg-info
│   └── 📂 luci-postgres
│       ├── 📂 postgres
│       ├── pyproject.toml
│       ├── 📂 tests
│       └── 📂 tnc_connector_postgres.egg-info

The postgres package depends on the core package. Is there a way to specify the dependency in the pyproject.toml file?

There are tutorials on packages and dependencies. Those talk about a Pipfile.

Now you say: “Each package has its own pyproject.toml.”

What I usually see instead is just one such file for a big project. I’m not sure whether that would then interact with the several per-package pyproject.toml files underneath it the way you want.

I could imagine creating an entirely new pyproject.toml covering all the dependencies you need. But that is a pain and can easily cause mistakes. Maybe a bit of scripting to collect the dependencies from each individual pyproject.toml and write them into a new file could be an idea; you could then copy the result into the pyproject.toml for the overall project.
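If you did want to go that route, a minimal sketch of such a script could look like this (an assumption-laden example: it presumes Python 3.11+ for tomllib, the lib/*/pyproject.toml layout shown above, and a made-up file name):

# collect_deps.py (hypothetical): gather [project].dependencies from each
# package's pyproject.toml so they can be pasted into a combined file.
import tomllib
from pathlib import Path

deps = set()
for toml_path in Path("lib").glob("*/pyproject.toml"):
    with toml_path.open("rb") as f:
        data = tomllib.load(f)
    deps.update(data.get("project", {}).get("dependencies", []))

for dep in sorted(deps):
    print(f'    "{dep}",')

Running it prints a deduplicated, sorted list that you could paste into the dependencies array of the combined pyproject.toml.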

PEP 621 specifies this stuff. You want the dependencies section of the [project] table:

Note that in the TOML file, the project is your package.

As an example, the pyproject.toml for my cs.x package starts like this:

 [project]
 name = "cs.x"
 description = "X(), for low level debugging."
 authors = [
     { name = "Cameron Simpson", email = "cs@cskk.id.au" },
 ]
 keywords = [
     "python2",
     "python3",
 ]
 dependencies = [
     "cs.ansi_colour>=20220227",
     "cs.gimmicks>=20230331",
 ]
 classifiers = [
     "Programming Language :: Python",
     "Programming Language :: Python :: 2",
     "Programming Language :: Python :: 3",
     "Development Status :: 4 - Beta",
     "Intended Audience :: Developers",
     "Operating System :: OS Independent",
     "Topic :: Software Development :: Libraries :: Python Modules",
     "License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)",
 ]
 version = "20231129"

By contrast, I keep my personal modules/packages in a single monorepo, much like the OP does above. It’s reasonable to have a pyproject.toml for each package if that’s how you’re working.

O.k., Postgres depends on Core.

Writing your pyproject.toml

When on that page, scroll down to:

Dependencies and requirements

dependencies/optional-dependencies

If your project has dependencies, list them like this:

[project]
dependencies = [
    "httpx",
    "gidgethub[httpx]>4.0.0",
    "django>2.1; os_name != 'nt'",
    "django>2.0; os_name == 'nt'",
]

See Dependency specifiers for the full syntax you can use to constrain versions.

You may want to make some of your dependencies optional, if they are only needed for a specific feature of your package. In that case, put them in optional-dependencies.

[project.optional-dependencies]
gui = ["PyQt5"]
cli = [
    "rich",
    "click",
]

Each of the keys defines a “packaging extra”. In the example above, one could use, e.g., pip install your-project-name[gui] to install your project with GUI support, adding the PyQt5 dependency.

requires-python

This lets you declare the minimum version of Python that you support.

[project]
requires-python = ">= 3.8"

Thanks for showing the structure of your project; it’s very helpful for giving you advice specific to your situation. One key point to get out of the way first, in case you (or others) aren’t already aware, is the overloading of the term “package” to mean two entirely different things, “import package” and “distribution package” [1]:

  • An import package is any directory containing an __init__.py file (and typically, Python code), and the name of that import package directory is what you import in Python.

  • A distribution package is how your project is distributed to others (on PyPI, etc), and may contain one or more import packages. Each pyproject.toml corresponds to one distribution package, and the project.name specified therein is the name under which you find the project on PyPI, install it with pip, specify it as a dependency, etc.

In your case, you have two top-level import packages, core and postgres. Each top-level import package is contained within its own distribution package, with respective names tnc-connector-core and tnc-connector-postgres (I can see the names above thanks to the .egg-info directories left by legacy editable installs). core and postgres would be the names you’d import in Python, while tnc-connector-core and tnc-connector-postgres would be how you’d find and install them from PyPI, and how you’d specify them as dependencies.

As others have mentioned, assuming you’re using a current version of any modern backend that supports pyproject (“PEP 621”) metadata (Setuptools, Flit, Hatch, PDM, Meson-Python, Scikit-Build-Core, etc., basically everything but Poetry), you can specify your project’s dependencies in pyproject.toml following the standard specification, like this:

[project]
name = "tnc_connector_postgres"
version = "0.2"
dependencies = ["tnc_connector_core>=0.2"]

The dependencies specified in pyproject.toml are, in general, abstract dependencies; they state what distribution package names and versions your project requires in order to function. However, what you might actually be asking for is a way to tell packaging tools to build and install your dependency locally, rather than from PyPI by default. This corresponds to concrete dependencies, i.e. specifying the exact dependency versions and artifacts you want installed, and where to get them from. For that, you’ll probably want to use a requirements file or lock file, which also lets you specify an editable install. Support for relative local paths appears to be patchy at the moment, but you should be able to get it to work.
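For illustration, a concrete requirements file for local development might look something like this (a minimal sketch; the file name is made up, and the relative paths assume the layout from the first post with pip run from the repository root):

# requirements-dev.txt (hypothetical): editable installs from local paths,
# so the core package is picked up from the checkout rather than from PyPI
-e ./lib/luci
-e ./lib/luci-postgres

You would then set up the environment with pip install -r requirements-dev.txt.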


Sidenote: giving the import packages such generic names isn’t usually the best idea if they aren’t going to be under another top-level import/namespace package (like luci), since they will clash with any other import package that happens to have the same generic name (e.g. a postgres binding library of that name, or a core from another project you’ve created).

The simplest solution is to incorporate luci, tnc, etc. into the import package (directory) name, e.g. luci_core or tnc_postgres. Alternatively, you could make a top-level luci namespace package, so you’d have one top-level import package (luci), with your subpackages accessed as luci.core, luci.postgres, etc.


  1. Note that I ignore namespace packages here, as you don’t appear to be using them, they aren’t that common, and they would further complicate several of the points below. ↩︎

2 Likes

To note, the OP is asking about specifying dependencies in their pyproject.toml files; Pipfile is a file format specific to the legacy, no longer widely used Pipenv packaging tool, and is not likely to be useful unless the OP already happens to be using that particular tool. However, the general approach of specifying local installation in a concrete dependency file is indeed what I’d suggest if the OP is actually looking for a way to programmatically specify editable installation of local packages.

I’m not really sure how this would help, or that it would be necessary, sorry. If the OP wants to specify dependencies for a specific distribution package, they merely use the project.dependencies key in that project’s pyproject.toml. If they want an environment with all their top-level packages installed, a simple requirements.txt listing the top-level distribution package names will suffice. And if they do want an actual metapackage with a pyproject.toml, they only need to list their top-level packages, in this case just tnc_connector_core and tnc_connector_postgres (and technically, the former could be omitted because it’s a dependency of tnc_connector_postgres), which will produce the same result and is relatively easy to maintain manually.
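As a sketch, such a metapackage’s pyproject.toml would need little more than this (the metapackage name and version are made up, and a setuptools backend is assumed):

[build-system]
requires = ["setuptools>=61"]
build-backend = "setuptools.build_meta"

[project]
name = "luci-workspace"
version = "0.1"
dependencies = [
    "tnc_connector_core",
    "tnc_connector_postgres",
]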

AFAIU, the OP is looking for an equivalent of Cargo’s workspace feature, which allows you to develop several packages at the same time in a monorepo. The virtual environment used for development needs to have all the projects installed as editable, but the packages uploaded to PyPI also need to be installable. Cargo achieves this by rewriting metadata upon publishing; the equivalent here would be rewriting foo @ file://foo/ into foo in the pyproject.toml file. AFAIK, none of the current workflow tools (Hatch, PDM, Poetry, Flit) supports this. So the only solution is to specify dependencies normally, with only the name of each package, and to create the development virtual environment manually, using python -m venv venv; venv/bin/pip install --editable subpackage1/ --editable subpackage2/ instead of relying on hatch shell, poetry update, or the like.

Yes; the same way that you would specify third-party dependencies, by using the dependencies key of the [project] table.

Once you have made the choice to use a separate pyproject.toml per package (and thus, to distribute them separately), Python loses any reason to care about the containing lib folder (or any directories higher up in the tree).

Alternatively, you can make a PyPI package based off a single pyproject.toml in the lib root, and configure it so that pip install for that single name on PyPI installs all the code. IIRC, this can be done either such that the installed code has a corresponding containing folder (client code will see a single package with that name and need to account for it in absolute imports; your code, if it uses relative imports, should work with the installed folder structure rather than the development folder structure), or as separate deliverables inside site-packages (client code will have two separate top-level absolute imports available; your code will still need to use absolute imports to access internal dependencies).
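A rough sketch of the second variant (separate top-level deliverables), assuming a setuptools backend and a pyproject.toml placed in the lib folder; the distribution name and version are made up, subpackages would still need to be listed or discovered, and the discovery options should be checked against the setuptools docs:

[build-system]
requires = ["setuptools>=61"]
build-backend = "setuptools.build_meta"

[project]
name = "luci"                        # hypothetical combined distribution name
version = "0.1"

[tool.setuptools]
packages = ["core", "postgres"]      # only the top-level import packages shown

[tool.setuptools.package-dir]
core = "luci/core"
postgres = "luci-postgres/postgres"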

IMO, we should have just called “import packages” modules from the start (because they are), and referred to any modules contained within another module as “submodules”. That is, treat “this thing, despite being a coherent unit in itself, is part of a larger whole” as a special property of certain modules, rather than treating “this thing has components of the same kind” that way. Then “package” would only mean distribution packages, which would avoid tons of confusion (since, among many other things, “packaging” never seems to mean creating import packages).

1 Like

I agree that in hindsight something like that would probably have been a lot better (or perhaps still calling (file) modules modules, and calling “import packages” module directories, directory modules, or basically anything other than packages, e.g. namespaces). At this point, the meaning of module as a single importable Python file is embedded deeply enough into the language that I’m not sure it’s feasible to change, but I’m not sure it’s out of the question to change what we now call [import] packages to something else, though that would be a whole separate discussion, of course.

It is important to keep in mind, though, that if we’re really talking “from the start”, back when import packages were first introduced (I can find them at least as far back as Python 1.5, dating from early 1998, over 25 years ago), “packaging” as we know it now didn’t really exist beyond copying files and directories (i.e. [import] packages) around. distutils, and thus the very bare rudiments of distribution packages, wasn’t added until almost 4 years later, with Python 2.2 at the very end of 2001, and PyPI wasn’t created until over 5 years later, in 2003 (and of course, it took many years after that for [distribution] packaging to really start to take off).

2 Likes

Great you’re coming around on this :wink:

Thanks a lot for your detailed answer

Hello,

I am reading Learning Python, 5th Ed. (2013). There, it states the following (Chapter 24: Module Packages):

As one consequence of this new import procedure, as of Python 3.3 packages no longer require __init__.py files—when a single-directory package does not have this file, it will be treated as a single-directory namespace package, and no warning will be issued. This is a major relaxation of prior rules, but a commonly requested change; many packages require no initialization code, and it seemed extraneous to have to create an empty initialization file in such cases. This is finally no longer required as of 3.3.

Has this requirement been brought back?

This book is giving bad advice, in my opinion. Having no __init__.py file is indeed allowed, but it will turn the package into a namespace package, which can have surprising consequences (a namespace package is a package that can be split across the file system, i.e., you can have different subpackages installed in completely different directories). Always use an __init__.py file (even if empty) unless you specifically intend to create a namespace package.

3 Likes

I address this both in my footnote prior to the definitions and in the sidenote at the end.

As I mention there, while I considered describing the caveat in the main text, I didn’t want to further overcomplicate the distinction, and so moved it to a footnote (which I tweaked to fix a syntax issue and be more visible).

As @jeanas mentions, the book is simply wrong, and quite badly so. Nowhere in PEP 420, which introduced this behavior to enable namespace packages, is there any mention of this very minor convenience factor having anything to do with the change. Furthermore, an __init__.py is not “extraneous”, as the book speciously claims; it is an unambiguous marker that an otherwise ordinary directory is a Python package, avoiding a number of packaging, tooling and user issues and complexities that arise when this has to be implicitly guessed.

3 Likes

Thank you for responding and clearing this up. I will update my notes. I will make sure to include an __init__.py file in my packages.

Much appreciated.

That’s correct. I don’t want to have to publish to PyPI in order to use the package as a dependency. I want a way to override the default so that the dependency is found in a local package instead. I also understand that this would require building the packages in an order that follows the dependencies (so, build core first).

Thank you for the concrete example. Prior to posting I did read through the PEP docs. I couldn’t find an example of how to reference a “package” dependency found locally, and similarly in the spec that describes dependency specifiers. Despite this, I was hoping there was a way that I had overlooked.

There have been a lot of great comments. So far I’m concluding that there is no way to use pyproject.toml to specify a locally hosted package. There might be a way using the requirements.txt file. This would involve building a whl file for core (I renamed it to luci-core per suggestions made elsewhere), and referencing that in the requirements.txt file for the postgres package (I renamed that to luci-postgres).

So it looks like we can’t get rid of using requirements.txt in all cases when using pyproject.toml.
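For illustration, the requirements.txt for luci-postgres might then contain something like this (the relative path, wheel filename, and version are assumptions about how and where the core wheel gets built):

# requirements.txt for luci-postgres (hypothetical path and version)
# Install the locally built core wheel instead of resolving it from PyPI
../luci-core/dist/luci_core-0.1.0-py3-none-any.whl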