Best practice suggestions (including bootstrap)

A co-developer pointed out to me that I was missing some important information:
pkgsrc (www.pkgsrc.org) packages all kinds of software in all kinds of languages. So python modules are just a subset of what we want to package.
pkgsrc bootstraps from a C compiler and also installs python itself (and ruby, rust, perl, …) and provides binary packages for users to install.
We currently have some 3000 python modules packaged this way, mostly using setup.py.

Your “pkgsrc binary packages” are presumably some sort of archive of files that the pkgsrc installer will copy into the ultimate desired location on the user’s system?

Wheels are similar, in the sense that they are an archive of the files that need to be on the user’s system in order for the user to make use of the package. So I’d argue that you should build wheels (using a modern Python tool like build) and then copy the content of those wheels into your pkgsrc binary packages. The wheel spec explains how to extract the files onto the target system - you can adapt that to decide how to extract into your archive format and subsequently how to extract the archive onto the user’s system.

That’s basically the “modern python best practice” for this type of workflow. If you want help with the practicalities of implementing it, there are people here who work on Linux distros, who would have more useful insights than I can offer (I’m a Windows user) - but the above is the basic idea.

Thanks for the reply, @pf_moore.
I took a look at wheels, and I think they are not the right solution for what I want:

  • Only Linux/macOS/Windows wheels can be uploaded to PyPI. This means that even if I use the pure Python wheels from PyPI, I still need a way to build the operating-system-specific ones myself, since none exist there for NetBSD, FreeBSD, OpenBSD, Solaris, …
  • Using wheels means using packages someone else created. I’d prefer to use the sources.
  • Using wheels means I can’t run the tests, which I like to do before updating a module in pkgsrc to verify it still works.

In summary, wheels look like a binary package format - but pkgsrc already has one. I prefer (and for some packages, have to due to not using Linux/macOS/Windows) to build from sources.

[…]

In summary, wheels look like a binary package format - but pkgsrc
already has one. I prefer (and for some packages, have to due to
not using Linux/macOS/Windows) to build from sources.

I think there’s some confusion still. You don’t need to use existing
wheels. The idea is that instead of calling setup.py install to
compile and place files into the tree for your pkgsrc package, you
use the build utility to build a wheel from the sdist tarball and
then copy the files from the wheel into the equivalent path in your
package. The only functional difference between setup.py install
and build/wheel is that the former places the resulting files into a
filesystem tree while the latter places them into a zipball (which
can then be extracted to achieve the same results).

Thanks, @fungi. Yes, that would be possible.
Which still gets me back to my original question, what the intended workflow is, starting from a basic python installation. I mentioned the one that pkgsrc is currently using in my first post.

Yep, I understand. You’re trying to repackage Python projects for pkgsrc.

True, but CPython ships with ensurepip which installs pip as part of installation. Basically it’s assumed (if you’re not on Debian :wink:) that if you have CPython installed that you will also have pip installed.

I think I understand where the confusion is. The reason we all keep saying “use wheels” isn’t for you to directly install wheels, but to build your own wheels and tear them apart to repackage them for pkgsrc. The thing is you can’t get what you’re after from straight sources (what we call a “source distribution” or “sdist”). Apart from the project’s source files, an sdist only contains a build configuration file (pyproject.toml for up-to-date projects, setup.py for older ones), and that’s it. You actually can’t deduce what files should go where on the file system by introspecting an sdist. You must build a wheel to get that information.

With my understanding that pkgsrc packages are binary artifacts and not a collection of source that you build upon install, what people are suggesting you do is:

  1. Download the sdist/source.
  2. Build a wheel.
  3. Tear the wheel apart based on its metadata and repackage its files using the pkgsrc format.
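
To illustrate step 3, here is a minimal sketch of reading a wheel's own metadata with nothing but the standard library. It assumes the layout described in the wheel spec (a single top-level `*.dist-info/` directory containing `WHEEL` and `RECORD`); the function name `describe_wheel` is made up for this example and is not part of any tool mentioned in this thread.

```python
import csv
import io
import zipfile
from email.parser import Parser


def describe_wheel(wheel_path):
    """Return (WHEEL metadata, list of file paths listed in RECORD)."""
    with zipfile.ZipFile(wheel_path) as zf:
        # Find the top-level "<name>-<version>.dist-info" directory.
        dist_info = next(
            name.split("/")[0]
            for name in zf.namelist()
            if name.split("/")[0].endswith(".dist-info")
        )
        # WHEEL is a set of "Key: value" headers (e.g. Root-Is-Purelib, Tag).
        wheel_text = zf.read(f"{dist_info}/WHEEL").decode("utf-8")
        wheel_meta = Parser().parsestr(wheel_text)
        # RECORD is CSV with one row per file: path, hash, size.
        record_text = zf.read(f"{dist_info}/RECORD").decode("utf-8")
        recorded = [row[0] for row in csv.reader(io.StringIO(record_text)) if row]
    return wheel_meta, recorded
```

The file list from `RECORD` is a natural starting point for generating a package manifest, and `Root-Is-Purelib` tells you whether the top-level files belong in the pure-Python or the platform-specific site-packages directory.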

The bootstrap project that @FFY00 pointed you at seems to do the bootstrapping you’re after: it gets you the parts you need to build the wheel, which you can then dissect and reassemble into a pkgsrc package.

BTW, I am not sure if anyone has proposed to allow BSD-based wheels to be uploaded to PyPI. It would require specifying how to version them appropriately (e.g. as the manylinux spec does), teaching Python how to detect the OS and version appropriately, etc., but I don’t think there’s a fundamental reason beyond lack of time and motivation why it couldn’t happen.
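
For context on what "detecting the OS" looks like today, here is a small sketch of how a generic platform tag can be derived from `sysconfig.get_platform()`. As far as I know this mirrors the fallback used for platforms without a dedicated tag spec, but treat that as an assumption; the helper names and the NetBSD string used in the usage note are illustrative only.

```python
import sysconfig


def normalize_platform(platform):
    """Turn a platform string into a form usable inside a wheel filename."""
    # Wheel filenames use "-" as a field separator, so tags may not contain it.
    return platform.replace("-", "_").replace(".", "_")


def generic_platform_tag():
    """Platform tag for the running interpreter, without any versioning policy."""
    return normalize_platform(sysconfig.get_platform())
```

For example, `normalize_platform("netbsd-9.3-amd64")` gives `netbsd_9_3_amd64`. What a spec for PyPI would have to add is a policy for which OS versions such a tag is compatible with.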

Using a wheel as an intermediate step may be fine, but what is the actual procedure for building a wheel and installing it into a destdir?

Before, starting with an sdist for foo-1.23, the installation procedure that worked reliably for almost all Python packages was more or less:

cd foo-1.23
python setup.py build
python setup.py install --root=${DESTDIR}

What’s the sequence of commands that one is supposed to execute in the new world order where python setup.py install doesn’t work, starting from an sdist, to get the same effect?

And, if the new procedure depends on having foobuilder or bartools or bazutils outside the cpython-3.x distribution, how is one supposed to get those built from sdists first?

python -m build --outdir ${DESTDIR} .

https://pypa-build.readthedocs.io/

@FFY00 pointed to their python-bootstrap project earlier in this thread (post #5). Otherwise wheels are zip files, so you can unpack them and copy files to the appropriate places.

build is not part of python, so the first command will not work.

I tried using python-bootstrap again with python 3.9.9 and nothing else installed and get:

# python3.9 -m bootstrap.build
Traceback (most recent call last):
  File "/usr/pkg/lib/python3.9/runpy.py", line 188, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/usr/pkg/lib/python3.9/runpy.py", line 111, in _get_module_details
    __import__(pkg_name)
  File "/tmp/python-bootstrap/bootstrap/__init__.py", line 88, in <module>
    import build.env  # noqa: E402
  File "/tmp/python-bootstrap/.bootstrap/modules/build/env.py", line 18, in <module>
    import packaging.requirements
ModuleNotFoundError: No module named 'packaging'

So we’re still missing step 0: getting a build tool that builds the first wheels for the tools I need to build everything else.

Next time, please open an issue on the repo if you can. There aren’t any tests yet, so I missed that; it should be fixed now.

Thanks! Just to be clear, does this have the same effect as building a wheel, and then installing that wheel into ${DESTDIR}?

Would it also have the same effect to build into dist/ (as python -m build . with no --outdir ${DESTDIR} does by default), and then copy that into ${DESTDIR}?

# build stage
cd foo-1.23
python -m build .

# install stage
cd foo-1.23/dist
pax -rw -pe . ${DESTDIR}

(adjusted, perhaps, for appropriate subdirectories under foo-1.23/dist and/or ${DESTDIR}, like maybe pax -rw -pe . ${DESTDIR}${PREFIX}/lib/python3.X/site-packages)

Or does running python -m build --outdir ${DESTDIR} . or installing a wheel also do other things for installation?

Does it make a difference whether foo-1.23 is, e.g., the content of a git tag from a typical project, versus an sdist created with python -m build --sdist or an sdist downloaded from pypi?

(I read the documentation at https://pypa-build.readthedocs.io/ but it didn’t clearly answer these questions. Apologies if I haven’t found documentation that should be obvious; I don’t do much Python packaging myself, and it has been several years since I worked with this stuff more directly.)

I’m not a packaging expert like some here, but the source code and the docs confirm that --outdir does exactly what it says: it changes the output directory for the built distribution packages from the default ./dist to one of your choosing, presumably wherever you want to keep the wheel before repackaging it. It doesn’t do anything additional, such as actually installing the wheel. For that, you need installer as linked above, or you can unpack the wheel yourself and move the .data files into the appropriate locations under the prefix, etc.
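
As a rough illustration of the "unpack the wheel and move the .data files" route, here is a standard-library sketch that extracts a wheel into a staging directory. It is not what installer or pip actually do: it assumes POSIX-style paths, does no validation of archive member names, does not rewrite `#!python` shebangs in scripts, does not set executable bits, and does not generate entry point scripts. The function name `stage_wheel` and the `scheme` argument are made up for this example.

```python
import os
import zipfile
from email.parser import Parser


def stage_wheel(wheel_path, destdir, scheme):
    """Extract a wheel under destdir and return the list of staged files.

    scheme maps the wheel-spec keys ("purelib", "platlib", "scripts",
    "data", "headers") to absolute install directories.
    """
    staged = []
    with zipfile.ZipFile(wheel_path) as zf:
        names = [n for n in zf.namelist() if not n.endswith("/")]
        dist_info = next(
            n.split("/")[0] for n in names if n.split("/")[0].endswith(".dist-info")
        )
        meta = Parser().parsestr(zf.read(f"{dist_info}/WHEEL").decode("utf-8"))
        is_pure = meta.get("Root-Is-Purelib", "true").strip().lower() == "true"
        root_key = "purelib" if is_pure else "platlib"
        for name in names:
            parts = name.split("/")
            if parts[0].endswith(".data") and len(parts) > 2:
                # <name>-<version>.data/<key>/... goes to the directory for <key>.
                key, rel = parts[1], parts[2:]
            else:
                # Everything else (including .dist-info) goes to site-packages.
                key, rel = root_key, parts
            target = os.path.join(destdir, scheme[key].lstrip("/"), *rel)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with zf.open(name) as src, open(target, "wb") as dst:
                dst.write(src.read())
            staged.append(target)
    return staged
```

A caller would pass something like `{"purelib": "/usr/pkg/lib/python3.11/site-packages", "scripts": "/usr/pkg/bin", ...}` as `scheme`; for the running interpreter, `sysconfig.get_paths()` provides these keys (with `include` in place of `headers`).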

I’ve never personally tried running build in an unpacked sdist, rather than just on the source tree, but so long as the MANIFEST.in, package_data and data_files are in sync, it should produce identical outputs? But I’d defer to the experts on that one.

build does just that by default: it packages your project into an sdist, extracts it, and builds a wheel from the extracted sdist. This helps make sure your sdists are sound. As for whether you’ll get the same sdist or wheel out of the source code and the sdist, that depends entirely on your backend.

OK, so I asked what the equivalent of this is:

cd foo-1.23
python setup.py build
python setup.py install --root=${DESTDIR}

It sounds like python -m build --outdir ${DESTDIR} . is not, in fact, the equivalent of that at all, because it just creates an sdist and wheel archive in the outdir, and unlike the python setup.py install step it doesn’t install the content anywhere.

So what is the equivalent? Maybe something like this?

cd foo-1.23
python -m build
python -m installer --destdir=${DESTDIR} dist/foo-1.23-py3-none-any.whl

Is the command-line interface for python -m installer documented anywhere? I skimmed through https://installer.readthedocs.io/ but I don’t see anything about the command-line interface, and every page in the API reference has a warning on it that the API is not finalized, so I’m not sure what I can rely on here.

I’m sure I can cobble something together that works in a few cases I can test, by enough trial and error and reading the source, but I want to be sure I understand what the Python community actively intends to work reliably here, like python setup.py install did for many years (and, in most Python packages, still does).

installer doesn’t have a CLI (yet). There’s no equivalent, but at least some people’s intention is to be able to invoke installer from the command line to perform the installation, yes.

The intended usage is pip install, but you have said that doesn’t meet your needs (which is fine). So you’ll have to be prepared to do a certain amount of the work yourself right now. But ultimately, either installer will gain a CLI, or you will be able to invoke its API from a small wrapper script you write yourself[1].

One possible confusion here is that the old setup.py install approach was rooted in the idea that everything used one tool, setuptools, to do builds. Modern packaging is working very hard to provide choice for package authors and users, by standardising interfaces rather than tools. So there’s not necessarily one “correct” way to do things, but rather there are tools for different use cases. Currently, a lot of resource is focused on “end user” tools like pip, and build backends like setuptools and flit, because those are what people are used to using, but lower level tools for system integration type tasks, such as build and installer, are being developed as well.


  1. You can do that now. I don’t think you should be too concerned that the installer API is not finalised. It’s probably not going to change a lot, and changing your script wouldn’t be that hard anyway.

Other than it still being too early in development to have spent a modest amount of time working on at least a basic one, is there any rational reason why installer would not gain a CLI in the near future, or at the very least by the time it is production ready? I seem to recall there being an open issue/PR to do just that, in fact (by Filipe, IIRC)?

I’m not an installer developer, so I have no idea, TBH. But I thought it was intended as mostly a library, so while a CLI would be a convenience for users with simple needs, it would likely leave more complex uses[1] to be handled by a dedicated script calling the library’s API.

I’d expect at some point that py -m installer some.whl will work. But I don’t know if a --prefix or --destdir command line option would be considered reasonable to include (install schemes are complicated beasts…)


  1. Which, judging by the length of this thread, I suspect the OP’s needs would be described as.

So this is indeed confusing.

In the past (and present, but maybe(?) not near future), the interface that reliably worked for almost all Python packages was python setup.py build && python setup.py install --root=${DESTDIR} (whether or not setup.py uses import setuptools under the hood as a tool).

But what I’m hearing here is that we’re supposed to use the specific tools build and installer—well, except there’s apparently two different variants of installer, one from FFY00 (part of python-bootstrap) which has a CLI and one from pradyunsg which doesn’t, as I learned today. And I’m still not sure how we’re supposed to use them, i.e., what the intended interface is.

I think our goals are fairly simple.

Background:

  • If the Python program foo-1.23 is to be used by import foo, Python will look in some path baked into the python executable like /usr/pkg/lib/python3.11/site-packages/foo to resolve the import, and in turn the code in foo might look for data files under there.
  • If it has a command-line entry point, there’s a script at /usr/pkg/bin/foo with a #! line for sys.executable.
  • (In this example, Python was built with sys.prefix = '/usr/pkg'—point is, it’s baked into the underlying python executable, not a parameter that’s relevant here.)
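
The entry point script mentioned above is normally generated at install time from the `[console_scripts]` section of the wheel's `entry_points.txt`. Here is a simplified sketch of that step; the template is a stripped-down assumption rather than the exact text any installer writes, and `foo.cli:main` in the usage note is a hypothetical entry point.

```python
import configparser

SCRIPT_TEMPLATE = """\
#!{interpreter}
import sys
from {module} import {import_name}
if __name__ == "__main__":
    sys.exit({func}())
"""


def console_scripts(entry_points_text, interpreter):
    """Map each console script name to the text of a launcher script."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # script names are case-sensitive
    parser.read_string(entry_points_text)
    scripts = {}
    if parser.has_section("console_scripts"):
        for name, target in parser.items("console_scripts"):
            module, _, func = target.partition(":")
            # Drop a trailing "[extras]" marker if one is present.
            func = func.split("[")[0].strip()
            scripts[name] = SCRIPT_TEMPLATE.format(
                interpreter=interpreter,
                module=module.strip(),
                import_name=func.split(".")[0],
                func=func,
            )
    return scripts
```

Calling `console_scripts("[console_scripts]\nfoo = foo.cli:main\n", "/usr/pkg/bin/python3.11")` yields one script named `foo` whose first line is `#!/usr/pkg/bin/python3.11`, which would then be written to `/usr/pkg/bin/foo` under the staging root.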

So I’m looking for a way, given the source tree for foo-1.23, to:

  1. build any bytecode or entry point scripts or supporting data files from source code, and then
  2. put all those files in their places, relative to a staging root directory like /tmp/work/foo-1.23/destdir so the build process itself doesn’t require privileges or interfere with the building environment.
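
As a small sketch of the two goals above, assuming POSIX paths: the first helper maps a final installed path to its location under the staging root, and the second byte-compiles the staged files while recording the final path (not the staging path) in the .pyc files. The `stripdir` and `prependdir` arguments of `compileall.compile_dir` need Python 3.9+; the helper names are made up for this example.

```python
import compileall
import posixpath


def staged_path(destdir, installed_path):
    """Where a file destined for installed_path lives under the staging root."""
    return posixpath.join(destdir, installed_path.lstrip("/"))


def compile_staged(destdir, installed_dir):
    """Byte-compile staged sources; return True if everything compiled."""
    staged = staged_path(destdir, installed_dir)
    # Strip the staging prefix and re-root at "/" so the source path stored
    # in each .pyc refers to the installed location.
    ok = compileall.compile_dir(staged, quiet=1, stripdir=destdir, prependdir="/")
    return bool(ok)
```

With the paths from this post, `staged_path("/tmp/work/foo-1.23/destdir", "/usr/pkg/lib/python3.11/site-packages")` is `/tmp/work/foo-1.23/destdir/usr/pkg/lib/python3.11/site-packages`.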

Pretty much all general-purpose package managers—pkgsrc, FreeBSD ports, Debian, Fedora, Homebrew, &c.—have the same goal here, so I would hope there’s an easy way we can all do it!

(Tools that do more than this like pip will do essentially this as a subroutine. But getting pip not to do additional things like talk to the network, consult its own index, cache things, do its own dependency resolution, &c., takes a bit of work and I’m not sure all the measures we take in pkgsrc to avoid all that are reliable.)

Hiya!

installer has a working Python API which, depending on your use case, could be sufficient. The main reason for the stability warnings in the API is that there’s a chance that there might be some stupid bug in there that’ll need a backwards incompatible change. :slightly_smiling_face:

FWIW, the only blocker to switching pip over to using that package is (a) someone finding time for it and (b) deciding on how to manage the behaviour difference of what happens when a file-to-write already exists.

If you need a CLI on top of it, it should be possible to write one today. As mentioned, “Add CLI” by FFY00 (pull request #66 on pradyunsg/installer) is the open PR to add a CLI directly to the package, which has gotten a decent amount of feedback and has an ongoing discussion right now. There’s also a link to a package that provides a CLI based on installer too in there. :slight_smile: