PEP 517 workflow for distributions

None of the projects is using a src layout currently, so apparently it isn’t that big of a deal just yet. Or did I give the maintainers an idea? :slight_smile:

Fine. Not being able to use the package name as the directory name would mean explicitly listing the packages you need, along with their corresponding module names, in order to build a PYTHONPATH. Unfortunate, but really not that big of a deal. I suppose most distros do that already anyway.

The PyPA is a community and the distro maintainers can be a part of it, or at least that is what I imagine it to be. In that sense, are you not also a part of it?

This I unfortunately agree with. Following python-dev and the Packaging category here, and interacting in some of the discussions as well as on the issue trackers of some of the projects, it is also my experience that the use cases and issues of distros are just not well considered.

Distros have issues like bootstrapping, and they deal with large numbers of packages, so they want or need a structured approach to fetching package information and to building and testing these packages. We need to resolve the dependencies of a large set of packages when updating our package set, and to be able to override requirements when solving, because there are always packages out there that pin too strictly.

I’ll request that everyone engaging in this discussion avoid making broad-strokes arguments/statements about the state of Linux distros + Python packaging and who’s responsible for what — please go make dedicated threads for those discussions. Saying those things here is inevitably going to derail the constructive discussion, and I really don’t want that happening.

I’d like this thread not to become about people complaining/debating about how the status quo isn’t ideal (look around, nothing about the world is ideal at the moment). Let’s keep this discussion focused on the very specific suggestion in the OP for resolving one specific issue.

Wanna have broader-picture discussions / talk about responsibilities / anything that’s not 100% about the OP suggestion? Please make a new topic.

1 Like

My suggestion, then, for all the Linux distro folks is to organize and create a mailing list where you can coordinate on the needs and issues you have, so as to surface them to the appropriate group as a collective rather than as individual “Nix”, “Arch”, “Debian/Ubuntu”, or “Fedora/RH” asks. I’m sure the ML could be hosted on mail.python.org if you asked.

1 Like

Linux-sig already exists and is largely unused. Hijacking it for this would be totally fine, I’m sure.

2 Likes

https://www.google.com/search?q=“bdist_rpm”+“bdist_deb”+“conda-forge”&oq=“bdist_rpm”+“bdist_deb”+“conda-forge”


python setup.py bdist_rpm --help
python setup.py --command-packages=stdeb.command bdist_deb

fpm can also create a package out of a virtualenv path, but that’s not a universal wheel with MKL for this platform anyway.

1 Like

No worries. I can see that the post is written in such a way that you would probably need to read it consecutively to understand what is being proposed.

I consider this a large ecosystem-wide change.

the PyPA should only be concerned with large, ecosystem-wide changes

The goal is to make bootstrapping a packaging environment a reasonable task.

Then perhaps it should change its name, as the “Packaging Authority” is exactly whom I would contact to ask for support in a change like this.

I think using “Authority” in the name makes users feel like the PyPA is something they should be listening to and following guidelines from. As it stands, it appears to be just a group of projects. I do not believe that is right, as you can advertise that users should move to PEP 517 but, when the ecosystem workflow breaks, say “it’s not our problem”.

I get that, but you must understand that this is not the current status quo. Well, it’s something that is definitely changing, and I am very thankful for that, but it has not been fully achieved yet.

Pushing standards is a very big thing, but another big thing to me is interoperability of code. And this specifically is where I feel pip is privileged: why isn’t the pip install code a library that pip just uses? Or the build code? In pypa/build we had to essentially reimplement something pip already does, the build isolation. Why isn’t the pip resolver a separate library? It just resolves a set of PEP 508 dependency strings, no?
Why don’t these things exist as libraries, like pypa/pep517 or pypa/packaging?
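
To be concrete about the kind of reusable building block I mean, here is a minimal sketch using pypa/packaging, which already parses PEP 508 dependency strings (the requirement string is just an illustration):

    # Parse a PEP 508 dependency string into its structured parts.
    from packaging.requirements import Requirement

    req = Requirement('requests[security] >= 2.8.1; python_version < "3.9"')
    print(req.name)       # requests
    print(req.extras)     # {'security'}
    print(req.specifier)  # >=2.8.1
    print(req.marker)     # python_version < "3.9"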

We are in a state where the PyPA is pushing change, but mostly focuses on implementing those changes in its own tooling. While it’s great that there is a standard that allows other people to develop their own tools, by pushing change like this the PyPA is putting a gigantic extra burden on other users of the ecosystem, forcing them to create their own tooling. Other users might not be very Python-focused; some distributors might not even have access to someone well versed in Python, yet they are now forced to write their own tooling from scratch.
Do you understand this? I feel that most people here do not really understand what these ecosystem changes mean for other people, especially when they are performed in the way they are.
What I am asking in this thread is to please let people catch up before starting to make changes that rely on the new ecosystem changes.

This is also where I feel there is a disconnect between the PyPA and other Python package distributors. There is one point in particular which I think would bridge most of the differences if the PyPA started caring about it:

  • Bootstrapping a packaging environment from scratch (no vendoring)

I do understand that.

I think I can commit to that.

No, that solves my problem but leaves other people out, which is specifically what I am asking the PyPA not to do.
We should have a ML for Python distributors, not just for Linux distributions; these issues are relevant for all distributors, not just Linux ones. There was already a discussion about this on python-dev, in the tzdata dependency thread; perhaps we should just go ahead and create the ML.

2 Likes

To me also.

Quite simply, because that’s not how pip was written. We’re slowly changing that, by defining standards, writing libraries (like packaging) that tools can use to follow those standards, and using those libraries in pip. Of course, that means that bootstrapping pip is a lot harder, so we have to vendor a whole bunch of stuff. That works fine on Windows, but conflicts with Linux distro policies. We work with the distros on devendoring, but we don’t maintain a devendored version of pip ourselves. :man_shrugging:

And I expect pip to vendor that code as soon as we can, so we’re not replicating it but using the same code everyone else will. Of course, if pypa/build hasn’t implemented the mechanism in a reusable manner either (I think I recall hearing that the intention was to do so, but I may be wrong), we need to wait for someone else to do that so we can both depend on the library version…

It is - resolvelib. Again, pip just vendors it. There’s a lot of machinery around that library that’s still pip-specific, but that’s just because no-one has had the time to extract it into a library. (I’m dabbling in making pip’s finder into a library, but haven’t got very far, as it’s either trivial or incredibly complex, depending on where you draw the boundary… :slightly_frowning_face:)
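
To illustrate the split: resolvelib supplies the algorithm, and the caller supplies everything package-specific through a provider object. A minimal, self-contained sketch (the toy index and tuple-based requirements are invented for illustration, and the provider hook signatures have shifted slightly between resolvelib releases, hence the catch-all arguments on get_preference):

    from resolvelib import AbstractProvider, BaseReporter, Resolver

    # Toy index, invented for illustration: name -> {version: [requirements]}.
    # A "requirement" is (name, allowed_versions); a "candidate" is (name, version).
    INDEX = {
        "app": {1: [("lib", {1, 2})]},
        "lib": {1: [], 2: []},
    }

    class ToyProvider(AbstractProvider):
        def identify(self, requirement_or_candidate):
            return requirement_or_candidate[0]  # group by package name

        def get_preference(self, identifier, *args, **kwargs):
            return 0  # no ordering heuristics in this toy

        def find_matches(self, identifier, requirements, incompatibilities):
            allowed = set.intersection(*(r[1] for r in requirements[identifier]))
            banned = {c[1] for c in incompatibilities[identifier]}
            return [
                (identifier, version)
                for version in sorted(INDEX[identifier], reverse=True)
                if version in allowed and version not in banned
            ]

        def is_satisfied_by(self, requirement, candidate):
            return candidate[1] in requirement[1]

        def get_dependencies(self, candidate):
            return INDEX[candidate[0]][candidate[1]]

    result = Resolver(ToyProvider(), BaseReporter()).resolve([("app", {1})])
    print(result.mapping)  # {'app': ('app', 1), 'lib': ('lib', 2)}

Everything pip layers on top (finding candidates on an index, fetching and parsing metadata, honouring environment markers) lives in the provider, and that is exactly the pip-specific machinery I mean.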

Mostly lack of developer resource, as usual.

Yes - sort of. I understand that’s how people view the PyPA. But I’m painfully conscious that the reality is very different, however much we or they might wish otherwise. I wish we’d never (jokingly) called ourselves an “authority”, and I don’t know how we address this disconnect. But I’m fine with other PyPA members working that out, and I’ll go along with whatever works.

That’s entirely distinct from my role as interoperability standards PEP-delegate, where I have a very strong view that we need to standardise as much as possible, and encourage the growth of reusable libraries and competing tools as much as we can. So I want to see libraries like packaging, resolvelib, importlib.metadata, etc flourish. I want to see more build backends, all competing on an even playing field. I want to see competitors to pip. I want to see alternative approaches that don’t require bundling a bunch of stuff in pip - but not at the cost of making build backends other than setuptools into “second class citizens”.

I’m extremely aware that we’re a long way from that goal, and that all the projects and groups involved are extremely limited in resources, so progress is slow. But we need to encourage progress even so, and not stagnate just because it’s too hard to get people to look at the future.

Sorry - I’ll get off my soapbox now.

This is also where I feel there is a disconnect between the PyPA and other Python package distributors.

There are disconnects all over the place. It’s almost impossible to get feedback from end users. We’re continually struggling with the problem that we give the impression that all we care about is packaging specialists - because we can’t work out how to find out what packaging users want. I don’t dismiss distributions - it’s just that I see end users as a far more difficult, and yet far more important problem, and I prefer to focus on that. Others may have different priorities, and I’m fine with that.

perhaps we should just go ahead and create the ML.

That sounds like a good idea. But someone specific needs to do that, so someone needs to commit to finding whoever can create a mailing list and make it happen. I’m not criticising anyone, but it’s awfully easy to end up in a situation where everyone is vaguely agreeing that something should be done, but no-one is doing it. (To be clear, I personally have no idea how to get a mailing list created, so I’m not going to be doing anything to make this happen).

Distros face the problem of having to bootstrap from source; this means we have to build the wheels and install them. pip does not have this problem. I think it should be fairly straightforward to bootstrap a devendored version of pip if you start from the pre-built wheels. But we are getting off-topic…

It absolutely has implemented the mechanism in a reusable manner; that was one of the goals of the project. I did not want people to have to do the same as me and reimplement things from scratch.

There was some discussion on virtualenv with --system-site-packages breaks pip's build isolation · Issue #6264 · pypa/pip · GitHub, but that was not about the reusability of the code, rather about whether pypa/build had the same issue as pip or not. We have reworked the isolation environments since, and I am pretty sure we do not have that problem.

Sure, resolvelib does provide some resolution mechanics, but it’s pip that implements the Python package resolution. That’s what I meant :sweat_smile:

But that is not what I am proposing here. I do not want other build backends to be second-class citizens either.
My proposal here only targets the really small number of libraries required to bootstrap a Python packaging environment from source. It should have no effect on other backends.

Progress can still happen, nothing from my proposal is blocking it.

Well, it depends on your perspective. pip is mostly used within the Python ecosystem; distributions reach a far bigger audience. It’s just that you probably don’t hear much from those users, as they are not involved in your circle or community.

Should be fairly straight forward :stuck_out_tongue:

From Mailing Lists | Python.org

To request a new list, send e-mail to postmaster @ python.org; please check first to make sure a similar list does not already exist.

Though, I probably shouldn’t be the one doing that. @brettcannon what do you think?

Wonderful! I will go back to the ‘packaging’ issue and say I support rolling back to setuptools.

I actually created GitHub - brettcannon/mousebender: Create reproducible installations for a virtual environment from a lock file specifically to help find library gaps in the whole installation process. My friend @d3r3kk and I got the Simple API done as a library (which we will upstream into ‘packaging’ once it drops Python 2 support), but we have not gotten any farther, as PEP 621 took up what non-‘packaging’ packaging bandwidth I have.

The README has an outline of the process with the relevant PEPs and projects (if one exists).

Anyone can request a list be created; I would just strongly advise getting at least 2, if not 3, people lined up to be the admins for the list, to make it easier (as I said, I don’t do MLs anymore, so I unfortunately won’t be volunteering for this).

2 Likes

I can volunteer, but following your suggestion, it would be great to have 1 or 2 other people on board.

This proposal sounds fine to those of us participating in the discussion, if you ask us as individuals. Note, though, that the PyPA does not govern all packaging projects (and today has no right to restrict, in any way or form, the projects under it). Each of them is free to take on whatever dependencies it wants. So you’d need to make this proposal to each project in question and convince its maintainers to abide by it. Or are you proposing to change that, and do you want to draw up a PEP that names core projects that must follow a set of stricter rules?

1 Like

A PEP would be tricky to do, because we would have to lay down the projects we would enforce this on, which can change over time; that is just not a good solution.
I was hoping that we could come to a consensus that this was probably the right thing to do, and then politely ask the projects if they could consider doing it, but it seems that did not happen :confused:

From what I can tell, people in this discussion mostly agreed on this, but you must recognise that the four of us do not own all those projects. So if you want agreement from other people, you will need to reach out individually to each project’s maintainers (maybe by opening an issue on it).

For what it’s worth, I am working on support for PEP 517 build systems in Void Linux. I’m using pip at the moment because, as @FFY00 points out, it is the only viable installer. In the future, it would be nice to move to something like pypa/build and some frontend to pradyunsg/installer.

I believe the Void build style successfully forces pip to behave, but some behaviors I’ve had to override make no sense to an outsider. For example, if I build a wheel with pip wheel [...], why would it also try to build wheels for the dependencies? I suppose there might be some demand for a recursive wheel-building process in pip, but this seems like something to opt into with --build-deps, not opt out of with --no-deps. Another issue is the apparent inability to control the name of the output wheel, which means I either need a reliable and simple way to determine the PEP 425 compatibility tag tuple for a given output wheel (which does not seem obvious) or, as I do now, must resort to globbing to pick up the wheel name.
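
In the meantime, the globbing can at least be made less fragile with pypa/packaging’s wheel-filename parser (parse_wheel_filename is available in packaging 20.9 and later; the dist/ output directory is an assumption):

    # Recover name/version/tags from whatever wheel landed in the output
    # directory, instead of trying to predict the filename up front.
    from glob import glob
    from pathlib import Path

    from packaging.utils import parse_wheel_filename

    wheel = Path(glob("dist/*.whl")[0])  # assumes exactly one freshly built wheel
    name, version, build_tag, tags = parse_wheel_filename(wheel.name)
    print(name, version, [str(t) for t in tags])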

A simple wheel builder that does not attempt to fetch or build its dependencies (for build- or runtime) and produces output in an easily predicted (ideally, configurable) location, and a complementary installer that knows how to unpack the wheel to an arbitrary root with an arbitrary prefix without trying to fetch dependencies would be most welcome.
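
For what it’s worth, pypa/build and pradyunsg/installer can already be paired roughly like that as libraries. A hedged sketch, where all the paths, the Python version in the scheme dictionary, and the “posix” script kind are assumptions, and build requirements are assumed to be pre-installed since no isolation happens at this layer:

    # Pair pypa/build and pradyunsg/installer: build one wheel, then unpack
    # it under an arbitrary root/prefix. Nothing here fetches dependencies.
    import build
    from installer import install
    from installer.destinations import SchemeDictionaryDestination
    from installer.sources import WheelFile

    # Build exactly one wheel from the current directory into dist/.
    wheel_path = build.ProjectBuilder(".").build("wheel", "dist/")

    # All destination paths below are examples; point them at your buildroot.
    destination = SchemeDictionaryDestination(
        {
            "purelib": "/tmp/buildroot/usr/lib/python3.9/site-packages",
            "platlib": "/tmp/buildroot/usr/lib/python3.9/site-packages",
            "headers": "/tmp/buildroot/usr/include/python3.9",
            "scripts": "/tmp/buildroot/usr/bin",
            "data": "/tmp/buildroot/usr",
        },
        interpreter="/usr/bin/python3",
        script_kind="posix",
    )
    with WheelFile.open(wheel_path) as source:
        install(source, destination, additional_metadata={})

The scheme dictionary is the knob that provides the “arbitrary root with an arbitrary prefix” part, and neither step touches the network.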

2 Likes

I suspect you could propose an addition to packaging to allow a way to override the platform tag for your specific distribution. (Brett knows about my idea here because I’ve pitched it numerous times in the past :wink: ) But other than the platform tag, this is deliberately under the control of the project being built, as they (should) know their compatibility requirements.
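
(For context, packaging already exposes how the supported tag set is computed for the running interpreter, so an override would presumably hook in around here; the output is platform-dependent, and the tag shown is just an example.)

    # List the most-preferred compatibility tags for this interpreter;
    # wheel filenames are matched against this ordered set.
    from packaging import tags

    for tag in list(tags.sys_tags())[:3]:
        print(tag)  # e.g. cp39-cp39-manylinux_2_17_x86_64 on Linux CPython 3.9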

In case it wasn’t clear, I meant linux-sig on mail.python.org (and sorry for not linking from my phone the first time). From the description:

This list brings together representatives and fans of various Linux distributions, addressing issues common to the platform, to make Python-on-Linux more cross-distro consistent and user friendly.

1 Like

I got that :stuck_out_tongue: It’s just that this isn’t Linux-specific; that’s why I want a ML for all Python distributors, not just Linux ones.

1 Like

Wasn’t this also discussed on some mailing list?

Yes :stuck_out_tongue:

Commenting on this specifically: my goal when creating resolvelib was to create two abstraction layers, one to handle the abstract idea of dependency resolution, and another to interface with the stuff specific to Python packaging. The first part became resolvelib. The plan was to implement Passa as the second part, but we (Dan Ryan, Frost Ming, and I) found out that too many things were missing to make this viable—pip’s internals were a mess to work with, and we didn’t have the resources to rebuild almost the entire Python packaging stack from scratch.

This has since changed quite a bit, but there are still many pieces missing. I would say implementing the Python stuff in pip was (and still is) the correct engineering (and project management) decision, especially looking back—we’d likely still be shaving yaks at this point had pip implemented the new resolver as a completely separate library. Shipping the resolver using existing pip parts made it possible to gather early feedback on the resolver.

Oops, this drifted more toward rambling than I anticipated when I hit the reply button. My point is that, yes, the part that actually implements resolution of Python packages should be made a standalone library, and that should be the goal. But implementing the resolver first in pip is the correct path to that goal for many reasons, IMO, and seeing it characterised as the opposite made me sad.

3 Likes

I tried looking for a newer discussion on this topic, but this seems to be the most recent relevant one. I just happened across it by following the rabbit hole from a FreeBSD bug report, so I’m resurrecting it to add some outside perspective (I’m effectively a Linux distribution maintainer for my company).

Currently, everything I package related to Python has included a setup.py file. It sounds like this may change in the future, but I’ve been lucky so far. My current count of packaged third-party Python projects is 252.

I work in a realm that deals with compliance. To simplify our lives when “what have you got installed on that box?” questions come up, we have exactly one way software gets installed: the packaging system of our OS vendor, which in this case happens to be RPM. Each and every Python package we build is managed as an individual RPM, mapped one-to-one (one PyPI sdist to one RPM). Dependencies are maintained at the RPM level; Python-level dependencies are ignored unless they are build requirements, and those are also handled as part of the RPM build.

For me, packaging an RPM involves three basic steps:

  1. Build the python package: python3 setup.py build
  2. Install the package into a “blank” directory for easy packaging. The %{buildroot} macro is a unique empty directory dedicated to this instance of this package build: python3 setup.py install -c -O1 --root=%{buildroot}
  3. Create a manifest of the files to include in the RPM based on what gets installed in %{buildroot}. The RPM build process itself is very good about telling you if you miss any files in the manifest.

The act of creating the initial RPM spec file (the RPM equivalent of setup.cfg) has been automated by me based on the information available on PyPI. Adding dependencies to the spec file is handled manually, by inspecting the setup.py or setup.cfg file for the requires arguments. If any of the dependencies are new, I recursively go through the RPM creation process for those dependencies.

From the sounds of it, any PEP 517 builder such as “build” will work for step 1; it’s just a matter of changing my incantation. The “install” from step 2 is, I think, the part that the OP was attempting to address as a deficiency. Step 3 is unchanged.
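
For what it’s worth, a plausible PEP 517-era replacement for steps 1 and 2 would look something like the following (flag spellings as in current pypa/build and pip; this assumes all build requirements are already installed as RPMs, hence isolation is disabled, and --no-index keeps pip away from the network):

    python3 -m build --wheel --no-isolation
    python3 -m pip install --no-deps --no-build-isolation --no-index --root=%{buildroot} dist/*.whl

Step 3 would still key off whatever lands in %{buildroot}.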

As a distribution maintainer, my “requirements” come down to this: let me install only the Python package of interest into a bare directory structure that mimics the production installation location, with fully “compiled” files (.pyc, .so/.dll) included. Any additional features beyond that (dependency listing, optional dependency fetching, testing, automation) are just potential gravy.

As long as I can build Python itself and use that to start building up one package at a time until I reach my desired state, I’m good to go. Unlike normal distribution maintainers, I do not have to support end users mixing pip with distribution-provided packages. I know that mucks with things, but that’s not my use case. For those use cases I would advocate virtualenv or something similar for the end user. I 100% believe that distribution-provided packages and user-installed packages should be separate. Virtualenv and/or Docker are adequate for providing that separation.

From the distribution standpoint, extra features such as testing are optional. In theory, released packages are already tested upstream. As a distribution maintainer, it is my job to verify that upstream is doing at least the minimal job needed to meet my requirements for including their package. If they aren’t, I have to decide whether I really want their package.

I hope this gives some perspective on at least one more distribution maintainer’s philosophy and requirements when it comes to downstream packaging.

Shawn