Removing setup.cfg and setup.py from the packaging tutorial

Perhaps mention that everything can be configured by hand (with examples) and this doesn’t vary by backend. You could then introduce the fact that some (many?) backends have their own CLI tools which make this easier, and then perhaps illustrate this for each backend you cover in the tutorial?

I think it’s important for beginners to realise that the CLI tools aren’t required in most cases: what matters is the keys in pyproject.toml, though there are helper applications for the boring stuff.
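To make the "by hand" point concrete, here is a minimal sketch of a pyproject.toml written without any CLI helper. The project name, version and other metadata values are hypothetical placeholders, and hatchling is used only as an example backend, not as a recommendation.

```python
import tempfile
from pathlib import Path

# Hand-written configuration. All metadata values are hypothetical
# placeholders; hatchling is just one example of a PEP 621 backend.
PYPROJECT = """\
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "example-package"
version = "0.0.1"
description = "A small example package"
readme = "README.md"
requires-python = ">=3.7"
"""


def write_pyproject(project_dir: Path) -> Path:
    """Write the hand-authored configuration into project_dir."""
    target = project_dir / "pyproject.toml"
    target.write_text(PYPROJECT, encoding="utf-8")
    return target


# Usage: write the file into a throwaway directory and show it.
with tempfile.TemporaryDirectory() as tmp:
    print(write_pyproject(Path(tmp)).read_text(encoding="utf-8"))
```

Nothing here depends on a project management tool: the file is plain text, and the helper applications only save you from typing it.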

The way I see it there are both build backends and project management tools with build backends. If you are already comfortable with project management you might feel overwhelmed by documentation that covers the tool’s environment creation, project upload features etc. in addition to the build backend (which is the information you actually wanted).

2 Likes

Since there is a discussion about adopting the Diátaxis guidelines for documenting Python, I noticed that the Packaging guide is following the same principles.

One of the recommendations for tutorials is to remove all cognitive load about options and alternatives. The idea is to get an unfamiliar reader to complete an educational project (such as building a package from some example code).

The packaging tutorial already diverges a lot from the recommendations by introducing abstract context and explanations. That’s not bad in itself (Diátaxis puts a lot of emphasis on documenting a product, while Python packaging is an ecosystem) but I find it true that the tutorial generates a lot of cognitive load.

The hardest-to-digest part is possibly “Configuring metadata”:

  • it presents two alternatives, static or dynamic
  • it explains each of them, and it compares them
  • it recommends one of the alternatives, before even telling the reader they need to make a decision
  • then it proceeds to document both alternatives (am I supposed to have made my mind up at this point, or should I read on both branches and decide later?)
  • it provides what feels like a compact reference to all the configuration options
  • it makes a mention of my “username” for the first time in the document, in bold characters, suggesting I’m supposed to do something important here, but giving no instructions

All of this comes immediately after the casual mention that this step might be “completely different” if I were to choose a different build system. This is enough load to freeze even experienced folks (I know I did).


My suggestion:

  • Amend the tutorial to adopt PEP 621 and choose any backend. The chosen backend here is not a recommendation. It’s a tool that will unblock a learner so that they can complete the tutorial
  • Mention in the tutorial that “we will use X as a build backend” when build-system is set.
  • Delegate the more ambiguous tasks to guides:
    • New “How to choose a build system” guide: a decision tree mostly about PEP 517 vs legacy, setuptools vs others, and specialized builders; not a place to compare Poetry vs Flit vs Hatch vs the hottest new alternative (but it might be worth mentioning what kind of implication the choice might have)
    • Refresh “Packaging and distributing projects” to specify that it’s setuptools-specific, cover everything that right now is in “Configuring metadata” in the tutorial, update for PEP 517
  • Emphasize “guides” more throughout the documentation, because they have the potential to do a lot of the heavy lifting. My guess is that users come with specific tasks to accomplish, and the tutorial cannot have the burden to cover all of them.

I don’t see how backend agnosticism is a goal that supports learners. The tutorial would be worse with more branching. The intention here seems to be to avoid accidentally blessing one backend over others. If that’s the case, can it be achieved without hindering the first learning experience? Some options:

  1. Inline with the setting of build-backend, explain that it’s a choice and link to a different resource (be it the “How to choose” guide I suggested above, or any discovery resource)
  2. The same, but in a box that is obviously not part of the tutorial itself
  3. Emphasize the part in “Next steps” which already recommends considering alternatives, and link to a dedicated resource
  4. Create and maintain a new minimal backend that implements PEP 621 and can build a wheel, but will never support anything more complex than the tutorial itself. It could be a wrapper on flit_core or hatchling or equivalent, but that would not be apparent to a user.
11 Likes

I agree it is preferable to just pick one that supports PEP 621 for use in the tutorial. At some place there could be a small overview comparing some of the back-ends.

Related discussion suggesting Meson as preferred back-end.

Sorry for the delay, I’ve been busy (and will be a bit busy through next week). You can guess my side - I’m the author of the PR to do this with tabs (from last December). I’m also the main author of scikit-hep/cookie, which supports 11 backend choices largely because of PEP 621. It seems there is a mix of opinions, and I’m not sure there’s a consensus yet.

To me, there are a few important points. First, this is not just a workshop material page for something that gets taught, it’s also a self-education resource, and one of the key ways many users learn packaging. If you are coming from a different language or learning this for the first time, there’s a very good chance you will find this page and learn from it. It’s not just something that is used to support in-person/zoom lessons.

The tabbed display of a few backends is a visual way to clearly indicate exactly what changes when you switch backends without having to go into a bunch of text telling users what is general and what is specific to the selected backend. If you don’t actively click on tabs, you will never see anything but the default - I don’t expect it to cause that much branching. Any other non-visual form will add a bunch of text explaining things about backends. The suggestions above include “include a small overview”, etc; this is text where a visual could work better.
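As a rough illustration of "exactly what changes", the sketch below renders the same minimal project for different backends and diffs the results. The project values are hypothetical, the helper names (`render`, `changed_lines`) are made up for this example, and the three `[build-system]` entries are the ones I understand those backends to document for PEP 621 use.

```python
import difflib

# Backend-specific part: only the [build-system] table differs.
BUILD_SYSTEMS = {
    "hatchling": ('["hatchling"]', "hatchling.build"),
    "setuptools": ('["setuptools>=61.0"]', "setuptools.build_meta"),
    "flit": ('["flit_core>=3.2"]', "flit_core.buildapi"),
}

# Backend-independent part (hypothetical example metadata).
PROJECT_TABLE = """\
[project]
name = "example-package"
version = "0.0.1"
"""


def render(backend: str) -> str:
    """Render a minimal pyproject.toml for the given backend name."""
    requires, build_backend = BUILD_SYSTEMS[backend]
    return (
        "[build-system]\n"
        f"requires = {requires}\n"
        f'build-backend = "{build_backend}"\n'
        "\n" + PROJECT_TABLE
    )


def changed_lines(a: str, b: str) -> list:
    """Return only the added/removed lines when switching backend a -> b."""
    diff = difflib.unified_diff(
        render(a).splitlines(), render(b).splitlines(), lineterm="", n=0
    )
    return [
        line
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


# Usage: show what a reader would see change between two tabs.
for line in changed_lines("hatchling", "setuptools"):
    print(line)
```

The diff only ever touches `requires` and `build-backend`; the `[project]` table is identical in every rendering, which is the point the tabs are meant to make visually.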

Also, the tabs explicitly force this tutorial to remain generic. If you “select” a backend (hatch or setuptools), it can easily become a hatch or setuptools tutorial, instead of a generic packaging tutorial. In fact, the existing PR is one of the driving forces behind setuptools’ PEP 621 backend working “out of the box” with this layout - so that it could be used exactly the same way Flit and PDM (and now Hatch) could be used without tool.setuptools settings. This is exactly what I’d like to see come out of this - backends trying to work without special tool settings for the “basic” packaging needs so they can support this tutorial. When Poetry finally adds PEP 621 support, I’d like it to work out of the box with this tutorial (and maybe even add it to the tabs!).

I’d recommend trying it out; seeing what the problems are, then changing if it doesn’t work. It’s a webpage and can be changed - removing the tabs would be trivial. If there’s a sudden influx of users who say this is way too confusing, change it! In my opinion, the current setup.cfg/setup.py tabs couldn’t be worse.

PS: I’m interested in working on the PEP 621 scikit-build backend for CMake Python extensions. I don’t want to have to write a tutorial telling users how to use pyproject.toml to set the name, etc. I’d much rather just point them at this tutorial; but I can’t do that if it’s a setuptools or hatch tutorial. But if it’s a generic tutorial (and that’s best handled, IMO, by the visual method of having a few backends in tabs, even if my backend is not one of them - it’s now visually clear what is specific and what is not), then I can do that. The same is true for mesonpy and any other backends. I think a generic tutorial serves them better, and I think the tabs are the simplest way to be generic.

5 Likes

Great idea!

Does anyone disagree with taking this pragmatic approach?

1 Like

Is there a way to measure it? My guess is that the typical response to confusion or frustration while reading a tutorial would be to look for a different one, not to complain to the authors. The feedback cycle with this kind of stuff could be pretty slow and not actionable, e.g. people saying that packaging and distribution in Python is unapproachable.

This is the case whichever choice is made, but my suggestion was to stick to proven principles (and maybe experiment somewhere else?) and exclude any kind of choice in the path of the tutorial.

Hence the suggestion to have specific guides alongside the tutorial, and not put all the burden on that single page.

It seems that the organization of the packaging documentation followed this principle originally, but strayed from it over time, adding more and more to the single tutorial page (which used to have only one way to do things) and leaving other parts with obsolete content (the discussion about wheel vs egg is irrelevant today, and the guides seem to take setuptools for granted).

That’s a good point, and it reinforces my belief that certain parts of the documentation are better served by “how-to guides” rather than by the one tutorial page.

1 Like

We get lots of feedback on the packaging-problems tracker, which is linked at the bottom of the page.

People are already saying that. Leaving this page unupdated because we think it might be 5% more approachable if it didn’t have tabs isn’t really helping. I think with tabs it will still be more approachable than the current page and its setup.py/setup.cfg tabs. The point of the tabs is that any one of them works identically, you can just leave it alone if you don’t want choice.

This has been on Redirecting… since before this started. I showed this during a lightning talk at PyCon US 2022. Exactly where else do you want it "experiment"ed?

There already is a choice in the path (setup.py/setup.cfg). You are proposing a change (no choice).

The belief here would be to have custom, one-purpose pages for every use: one tutorial only intended to be worked through, one how-to guide intended to be studied, one reference doc intended to be looked up, and one explanation describing why things are the way they are.

However, there are competing issues born from practicality, and from user experience.

First, the more pages you have, the more work it is to maintain them, and the easier it is for pages to go unreviewed since 2013. A small number of well maintained comprehensive pages is much better than a huge number of specialized pages that are half out of date. The guide page, Packaging and distributing projects - Python Packaging User Guide, is one of over 20 pages, and is not at all easy for a newcomer to digest. And it also needs to be updated, which no one has volunteered to do. Maybe if that page was fully updated to pyproject.toml and made easier to digest, then we could revisit removing the tabs from the tutorial?

Second, by splitting things up more, you are moving the “cognitive load” from a small portion of the page to selecting the page itself. There are a ton of pages (over 20) in guides, and it’s hard to see the difference between the tutorial and the guide and the specification - to me the guide seems much less readable and more incomplete than the tutorial, but not as detailed as the specification; I don’t think I’d ever point someone at it in its current state (and it is setup.cfg based now). There are entire guide pages like using-manifest-in that I’m not sure what we should do with.

Maybe, ideally, the guide could become a more detailed tutorial, and the tutorial could be made even simpler with more links to the guide, and we’d have better separation. But for now, the tutorial (which is linked from the main page) is the main source for starting in Python packaging.

The original pages were likely based on idealistic principles. The current pages are based on experience working with the pages and feedback from the packaging-problems issues, and the lack of volunteers to update all 30-40 pages of material. A small number of well maintained pages is more user-friendly than a large collection of “detailed” pages that are hard to find and outdated. If we want to move (back) in that direction, that collection of pages needs to be updated before we start removing information from the tutorial.

I don’t know if I can name a successful Open Source project that actually keeps 4 copies of documentation up to date and approachable.

2 Likes

Also, having tabs helps in transition. Currently everything else on packaging.python.org is based on setuptools, not hatchling. During the transition time, while all the other guide pages get updated, having the tutorial support both Setuptools and Hatchling helps with that transition.

Why not “experiment” with a no-choice tutorial in the Setuptools or Hatch docs? That’s really where such a tutorial should go anyway, IMO. packaging.python.org being tool-independent seems very much in the spirit of PEP 621/517.

FYI, I had some non-tab ideas if tabs are not supported. One was to have a bit of JS that allows a user to make a link https://packaging.python.org/en/latest/tutorials/packaging-projects?requires=hatchling&build-backend=hatchling.build which would allow any other tools to link to this page in such a way that it actually shows how to do it with their package. The downside is that doesn’t actually visually show a reader what needs to be changed when they just visit the page, which (currently) is important, and much worse as raw text vs. a visual tab. It also would not help authors keep this page generic. The plus side is it would work with all backends that want to provide the link, rather than a selection of them.
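The link idea would be a bit of JavaScript on the page, but the logic is small enough to sketch in Python. The function name and the fallback to hatchling when a parameter is missing are assumptions made for this illustration; only the two query parameter names come from the example URL above.

```python
from urllib.parse import parse_qs, urlsplit


def build_system_from_url(url: str) -> str:
    """Render a [build-system] table from the URL's query parameters.

    The hatchling defaults are an assumption for this sketch; the page
    could use any default backend.
    """
    query = parse_qs(urlsplit(url).query)
    requires = query.get("requires", ["hatchling"])
    backend = query.get("build-backend", ["hatchling.build"])[0]
    requires_list = ", ".join(f'"{item}"' for item in requires)
    return (
        "[build-system]\n"
        f"requires = [{requires_list}]\n"
        f'build-backend = "{backend}"\n'
    )


# Usage: a link that another backend's documentation might provide.
print(
    build_system_from_url(
        "https://packaging.python.org/en/latest/tutorials/packaging-projects"
        "?requires=flit_core&build-backend=flit_core.buildapi"
    )
)
```

A real version would also need to validate the parameters before showing them to a reader, which this sketch does not attempt.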

Another idea would be to have an expandable “details” box with the “tabs” in it (though they might not be selectable tabs, just a linear collection of a few choices). It might not seem like it, but I also want to keep the details of selecting a backend out of the path of the learner. I just think that it’s still important enough that it needs to be there somehow, and that tabs that require a click are the simplest and most visual way to do it, rather than yet more inline text.

I appreciate you engaging pragmatically with the suggestion. Since you already made the PR that started this conversation, and the alternative would require more work that no one has volunteered to do (yet), I see the weight of your points.

My worry is that—no matter which solution is adopted now—there is no long term guidance on how the documentation can improve. Pretty much everyone seems to agree that the current state of the page is not ideal. But there is no consensus on the direction for improvement because there is no agreement on the goals for the tutorial page. Should it:

  • introduce first-time packagers to the ecosystem?
  • be a step-by-step guide that a user can come back to as needed?
  • allow users (new and old) to discover all the alternative tools?
  • document the officially-endorsed way to build packages?

These goals are potentially conflicting for a single page. I realize the issue here is the lack of volunteers, but I’m ready to bet that volunteers would materialize if a direction was clear. Since my suggestion was gathering some very early and very informal approval, would it be welcome if I volunteer a concrete proposal? I cannot maintain this long term, but I can get it started.

So a plan would be to start with https://github.com/pypa/packaging.python.org/pull/1031 (assuming there is consensus to move to PEP 621 now) which is clearly a net improvement over the state of things, and move towards a different structure if people (including myself) work on it later.

I have some nitpicks and comments about the other points that I wish to clarify but are maybe not part of the main discussion:

other comments

I don’t agree the feedback on the tracker is “a lot”. It’s not a significant number, and the tickets seem to be biased towards people stuck because running a command got unexpected results on their environment. My experience is that this is not a good sampling (it suffers from heavy selection bias), and it doesn’t account for people who got frustrated and gave up.

That was my point. People have been saying that, they keep saying that, and that kind of “meme” (in the proper sense of a transmittable unit of culture) tends to be sticky and persistent. It would take years to revert the trend, even if the packaging guide got perfect overnight and the standardization efforts proved ideal.

There is no way to establish that one change improved the state for the community of users. It’s neither sound nor realistic to only react to feedback. The whole argument was in favor of adopting guiding principles (to be clear: that was before you raised the practical objection that it takes unreasonably more work to maintain).

I certainly did not argue in favor of not updating the page, and I’m not sure anyone did. I choose to interpret this as a worry of paralysis by analysis, that is, we keep discussing instead of accepting one option. But there was already a concrete alternative proposal (PR 1085) and more came up. The discussion seems to still have value, even after PR 1031 or an equivalent proposal gets accepted.

This carries the implication that adding tabs comes at no cost, but I disagree. There are at least two competing semantics for tabs as used in the tutorial right now:

  1. allow users from different environments to approach the tutorial, because the difference is irrelevant to the lesson (Unix vs Windows)
  2. present different choices that affect the lesson, and that the user should decide upon, but are deemed to be indifferent with regard to the outcome

You can decide to conflate them for practical reasons, but these are not the same. The first one is generally inevitable. Adding a second one dilutes the meaning of tabs, so it weakens overall clarity. Also, a user would still have to expand all tabs to understand why the options are there. They cannot decide to “leave it alone” if they don’t know it doesn’t matter; and they cannot know it doesn’t matter if the guide very clearly suggests that a choice is necessary, and they don’t know the characteristics of the alternatives.

Again, this was my point. What I call “experiment” is documentation that doesn’t follow a previous framework, but is based on the strong intuition/opinion of its authors. A separate space might get a better opportunity to collect richer info about its impact, e.g. if it gets linked more often than the official guide. In a way, this is what happened to the packaging guide when it de-facto replaced the setuptools/distutils official documentation. Maybe my point here was that “experiments” deserve to be named and treated as such.

I don’t believe in purity of separation, but I do believe that different users have different needs, and to try to satisfy all needs with a single resource is not going to satisfy anyone’s needs. One-purpose pages might not be realistic (nor desirable?), but a slightly more structured separation between introductory material and task-oriented guides might be viable. No need to strawman.

Agreed.

I don’t think this is what is meant by cognitive load. If you are exploring the documentation to find the resource which better suits your needs, your challenge is not cognitive load but something we could call “discoverability”. On the other hand, if you are following a tutorial, you get the best results if you manage to get through completion so that you can move on with your learning.

Discoverability can be helped by suggesting a path, e.g. “these are the problems involved in packaging”. Discoverability would also benefit from context-specific links, both internal (within the guide) and external (whenever you link to a specific resource for a purpose), both for humans and search engines.

For “starting”, yes, but not necessarily to accomplish every single packaging-related task. I know this is revisiting the whole discussion and I won’t rehash every single point. Only one point: the tutorial is not the whole packaging guide. It’s a critical resource which has served an immense role in the community, but we can’t assume that this is going to be its role forever.

I’m sorry but this feels dismissive. Of course experience has a massive weight. But the current state of the page is not good, and you cannot attribute that only to lack of volunteers. It’s very well possible that bad decisions were taken, or the direction was unclear, or the “idealistic principles” were disregarded. I’m walking on a dangerous ledge and I don’t want to criticize anybody’s work; but I don’t think that striving to improve is the same as disregarding the work and experience of people who contributed to the guide, which seems to be one underlying implication you are making. I apologize if that’s not what you had in mind at all, but it’s one possible reading that I wanted to challenge.

Agreed. I don’t want to remove information.

It can be tool-independent in the sense of PEP 621, which is what everyone in this discussion wishes for. But I don’t agree this implies trying to cover multiple tools.

2 Likes

I say go with what Henry has created and we iterate as necessary. He’s done the work and I think everyone agrees it’s an improvement, so I would rather not let striving for perfection hold us up any longer.

Unfortunately, I don’t hold that same enthusiasm. I would say Henry’s work to do this tutorial rewrite is showing what people want (and Henry does have experience in teaching folks packaging, so I do trust his judgment).

8 Likes

I’m willing to (slowly) help improve the guide pages, and I think any improvement is welcome. But it’s not just getting a contributor to contribute, it’s also reviewing PRs and getting the change in. @bhrutledge has been a huge help here. Still, getting anything larger than a small change in can be quite a lengthy process - that doesn’t encourage these potential contributors. If the guide pages get updated, filled out, and nicely cross linked, I think we can try having a better separation of pages and reduce the tutorial in favor of more cross-links and a better link from the main page. But I’m a huge believer in iterative development - a massive PR is much harder to sell than a bunch of tiny ones, and the same thing is true for docs. I think we can iterate to a better place, and it’s okay to iterate. This is an iterative path forward. And if the tab choice really does seem to have a negative impact, it can be removed.

I’m also okay with reducing the choices. Just Hatch and Setuptools would still fulfill the most important purpose I’m aiming for, visually showing exactly what is interchangeable. I chose to also include Flit because that’s the third PyPA PEP 517 backend and it was originally the only one that was working, and I chose PDM specifically because it’s not in the PyPA. Happy to take opinions (though AFAICT everyone is either in favor or not in favor, and no one is that worried about which choices are there).

I do agree with many of your points, and I think we are actually trying to get to the same goal - I just think our approach to the end is different, and we are viewing the learner a bit differently (to be fair, I’m heavily in the sciences so could be seeing a biased selection).

Other Comments

My point here was we used to have a simple tutorial during the period where people started saying this was unapproachable. They didn’t just start saying this because choices appeared, they were saying it because the original choice was painful and problematic. Now we have three choices (setup.py, setup.cfg, pyproject.toml), and I think each one was better than the last, though people have really started to complain about the choices - but complaining about choices doesn’t mean you should just stick to one bad choice. And people are already complaining about having backend choices, but I think that’s (mostly) healthy right now.

You said this should be experimented on elsewhere. I experimented on it elsewhere. And you said that was your point. I should not have tried it first elsewhere? I’m sorry, you lost me on this one.

You don’t learn by copying things without understanding. I can easily provide a copy-paste example (from a cookiecutter, for example!) but that’s not a tutorial. You are trying to teach. The goal is to get readers prepared for what they actually want to do (make some package), not to be able to copy-paste and get something to run (that’s important, you want it to work! But it’s not the point, teaching the basics of packaging is the point). I consider knowing what is safe to change to be an important part (at least currently) of packaging. I think the “basic” learner will complete this with hatchling and be happy. And then if they go on, they will need to switch it to setuptools because everything else on the site is still written for setuptools. :person_facepalming: If we rework the rest of the pages to not be setuptools specific, maybe that will go away. If we move toward a better guide page, then maybe we can move this “next step” over to it. Etc. But for now, we need to be sure users learn what is interchangeable.

My point is that the current tutorial including the current choice was built over experience and is based on how people are interacting with this page today. I don’t think it’s wrong to set a goal, but I think the goal should be based on improving the existing guide pages instead of blocking an improvement on the existing tutorial page. For example, a couple of changes a year or so ago increased the number of issues open on packaging-problems, but it was something people needed to learn, so that was not “wrong”, the goal isn’t to avoid mistakes, it’s to teach.

Sorry if I sounded combative at all, I didn’t mean to be, but I’ve been trying to get this updated for over six months now, and it’s been held up primarily due to differing opinions on this one tab section, and I’m not the only one who likes it.

2 Likes

Thanks everyone for the thoughtful discussion. My takeaways:

  1. There’s universal desire to modernize the tutorial
  2. A majority of folks think hatchling is a good default
  3. Opinions remain mixed on if/how to represent multiple backends, w/ good arguments for both
  4. Given the effort put into PR 1031 to implement 2 and 3, it’s worth merging to satisfy 1

I’m going to do a fresh review of 1031, with the aim of pushing copy-edits and merging in the next week.

5 Likes

One question that’s come up in this thread is: how do we find out what users need from documentation and test new approaches to learn whether they are more successful? It’s worth skimming past efforts in this area, e.g. UX Research Results - pip documentation v22.0.dev0, the user experience tests done in 2020 during the pip work.

I’m a beginner to Python packaging, and have had a hard time getting some basic things working. I appreciate all the docs that are around these days – very comprehensive and well written, thanks! My struggle has been with the changing landscape and wrestling with docs in different states of ‘currentness’. I’d like to offer a couple of thoughts on this discussion:

  1. As a beginner, I wanted to keep things simple and get my project working with setuptools first, before trying to learn another tool. I recommend keeping setuptools first amongst the examples; but I’m very happy to see other tools showcased, so I can learn about them (I might even switch to one of them if I see that the configuration method makes more sense to me.)
  2. I really love the tabbed layout on many of the pages where setup.py and setup.cfg are both illustrated. My project was established a long time ago with setup.py, and having read that .cfg is preferred, I was working to switch to use setup.cfg only. I ran into difficulties on quite a few pages where it still only shows setup.py and I couldn’t see how to translate a dictionary parameter (package_dir) into the format required in setup.cfg. So, getting those updated would be really helpful, too.
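For the `package_dir` case specifically, the translation is mechanical: as far as I know, setup.cfg expresses the dictionary as indented `key = value` lines under `package_dir`, with the empty key written as a bare `= directory`. The helper below is a hypothetical sketch of that mapping, not something setuptools provides.

```python
import configparser


def package_dir_to_cfg(package_dir: dict) -> str:
    """Render a setup.py package_dir dict as a setup.cfg [options] snippet.

    Hypothetical helper for illustration only.
    """
    lines = ["[options]", "package_dir ="]
    for package, directory in package_dir.items():
        # The root package uses an empty key, which becomes "= src".
        entry = f"{package} = {directory}".lstrip()
        lines.append("    " + entry)
    return "\n".join(lines) + "\n"


# Usage: the common src-layout mapping, package_dir={"": "src"}.
snippet = package_dir_to_cfg({"": "src"})
print(snippet)

# Sanity check: the snippet parses as a multi-line INI value.
parser = configparser.ConfigParser()
parser.read_string(snippet)
print(repr(parser["options"]["package_dir"]))
```

In a real setup.cfg this would normally sit next to a `packages` setting in the same `[options]` section.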

Yes, as a learner, I want the tutorial to be generic, giving an overview of the process/system. But, I also want the specific examples so I can actually apply it to my project; the code/file samples are essential, and work well in tabs where there are multiple ways of doing something.

3 Likes

So this would suggest that while you’re a beginner to Python packaging, your project is not. :wink: Since your project already made a packaging decision, that somewhat skews your needs from someone who is new all-up: Python, packaging, and project. The tutorial under discussion is more aimed towards this latter group as I would assume that if your project preexists, then you have some support from whomever created the project to begin with.

2 Likes

Yes, you’re right that I’m not completely new to Python stuff, but I’m the one who created the project some years ago, so my support person is just as ignorant as me :slight_smile:. I configured a setup.py back then, which built my Sphinx docs, and that’s all–I distributed the project as a zip file. Now, I’ve split off the backend into a library, which I want to release properly on PyPI. Next will be to make a proper install setup for the main app.
In any case, these kinds of tutorials are also useful for people joining an existing project and trying to understand how packaging works… just hoping to bring some fresh-eyes perspective to this discussion.

I consider myself a beginner when it comes to Python packaging too - I followed the current version of the tutorial a few months ago to build my first Python package. I just wanted to add to this discussion that for newcomers, it might be best to still default to setuptools for the simple reason that it is the preinstalled ‘default’. I suspect that for complete beginners, it would perhaps be confusing and a little annoying when the first tutorial they read strongly suggests some other tool by making that the default selection. I already have setuptools, why do I need to install something else that fulfills the same purpose? I honestly don’t think that the more verbose output of setuptools is a huge issue, at least for me I just ignored it :smile: And after Implement PEP 660 allowing both "strict" and "lax/loose" approaches by abravalheri · Pull Request #3265 · pypa/setuptools · GitHub is merged, I guess there is not that much difference between a setuptools pyproject.toml-only config and the alternatives?

1 Like

This isn’t necessarily the case any more (in fact, it never was de jure, only de facto), at least as far as

is true and these standards are fully adopted. Setuptools isn’t part of Python itself or its official distributions, and distutils is long-deprecated and finally being removed in 3.12. Furthermore, it doesn’t really matter what actually happens to be installed in your environment, since whatever backend you’re using is going to be downloaded and installed in the isolated PEP 517 build environment anyway, and you don’t ever have to install it directly unless you want to use it as a frontend/etc. (instead of pip and other standalone tools).
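To sketch what "downloaded and installed in the isolated build environment" hinges on: a frontend decides what to install from the `[build-system]` table, not from what is already in your environment. The function below is a simplified illustration working on an already-parsed pyproject.toml (a plain dict), and the fallback values reflect my reading of PEP 517/518 for projects without that table; real frontends such as pip may pin different fallback requirements.

```python
# Fallbacks for projects that do not declare a build system, as I read
# PEP 517/518 (an assumption of this sketch; real frontends may differ).
LEGACY_REQUIRES = ["setuptools", "wheel"]
LEGACY_BACKEND = "setuptools.build_meta:__legacy__"


def resolve_build_system(pyproject: dict):
    """Return (requirements to install, backend to import) for a build.

    `pyproject` is the parsed content of pyproject.toml, or an empty
    dict when the file is absent.
    """
    table = pyproject.get("build-system", {})
    requires = list(table.get("requires", LEGACY_REQUIRES))
    backend = table.get("build-backend", LEGACY_BACKEND)
    return requires, backend


# Usage: a project that opts into hatchling, and one with no declaration.
declared = {
    "build-system": {"requires": ["hatchling"], "build-backend": "hatchling.build"}
}
print(resolve_build_system(declared))
print(resolve_build_system({}))
```

Either way, the requirements in the first element are what would be installed into the temporary environment, whether or not setuptools happens to be present on the machine.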

1 Like

Do you mean that there are concrete plans to not have setuptools included in standard Python installations as well as on every environment created with python -m venv by default (at least on Mac and Windows)? I’m aware that setuptools is not part of Python itself and never was the de jure standard, but as long as it is pre-installed in this way for the majority of users I don’t think that really matters from a user perspective.

With setuptools adopting PEP 660, and with a pyproject.toml-only setuptools configuration not being very different from the alternatives (at least for simple, Python-only projects), I think there needs to be a really, really good reason to implicitly tell beginners “Hey, instead of using the app you already have installed (setuptools) you might want to install hatch/flit etc”. Just to be clear, flit, hatch and other alternatives are great and I like the idea of showcasing them in the tutorial. I only think that for beginners, it’s more straightforward to first showcase the tool they most likely have installed already.

1 Like