Building distributions and drawing the Platypus

Sorry @cjerdonek! I meant @pradyunsg’s original question:

and the 3.5 options he offers in the original post in the thread.


I agree that we should figure out the right UX here. And @pradyunsg’s options seem about right.

However, I think there’s another (less UX-focused) aspect to consider, which is whether we put the build logic into “the tool”, or do what we’ve been trying to do in other areas, which is to make a reusable library, and then simply call that library from the “official” tool. That would probably mean putting work into making pep517 more robust and complete, and in particular making it the canonical place where “setting up a build environment” logic is implemented.

This I disagree with. If you’re doing <whatever tool> build foo, you could quite easily be building a wheel for foo to be used across multiple machines, possibly via multiple installers. So it 100% shouldn’t matter what installer is used to set up the build environment.

What is important (and I think this is what you were intending) is that the user should have an easy way to configure the options needed for that installer to run, and ideally those options should be automatically picked up from the config options that the user’s “normal” installer uses.

That may mean standardising an “installer configuration” format, or it may mean that the build tool needs a UX to say “use this installer for the build environment”. That’s up for discussion. (And yes, I do anticipate the possibility that if we make pip the build tool, someone will want to do something like pip build foo --isolated-env-installer=poetry :slight_smile:)

Maybe, but it would need a lot of work. I’ve already given up on using that library for tox. In its current form I find it makes caching and reusing build environments way too hard. I’m not sure if we can truly make it work for all. In its current form it is mainly targeted at CLI usage and the pip use case.
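For what it's worth, one possible shape for "caching and reusing build environments" is to key each environment on the inputs that determine its contents. The sketch below is purely hypothetical: `build_env_cache_key` and the keying scheme are assumptions for illustration, not something pep517 or tox provides.

```python
import hashlib
import sys


def build_env_cache_key(requires, python=sys.executable):
    """Derive a stable key from the interpreter path and the build requirements."""
    # Normalise so that ordering and stray whitespace do not change the key.
    normalised = sorted(" ".join(req.split()) for req in requires)
    payload = "\n".join([python, *normalised])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]
```

A tool could then reuse the environment stored under that key if it exists. Note the obvious gap: unpinned requirements can resolve to different versions over time, so a real scheme would also need some form of invalidation.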

Fair enough. I’ve put forward my perspective on this in a different thread (Build Environments for PEP 517) since this discussion is definitely OT for this thread, given how it has evolved.

Fair point - but would you agree that the logic involved is complex enough that we need some form of library to handle it? Or is your view that tools should decide to what extent they handle build isolation for themselves, and implement it internally? PEP 517 itself describes build isolation as something that tools SHOULD implement, not as a requirement, so there’s definitely a case for making it per-tool.

Since @dstufft has already sketched the semantics (which is the harder part) for transforming pip into a universal package manager, today I have tried to think about the syntax (which is the simpler part) and the various command inputs and steps, based on his thoughts:

So given that, a unified tool would need to, at a minimum, enable:

  • Allow installation and uninstallation of projects into an environment.
    • This includes commands for repeat deployments into multiple environments.
  • Introspect the current state of installed packages, and get more information about them.
  • Tools to produce a local directory of artifacts for offline usage.
    • This includes downloading from PyPI, producing wheels, and producing sdists. Basically an analogy of “install, but save to a local directory”.
  • Allow upload of existing artifacts.
    • This could also possibly include command line support for things like yanking a file, managing a two phase release, etc that we might add in the future.
  • Build artifacts such as an sdist from a directory, or a wheel from a sdist.

I think that’s it for what an MVP of such a tool would look like?

There are also additional features that one could imagine it having, such as:

  • Enable a “binary only” install mode ala pipx.
  • Ability to bootstrap a project, potentially from a user supplied template ala cookiecutter.
  • More commands to manage the development lifecycle such as a lint command, a test command, or a format command that would paper over differences between linters, test harnesses, or formatting tools.
    • This could really be expanded quite far, things like building documentation, running benchmarks, etc.
  • A generic “run a command” feature ala tox or npm run.
  • A way to run a command from a project in development, ala cargo run.

For the moment I have only considered the minimal commands of his first paragraph. Here is my attempt.

Installing

$ pip install {project}
  • wheel -> download -> wheel + wheel dependencies -> install
  • wheel -> download -> wheel + wheel or sdist dependencies -> build -> wheel + wheel dependencies -> install
  • sdist -> build -> wheel -> download -> wheel + wheel dependencies -> install
  • sdist -> build -> wheel -> download -> wheel + wheel or sdist dependencies -> build -> wheel + wheel dependencies -> install
  • repo -> build -> sdist -> build -> wheel -> download -> wheel + wheel dependencies -> install
  • repo -> build -> sdist -> build -> wheel -> download -> wheel + wheel or sdist dependencies -> build -> wheel + wheel dependencies -> install
  • download -> wheel + wheel dependencies -> install
  • download -> wheel or sdist + wheel or sdist dependencies -> build -> wheel + wheel dependencies -> install
$ pip uninstall {project}

Inspecting

$ pip inspect {project}

Fetching

$ pip fetch --wheel {project}
  • download -> wheel
  • download -> sdist -> build -> wheel
$ pip fetch --sdist {project}
  • download -> sdist
$ pip fetch --any {project}
  • download -> wheel or sdist
$ pip fetch --wheel --dependencies {project}
  • wheel -> download -> wheel + wheel dependencies
  • wheel -> download -> wheel + wheel or sdist dependencies -> build -> wheel + wheel dependencies
  • sdist -> build -> wheel -> download -> wheel + wheel dependencies
  • sdist -> build -> wheel -> download -> wheel + wheel or sdist dependencies -> build -> wheel + wheel dependencies
  • repo -> build -> sdist -> build -> wheel -> download -> wheel + wheel dependencies
  • repo -> build -> sdist -> build -> wheel -> download -> wheel + wheel or sdist dependencies -> build -> wheel + wheel dependencies
  • download -> wheel + wheel dependencies
  • download -> wheel or sdist + wheel or sdist dependencies -> build -> wheel + wheel dependencies
$ pip fetch --sdist --dependencies {project}
  • sdist -> download -> sdist + sdist dependencies
  • repo -> build -> sdist -> download -> sdist + sdist dependencies
  • download -> sdist + sdist dependencies
$ pip fetch --any --dependencies {project}
  • wheel -> download -> wheel + wheel or sdist dependencies
  • sdist -> download -> sdist + wheel or sdist dependencies
  • repo -> build -> sdist -> download -> sdist + wheel or sdist dependencies
  • download -> wheel or sdist + wheel or sdist dependencies

Publishing

$ pip publish --wheel {project}
  • wheel -> publish
  • sdist -> build -> wheel -> publish
  • repo -> build -> sdist -> build -> wheel -> publish
$ pip publish --sdist {project}
  • sdist -> publish
  • repo -> build -> sdist -> publish

Building

$ pip build --wheel {project}
  • sdist -> build -> wheel
  • repo -> build -> sdist -> build -> wheel
$ pip build --sdist {project}
  • repo -> build -> sdist

Notes. — For all the above commands:

  • The --wheel option is the default.
  • The --wheel and --sdist options can be combined.
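The two notes above could be sketched with argparse as follows. This is only an illustration of the option semantics; `pip build` with these flags is the hypothetical command from this proposal, and `make_parser` and `requested_formats` are made-up names.

```python
import argparse


def make_parser():
    # "pip build" is the proposed (not existing) command from the list above.
    parser = argparse.ArgumentParser(prog="pip build")
    parser.add_argument("project")
    parser.add_argument("--wheel", action="store_true", help="produce a wheel")
    parser.add_argument("--sdist", action="store_true", help="produce an sdist")
    return parser


def requested_formats(argv):
    """Return the set of artifact formats requested on the command line."""
    args = make_parser().parse_args(argv)
    formats = {name for name in ("wheel", "sdist") if getattr(args, name)}
    # Neither flag given: fall back to the proposed default, --wheel.
    return formats or {"wheel"}
```

With this, `requested_formats(["foo"])` gives only the wheel, while passing both flags gives both formats.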

All the above commands can build except the inspecting command.
All the above commands interact with the network except the inspecting and building commands.
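The build and publish chains listed above follow one pattern: an artifact can only move "forward" from repo to sdist to wheel, with a build step between each stage. A small sketch of that pattern, with hypothetical names, might look like this:

```python
STAGES = ("repo", "sdist", "wheel")  # the order in which artifacts are derived


def build_chain(source, target):
    """Return the steps needed to turn `source` into `target`."""
    i, j = STAGES.index(source), STAGES.index(target)
    if i > j:
        raise ValueError(f"cannot produce {target} from {source}")
    chain = [source]
    # Every later stage is reached through one more build step.
    for stage in STAGES[i + 1 : j + 1]:
        chain += ["build", stage]
    return chain


def publish_chain(source, target):
    """Same as build_chain, followed by the publish step."""
    return build_chain(source, target) + ["publish"]
```

For example, `" -> ".join(build_chain("repo", "wheel"))` reproduces the second bullet under `pip build --wheel`. The install and fetch chains add dependency downloads on top of this and are not modelled here.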

Most of the commands look fine at first glance, although I suspect different ideas would appear when (and only when) they actually get implemented and used in the wild. One particular part I don’t really understand is inspect, though; I didn’t see it mentioned up-thread, and there’s no explanation of what it does.

Also, linking back to When you kick the packaging hornet's nest on Twitter, the hornets seem to want an opinionated, KISS solution, there are some concerns (mine included) about implementing the Platypus in pip. There are multiple reasons, including 1. the current design and implementation deviate too much from the proposal, and 2. most pip contributors do not want it to fit the proposed role.


Edit: This thread also kind of overlaps with Developing a single tool for building/developing projects (which is split from the Twitter hornet nest thread).


I was again wondering where exactly we are at right now toward this Platypus thing, and ended up drawing this graph:

Blocks marked green are components we already have; the others are yet to be standardised (they have non-official competing solutions, or are not even built yet). All blocks besides pip install and PEP 518 are clearly needed by only one of package development and application development, which is a strong indication to me that we probably want two tools (or one tool with two distinct aspects), one for people releasing packages, and one for people installing them (package developers probably need both for their development workflow, but wouldn’t be using them at the same time).

Some personal thoughts:

  • There is a lot of talk about a new manifest and lock file format (i.e. declaring dependencies in pyproject.toml), but those alone won’t solve the problem, and can wait until other things are solidified.
  • We are close to a universal package development frontend a la Flit’s command line interface. The only essential missing part is editable installs; others (e.g. incremental builds, extrapolating with external build tools) are all backend-only and can be improved incrementally.
  • Interpreter discovery (how to find Python interpreters in the host system) is a vastly underspecified space, especially on POSIX. There are multiple efforts now, including PEP 397 (py.exe on Windows), @brettcannon’s Python launcher, and virtualenv 20.0’s Python discovery, but at some point some universal rules are needed for components to interop.
  • Given the relatively steep learning curve to virtual environments, the tool probably needs to hide the implementation behind some abstractions. Interpreter discovery from the previous point would help, but it still leaves the topic of how to manage multiple environments. I’ve been experimenting with things in this area, but more interest is needed.

Edit (2020-02-19): I added a new branch, tool management, which is basically to standardise (and improve if necessary) what pipx currently does. I feel most people already agree it is a good way to install Python applications (e.g. Black) if we continue to deploy them on PyPI, so what’s left is not much different from the virtual environment management thing (described in the next few messages), so tools can interop instead of relying on pipx (or whatever we standardise it into) to support every possible use case.


Can you expand on this? Not sure I follow.

My interpretation was that this is about having tools that manage virtual environments for you. Things like tox/nox would seem to fit into this area to an extent, as would tools like pipenv that take the idea of the user having a “project” directory and manage the environment associated with that project for you. I assume that @uranusjr’s own pyem would be another example of this, and maybe even “thinner” wrappers like pew and virtualenvwrapper.

Personally, I can definitely see the need for some sort of tool here - I mostly just use “raw” virtualenv/venv, but I do find that the admin around managing environments is annoying. But I’ve never yet really tried to pin down exactly what I’d want from such a tool - which is what I took @uranusjr as referring to when he said “more interest is needed”.

Yup, that’s what I mostly have in mind. To back up a little, a lot of modern Python users (coming from other ecosystems) hard-couple the concepts of project and environment. This leads them to feel that the activate/deactivate step is redundant. But people used to virtual environments (myself included) most definitely want to keep being able to switch between environments within a project. Both camps have their tools that expose the appropriate abstraction. That’s fine.

The problem, from what I can tell, is that tools don’t interop. You have Pipenv and Poetry if you only want one environment, Tox and Nox if you want multiple environments, but the former don’t do task runner things well, and the latter don’t do dependency management; Pipenv doesn’t install packages into a Tox-managed virtualenv (without hacks), and Tox can’t run tasks with a Pipenv-managed virtualenv. This leads to tension between the two kinds of users, and virtualenv gets a bad rep because it (being the shared implementation between both usages) seems not up to the needs.

One way to consolidate is to define standardised locations to place virtual environments within a project (my PyEM is trying to figure out a good scheme), and let tools share environments and complement each other. So users don’t need to know the underlying virtual environments that make Nox and Pipenv work, and pipenv install would satisfy whatever nox needs to run its task.
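To illustrate what "standardised locations" could mean in practice, here is a minimal sketch in which every tool looks for environments under one agreed directory inside the project. The `.venvs` directory name and both function names are assumptions made up for this example; this is not PyEM's actual scheme.

```python
from pathlib import Path

ENV_ROOT = ".venvs"  # hypothetical directory name, not an agreed standard


def env_path(project_dir, name):
    """Return where the environment called `name` would live in the project."""
    return Path(project_dir) / ENV_ROOT / name


def list_envs(project_dir):
    """Return the names of directories under the shared root that look like virtual environments."""
    root = Path(project_dir) / ENV_ROOT
    if not root.is_dir():
        return []
    # A pyvenv.cfg file marks a directory as a virtual environment (PEP 405).
    return sorted(p.name for p in root.iterdir() if (p / "pyvenv.cfg").is_file())
```

If both a task runner and a dependency manager agreed on `env_path`, either could create the environment and the other could find it without recreating anything.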


IOW have a way to say, “this is my Python 3.8 virtual environment, tools, so everyone who needs that version of Python should just use this virtual environment”? That way recreation is unnecessary and tools can just piggyback on some other tools that previously created the virtual environment?

I have tried to structure the Python Launcher for UNIX to eventually be turned into a wheel on PyPI so that there’s only one instance where the discovery logic needs to be implemented. So my hope is that once I magically find the time to hit 1.0 with the Launcher for CLI usage, I want to then work at making it a PyPI package for those that need the discovery aspects. That, though, will require making it a universal launcher even on Windows, which is a bit more work.

Talking with my Python extension for VS Code hat on, we find a lot of users have no idea about virtual environments, so when we recommend it we are recommending a new concept to them (same goes for when we tell conda users to create a conda environment; a lot of users just abuse the base environment). There is definitely an educational step here of letting people know that virtual environments exist. After that it’s managing them (I had a twitter thread about naming the directory you install into and boy were there varying opinions!).

Yup. Practically it’s more complicated since you’d also have to deal with 32/64-bit, platforms (for e.g. WSL), ABI, Python implementation, etc. The greatest missing piece here IMO is how tools can resolve to the same one when the user asks for simply “3.8” or “PyPy” when multiple virtual environments are created, i.e. some kind of identifier for an interpreter. Maybe we can reuse some knowledge from wheel tagging, but I’m unfamiliar with that area.

As a side note: This is also one of the larger roadblocks hit by PEP 582 __pypackages__; I think any improvements to the current virtual environment usages would need to solve it first.
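As a rough idea of what "some kind of identifier for an interpreter" could contain, the sketch below collects the dimensions mentioned above (implementation, version, bitness, platform) for the running interpreter using only the standard library. The function name and the string layout are assumptions; no such identifier format has been agreed on.

```python
import struct
import sys
import sysconfig


def interpreter_identifier():
    """Describe the running interpreter as implementation-version-bits-platform."""
    impl = sys.implementation.name                    # e.g. "cpython" or "pypy"
    version = "{}.{}".format(*sys.version_info[:2])   # Python language version
    bits = struct.calcsize("P") * 8                   # pointer size in bits
    platform = sysconfig.get_platform()               # platform string as reported by sysconfig
    return f"{impl}-{version}-{bits}-{platform}"
```

This deliberately leaves out the ABI, which is where borrowing from wheel tags might become necessary.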

That’s also an issue with the Python Launcher (which I have been avoiding by not caring about bitness :grin:).

Probably not because I don’t think most people are very familiar with what cp38 means or would want to bother saying py38 when all they care about is the version. Really all people care about is Python version, occasionally which interpreter, and very rarely bitness (and does everyone remember how bitness is specified on the various OSs for wheels? I wrote the code and I can’t :wink:). But having said that I’m sure someone is now going to tell me they have a use-case of virtual environments for the same Python version, interpreter, and bitness but with different dependencies installed. :wink:

But yes, coming up with a general naming/reference scheme would solve that issue. It might require storing more details in pyvenv.cfg, but that’s not a huge issue.
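Reading such details back out of pyvenv.cfg is cheap, since the file is a list of `key = value` lines. A minimal sketch, with a hypothetical function name:

```python
from pathlib import Path


def read_pyvenv_cfg(env_dir):
    """Parse the key = value lines of an environment's pyvenv.cfg into a dict."""
    result = {}
    text = (Path(env_dir) / "pyvenv.cfg").read_text(encoding="utf-8")
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that are not key = value pairs
            result[key.strip().lower()] = value.strip()
    return result
```

The standard library's venv already writes keys such as `home` and `version`; any extra key a naming scheme introduced (say, a hypothetical `bitness`) would be picked up the same way.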

They only care about version until they install a package that has a native module in it. Then suddenly they care a lot about platform and bitness :slight_smile:

virtualenv’s built-in discovery mechanism supports both Unix and Windows, and currently lets users specify: implementation (CPython vs PyPy), version (major/minor/patch), and bitness (32 vs 64). tox already had all this, and to some extent virtualenv too, so the code was consolidated into virtualenv entirely. In theory it could be its own package :man_shrugging:

https://virtualenv.pypa.io/en/stable/user_guide.html#python-discovery
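For readers who have not seen such a specifier, here is a simplified parser for strings made of an optional implementation name, an optional dotted version, and an optional `-32`/`-64` suffix. It is only an approximation of the format described in the linked docs, written for illustration; the names `InterpreterSpec` and `parse_spec` are made up, and virtualenv's own parsing handles more cases (paths, for one).

```python
import re
from typing import NamedTuple, Optional

# implementation (letters), version (up to three dotted numbers), bitness suffix
SPEC = re.compile(
    r"^(?P<impl>[a-zA-Z]+)?(?P<version>\d+(?:\.\d+){0,2})?(?:-(?P<bits>32|64))?$"
)


class InterpreterSpec(NamedTuple):
    implementation: Optional[str]  # None means "any implementation"
    version: tuple                 # () means "any version"
    bits: Optional[int]            # None means "any bitness"


def parse_spec(text):
    """Parse a specifier such as "3.8", "pypy3" or "cpython3.8-64"."""
    match = SPEC.match(text)
    if not text or match is None:
        raise ValueError(f"not a valid interpreter specifier: {text!r}")
    impl = match["impl"].lower() if match["impl"] else None
    if impl == "python":  # treated here as "no preference for an implementation"
        impl = None
    version = tuple(int(part) for part in match["version"].split(".")) if match["version"] else ()
    bits = int(match["bits"]) if match["bits"] else None
    return InterpreterSpec(impl, version, bits)
```

So a request for simply "3.8" leaves implementation and bitness open, while "PyPy" pins only the implementation.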

I couldn’t find docs on how tox handles this. Is there a page listing how to specify the varieties of options?

Probably the first step is to decide on what format we want to standardize on for specifying what interpreter you want and get that written down in a PEP or something. Once we have that then we can make sure virtualenv, tox, Python Launcher, etc. all support the same format and the Python-based ones can all standardize on a library implementing it (Python Launcher probably can’t as it needs to be in C/Rust and I don’t think tox or virtualenv want an extension module dependency).

Ok, I’ll admit that I didn’t actually read the whole thread, but one thing I read over and over is “should pip do everything?” And while I think this is a good question (e.g. wheel is an external package that demonstrated enough utility to become the de facto distribution format), I think the very fact that pip implements a PEP 517 interface with setuptools already suggests that it should do everything. Or at least, it should be able to build from source.

I gotta say folks, PEP 517 feels like a godsend. I’ve been trying to figure out how to package python packages “the right way” for the better part of 2 years now, and it’s not very easy to figure out. It took me longer than it should have to realize that wheels are the way to go. It also took me forever to figure out that setuptools is in fact the preferred way to package. PEP 517 makes 2 affirmative statements that make my life much easier: (1) “Here’s how to package python”, and (2) “Here’s the default tools for packaging python”. This at least makes it clear that there’s an interface and a set of options that aren’t (or are) non-conformant hacks. It also finally provides the interface necessary for me to decide as a packager to use non-standard tools and not be afraid that users won’t be able to install my package. :relieved:

As a python user and package author it would be very confusing if the tool that’s capable of building a wheel using a long-standing interface (i.e. pip wheel) suddenly disappeared because building packages got better :thinking:. So again, as a user here, I’m pretty happy with a discussion on what would feel like an implementation detail about publishing another package for pybuild as the equivalent to pip wheel but for PEP 517. It may be a very good idea to separate out that functionality into another tool, if that tool is also the default choice for pip (and replaceable via pyproject.toml). It may also make sense to first implement the interface in pip and later pull it out into a shared implementation. But frankly, seeing a pip sdist will just make so much sense to me given the existing tooling, names, pep517 interface, etc.
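To show how small the PEP 517 interface is from the frontend's side, here is a minimal sketch of what a hypothetical pip sdist would boil down to: resolve the `build-backend` string and call its `build_sdist` hook. The helper names are made up, and this runs the hook in the current process for brevity; real frontends run it in a subprocess inside the build environment, with the source tree as the working directory.

```python
import importlib


def load_backend(build_backend):
    """Resolve a PEP 517 "module.path:object.path" string to the backend object."""
    module_name, _, object_path = build_backend.partition(":")
    backend = importlib.import_module(module_name)
    # The part after the colon is optional; walk it attribute by attribute.
    for attr in filter(None, object_path.split(".")):
        backend = getattr(backend, attr)
    return backend


def build_sdist(build_backend, output_dir, config_settings=None):
    """Call the backend's build_sdist hook and return the file name it reports."""
    backend = load_backend(build_backend)
    return backend.build_sdist(output_dir, config_settings)
```

A wheel build would be the same call with the `build_wheel` hook, which is part of why a shared implementation outside pip seems feasible.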

Furthermore, all python developers are package maintainers, right? Except in corporate environments where there’s enough money/staff for python build engineers. The notion of user modes as producer vs consumer is a false dichotomy, except for those rare python developers who never share their code with anybody. And frankly, are we considering them a core demographic to consider? I mean isn’t the very implementation of pip and pypi.org and ssh+github links in requirements.txt a value statement about sharing open-source code?

Anyways, I love what y’all are doing. pip makes the very idea of distributing shared code seem a world easier than what I’m usually dealing with (c++ distributions :sob:). I hope my comments can shed a little bit of light on what users are wanting.

Thanks y’all
