Draft PEP: Recording the origin of distributions installed from direct URL references

(Stéphane Bidoul) #1

This is the Draft PEP proposal that follows from Pip freeze, vcs urls and pep 517 (feat. editable installs).

I’ll integrate remarks at https://github.com/sbidoul/peps/blob/source_url-sbi/pep-9999.rst.

Looking forward to reading your comments.

-sbi

PEP: 9999
Title: Recording the origin of distributions installed from direct URL references
Author: Stéphane Bidoul <stephane.bidoul@acsone.eu>
Sponsor: Chris Jerdonek <???>
Discussions-To: https://discuss.python.org/t/recording-the-source-url-of-an-installed-distribution/1535
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 21-Apr-2019
Post-History: 

Abstract
========

Following PEP 440, a distribution can be identified by a name and either a
version, or a direct reference (see `PEP440 Direct References`_).
After installation, the name and version are captured in the project metadata,
but currently there is no way to obtain details of the URL used when the
distribution was identified by a direct reference.

This proposal defines
additional metadata, to be added to the installed distribution by the
installation front end, which records the direct reference for use by
consumers which introspect the database of installed packages (see PEP 376).

Motivation
==========

The main motivation of this PEP is to allow tools that attempt to "freeze" the
state of a Python environment to work in a broader range of situations.

This PEP originated from the need to implement `pip issue #609`_:
i.e. improving the behavior of ``pip freeze`` in the presence of distributions
installed from direct URL references. It follows a
`thread on discuss.python.org`_ about the best course of action to implement
it.

Installation from direct references
-----------------------------------

Python installers such as pip are capable of downloading and installing
distributions from package indexes. They are also capable of downloading
and installing source code from requirements specifying arbitrary URLs of
source archives and Version Control Systems (VCS) repositories,
as standardized in `PEP440 Direct References`_.

In other words, two relevant installation modes exist.

1. The package to install is specified as a name and version specifier:

  In this case, the installer looks in a package index (or optionally
  using --find-links in the case of pip) to find the distribution to install.

2. The package to install is specified as a direct URL reference:

  In this case, the installer downloads whatever is specified by the URL
  (typically a wheel, a source archive or a VCS repository) and installs it.

  In this mode, installers typically download the source code in a
  temporary directory, invoke the PEP 517 build backend to produce a wheel
  if needed, install the wheel, and delete the temporary directory.

  After installation, no trace of the URL the user requested to download the
  package is left on the user system.

Freezing an environment
-----------------------

Pip also sports a command named ``pip freeze`` which examines the Database of
Installed Python Distributions to generate a list of requirements. The main
goal of this command is to help users generate a list of requirements that
will later allow the re-installation of the same environment with the highest
possible fidelity.

The ``pip freeze`` command outputs a ``name==version`` line for each installed
distribution (except for editable installs). To achieve the goal of
reinstalling the same environment, this requires the (name, version)
tuple to refer to an immutable version of the
distribution. The immutability is guaranteed by package indexes
such as Warehouse. The package index to use is typically known from
environmental or command line parameters of the installer.

This freeze mechanism therefore works fine for installation mode 1 (i.e.
when the package to install was specified as a name plus version specifier).

For installation mode 2, i.e. when the package to install was specified as a
direct URL reference, the ``name==version`` tuple is obviously not sufficient
to reinstall the same distribution and users of the freeze command expect it
to output the URL that was originally requested.

The reasoning above is equally applicable to tools, other than ``pip freeze``,
that would attempt to generate a ``Pipfile.lock`` or any other similar format
from the Database of Installed Python Distributions. Unless specified
otherwise, "freeze" is used in this document as a generic term for such
an operation.

The importance of installing from (VCS) URLs for application integrators
------------------------------------------------------------------------

For an application integrator, it is important to be able to reliably install
and freeze unreleased versions of Python distributions.
For instance when a developer needs to deploy an unreleased patched version
of a dependency, it is common to install the dependency directly from a VCS
branch that has the patch, while waiting for the maintainer to release an
updated version.

In such cases, it is important for "freeze" to pin the exact VCS
reference (commit-hash if available) that was installed, in order to create
reproducible builds with the highest possible fidelity.

Note about "editable" installs
------------------------------

The editable installation mode of pip roughly lets a user insert a
local directory in sys.path for development purposes. This mode is somewhat
abused to work around the fact that a non-editable install from a VCS URL
loses trace of the origin after installation.
Indeed editable installs implicitly record the VCS origin in the checkout
directory, so the information can be recovered when running "freeze".

The use of this workaround, although useful, is fragile, creates confusion
about the purpose of the editable mode, and works only when the distribution
can be installed with setuptools (i.e. it is not usable with other PEP 517
build backends).

For the sake of clarity, it is important to note that this PEP is otherwise
unrelated to editable installs.

Rationale
=========

This PEP specifies a new ``direct_url.json`` metadata file in the .dist-info
directory of an installed distribution.

The fields specified are sufficient to reproduce the source archive and `VCS
URLs supported by pip`_. They are also sufficient to reproduce
`PEP440 Direct References`_, as well as `Pipfile and Pipfile.lock`_ entries.

Since at least the three different ways listed above to encode the information exist,
this PEP uses a key-value format, to not make any assumption on how a direct
URL must ultimately be encoded in a requirement or lockfile. See also
the `Alternatives`_ section below for more discussion about this choice.

Information has been taken from Ruby's bundler manual to verify it has similar
capabilities and inform the selection and naming of fields in this
specification.

The JSON format allows additional fields to be added in the future.

Specification
=============

This PEP specifies a ``direct_url.json`` file in the ``.dist-info`` directory
of an installed distribution.

This file MUST be created by installers when installing a distribution
from a requirement specifying a direct URL reference (including a VCS URL
in *non*-editable mode).

This file MUST NOT be created when installing a distribution from another
type of requirement (i.e. name plus version specifier, or URL in editable mode).

This JSON file MUST contain a flat dictionary where all keys and values are of string type.
For the sake of forward compatibility, tools SHOULD ignore values which are
not of string type.

If present, it MUST contain at least one field with name ``url``.

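As an illustration of the two rules above, a consumer could read the file along the following lines. This is only a sketch: ``load_direct_url`` is a hypothetical helper name, and returning ``None`` when ``url`` is missing is an assumption about how a tool might react, not something this draft mandates.

```python
import json


def load_direct_url(text):
    """Parse the content of a direct_url.json file (hypothetical helper).

    Returns a dict holding only the string-valued entries, or None when
    the mandatory "url" field is missing.
    """
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("direct_url.json must contain a JSON object")
    # Forward compatibility: ignore values that are not strings.
    fields = {k: v for k, v in data.items() if isinstance(v, str)}
    if "url" not in fields:
        # Assumption: treat a file without "url" as unusable.
        return None
    return fields


sample = '{"url": "https://example.com/app-1.0.tgz", "future": {"x": 1}}'
print(load_direct_url(sample))
```
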
``url`` MUST be stripped of any sensitive authentication information,
for security reasons. The user:password section of the URL MAY however
be composed of environment variables, matching the following regular
expression::

    \$\{[A-Za-z0-9-_]+\}:\$\{[A-Za-z0-9-_]+\}

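A tool could check the credentials rule roughly as follows. This is a sketch under stated assumptions: ``credentials_are_safe`` is a hypothetical name, the pattern assumes each variable name is one or more characters long, and rejecting any other user information (for example a bare user name) is a strict reading that this draft does not spell out.

```python
import re
from urllib.parse import urlsplit

# Assumption: variable names are one or more characters long ("+").
ENV_CREDENTIALS = re.compile(r"\$\{[A-Za-z0-9-_]+\}:\$\{[A-Za-z0-9-_]+\}")


def credentials_are_safe(url):
    """Return True when the URL has no user:password part, or when that
    part consists only of environment variable placeholders."""
    netloc = urlsplit(url).netloc
    if "@" not in netloc:
        return True
    # Everything before the last "@" is the user information.
    userinfo = netloc.rpartition("@")[0]
    # Assumption: anything other than ${VAR}:${VAR} is rejected.
    return ENV_CREDENTIALS.fullmatch(userinfo) is not None


print(credentials_are_safe("https://${USER}:${PASS}@example.com/repo.git"))
print(credentials_are_safe("https://alice:secret@example.com/repo.git"))
```
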
When ``url`` refers to a VCS repository:

- A ``vcs`` field MUST be present, containing the name of the VCS
  (i.e. one of ``git``, ``hg``, ``bzr``, ``svn``). Other VCS SHOULD be registered by
  amending this PEP.
- The ``url`` value MUST be compatible with the corresponding VCS,
  so an installer can hand it off without transformation to a
  checkout/download command of the VCS.
- A ``revision`` field MAY be present to reference the
  branch/tag/ref/commit/revision (in a format compatible with the VCS) that
  was requested for installation.
- A ``resolved_commit_id`` field MUST be present, containing the
  exact commit/revision number that was installed.
  If the VCS supports commit-hash
  based revision identifiers, such commit-hash MUST be used as
  ``resolved_commit_id`` in order to reference the immutable
  version of the source code that was installed.

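The VCS fields listed above could be assembled by an installer as in the following sketch. ``make_vcs_record`` is a hypothetical helper, and refusing unregistered VCS names is a deliberately strict choice made for the example; the draft itself lets tools support other VCS.

```python
import json

# VCS names registered by this draft.
REGISTERED_VCS = {"git", "hg", "bzr", "svn"}


def make_vcs_record(url, vcs, resolved_commit_id, revision=None):
    """Build the direct_url.json content for a VCS install (hypothetical)."""
    if vcs not in REGISTERED_VCS:
        # Assumption: this example only accepts registered VCS names.
        raise ValueError(f"unregistered VCS: {vcs!r}")
    record = {"url": url, "vcs": vcs, "resolved_commit_id": resolved_commit_id}
    if revision is not None:
        # Optional: the branch/tag/ref/commit/revision that was requested.
        record["revision"] = revision
    return record


record = make_vcs_record(
    "https://github.com/pypa/pip.git",
    "git",
    "7921be1537eac1e97bc40179a57f0349c2aee67d",
    revision="1.3.1",
)
print(json.dumps(record, indent=4))
```
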
When ``url`` refers to a source archive, a wheel, or a local directory:

- A ``hash`` field SHOULD be present, with value
  ``<hash-algorithm>=<expected-hash>``.
  It is RECOMMENDED that only hashes which are unconditionally provided by
  the latest version of the standard library's ``hashlib`` module be used for
  source archive hashes. At time of writing, that list consists of 'md5',
  'sha1', 'sha224', 'sha256', 'sha384', and 'sha512'.

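For illustration, the ``hash`` value could be computed with ``hashlib`` as sketched below. The helper name ``hash_field`` and the default of ``sha256`` are assumptions made for the example, not requirements of this draft.

```python
import hashlib

# Algorithms named in the recommendation above.
RECOMMENDED = ("md5", "sha1", "sha224", "sha256", "sha384", "sha512")


def hash_field(data, algorithm="sha256"):
    """Return '<hash-algorithm>=<expected-hash>' for the given bytes."""
    if algorithm not in RECOMMENDED:
        raise ValueError(f"algorithm not in the recommended list: {algorithm}")
    digest = hashlib.new(algorithm, data).hexdigest()
    return f"{algorithm}={digest}"


# In practice the bytes would be the content of the downloaded archive.
print(hash_field(b"example archive content"))
```
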
.. note::

  When the requested URL points to a local directory that happens to contain a
  VCS checkout, installers MUST NOT attempt to infer any VCS information and
  therefore MUST NOT output any vcs related information (such as ``vcs`` field)
  in ``direct_url.json``.

A ``subdirectory`` field MAY be present containing a directory path,
relative to the root of the VCS repository, source archive or local directory,
to specify where ``pyproject.toml`` or ``setup.py`` is located.

.. note::

  As a general rule, installers should as much as possible preserve the
  information that was provided in the requested URL when generating
  ``direct_url.json``. For example user:password environment variables
  should be preserved and ``revision`` should reflect the revision that was
  provided in the requested URL as faithfully as possible. This information is
  however *enriched* with more precise data, such as ``resolved_commit_id``.

Registered VCS
--------------

This section lists the registered VCS, along with details on how
to use the ``vcs``, ``revision`` and ``resolved_commit_id`` fields.
Tools MAY support other VCS although it is RECOMMENDED to register
them by amending this PEP. The ``vcs`` field SHOULD be the command name
(lowercased). Additional fields that would be necessary to
support such VCS SHOULD be prefixed with the VCS command name.

Git
+++

Home page

  https://git-scm.com/

vcs command

  git

vcs field

  git

revision field

  A tag name, branch name, git ref, commit hash, shortened commit hash.

resolved_commit_id field

  A commit hash (40 hexadecimal characters sha1).

Mercurial
+++++++++

Home page

  https://www.mercurial-scm.org/

vcs command

  hg

vcs field

  hg

revision field

  A tag name, branch name, changeset ID, shortened changeset ID.

resolved_commit_id field

  A changeset ID (40 hexadecimal characters).

Bazaar
++++++

Home page

  https://bazaar.canonical.com/

vcs command

  bzr

vcs field

  bzr

revision field

  A tag name, branch name, revision id.

resolved_commit_id field

  A revision id.

Subversion
++++++++++

Home page

  https://subversion.apache.org/

vcs command

  svn

vcs field

  svn

revision field

  ``revision`` must be compatible with ``svn checkout`` ``--revision`` option.
  In Subversion, branch or tag is part of ``url``.

resolved_commit_id

  Since Subversion does not support globally unique identifiers,
  this field is the Subversion revision number in the corresponding
  repository.

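A consumer that wants to sanity-check ``resolved_commit_id`` against the formats listed in this section might do something like the sketch below. The function name is hypothetical. The draft only states the 40 hexadecimal character format for Git and Mercurial; treating a Subversion revision number as decimal digits, and accepting any non-empty value for Bazaar, are assumptions.

```python
import re

# Git commit hashes and Mercurial changeset IDs: 40 hexadecimal characters.
_HEX40 = re.compile(r"[0-9a-fA-F]{40}")
# Assumption: a Subversion revision number is written as decimal digits.
_DIGITS = re.compile(r"[0-9]+")


def commit_id_is_well_formed(vcs, commit_id):
    """Check resolved_commit_id against the per-VCS format (hypothetical)."""
    if vcs in ("git", "hg"):
        return _HEX40.fullmatch(commit_id) is not None
    if vcs == "svn":
        return _DIGITS.fullmatch(commit_id) is not None
    # bzr and unregistered VCS: no format is given, so only require
    # a non-empty value.
    return bool(commit_id)


print(commit_id_is_well_formed("git", "7921be1537eac1e97bc40179a57f0349c2aee67d"))
print(commit_id_is_well_formed("git", "d0806d9d2fa"))
```
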
Examples
========

Example direct_url.json
-----------------------

Source archive:

.. code::

    {
        "url": "https://github.com/pypa/pip/archive/1.3.1.zip",
        "hash": "sha256=2dc6b5a470a1bde68946f263f1af1515a2574a150a30d6ce02c6ff742fcc0db8"
    }

Git URL with tag and commit-hash:

.. code::

    {
        "url": "https://github.com/pypa/pip.git",
        "vcs": "git",
        "revision": "1.3.1",
        "resolved_commit_id": "7921be1537eac1e97bc40179a57f0349c2aee67d"
    }

Example pip commands and their effect on direct_url.json
--------------------------------------------------------

Commands that generate a ``direct_url.json``:

* pip install https://example.com/app-1.0.tgz
* pip install https://example.com/app-1.0.whl
* pip install "git+https://example.com/repo/app.git#egg=app&subdirectory=setup"
* pip install ./app
* pip install file:///home/user/app

Commands that *do not* generate a ``direct_url.json``:

* pip install app
* pip install app --no-index --find-links https://example.com/
* pip install --editable "git+https://example.com/repo/app.git#egg=app&subdirectory=setup"
* pip install -e ./app

Use cases
=========

"Freezing" an environment

  Tools, such as ``pip freeze``, which generate requirements from the Database
  of Installed Python Distributions SHOULD exploit ``direct_url.json``
  if it is present, and give it priority over the Version metadata in order
  to generate a higher fidelity output. In the presence of a ``vcs`` direct URL,
  the ``resolved_commit_id`` field SHOULD be used in priority in order to provide
  the highest possible fidelity to the originally installed version. If
  supported by their requirement format (such as `PEP440 Direct References`_),
  tools are encouraged to output both ``revision`` and ``resolved_commit_id``.
  Tools MAY choose another approach, depending on the needs of their users.

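One possible way for a freeze tool to turn ``direct_url.json`` into a PEP 440 direct reference is sketched below. ``freeze_line`` is a hypothetical helper, and the output shape (``name @ vcs+url@commit``, or ``name @ url#hash`` for archives) is only one rendering among those a tool may choose. The ``subdirectory`` field is ignored here, since PEP 440 direct references have no place for it.

```python
def freeze_line(name, direct_url):
    """Render a requirement line from direct_url.json content (hypothetical)."""
    url = direct_url["url"]
    vcs = direct_url.get("vcs")
    if vcs:
        # Prefer the resolved commit for fidelity; fall back to the
        # requested revision when no resolved commit is recorded.
        rev = direct_url.get("resolved_commit_id") or direct_url.get("revision")
        reference = f"{vcs}+{url}"
        if rev:
            reference += f"@{rev}"
    else:
        reference = url
        if "hash" in direct_url:
            # The hash is appended as a URL fragment.
            reference += "#" + direct_url["hash"]
    return f"{name} @ {reference}"


print(freeze_line("pip", {
    "url": "https://github.com/pypa/pip.git",
    "vcs": "git",
    "revision": "1.3.1",
    "resolved_commit_id": "7921be1537eac1e97bc40179a57f0349c2aee67d",
}))
```
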
Backwards Compatibility
=======================

Since this PEP specifies a new file in the ``.dist-info`` directory,
there are no backwards compatibility implications.

Alternatives
============

PEP426 source_url
-----------------

The now withdrawn PEP 426 specifies a ``source_url`` metadata entry.
It is also implemented in `distlib`_.

It was intended for a slightly different purpose, for use in sdists.

This format lacks support for the ``subdirectory`` option of pip requirement
URLs. The same limitation is present in PEP440 direct references.

It also lacks explicit support for `environment variables in the user:password
part of URLs`_.

The introduction of a key/value extensibility mechanism and support
for environment variables for user:password in PEP440, would be necessary
for use in this PEP.

revision vs ref
---------------

The ``revision`` key was retained over ``ref`` as it is a more generic term
across various VCS and ``ref`` has a specific meaning for ``git``.


References
==========

.. _`pip issue #609`: https://github.com/pypa/pip/issues/609
.. _`thread on discuss.python.org`:  https://discuss.python.org/t/pip-freeze-vcs-urls-and-pep-517-feat-editable-installs/1473
.. _PEP440: http://www.python.org/dev/peps/pep-0440
.. _`VCS URLs supported by pip`: https://pip.pypa.io/en/stable/reference/pip_install/#vcs-support
.. _`PEP440 Direct References`: https://www.python.org/dev/peps/pep-0440/#direct-references
.. _`Pipfile and Pipfile.lock`: https://github.com/pypa/pipfile
.. _distlib: https://distlib.readthedocs.io
.. _`environment variables in the user:password part of URLs`: https://pip.pypa.io/en/stable/reference/pip_install/#id10

Copyright
=========

This document has been placed in the public domain.


..
  Local Variables:
  mode: indented-text
  indent-tabs-mode: nil
  sentence-end-double-space: t
  fill-column: 70
  coding: utf-8
  End:
(Brett Cannon) #2

FYI PEP numbers are automatically hyperlinked, so you can just write “PEP 376” to make them a bit more readable.

i.e.

Would be nice to hyperlink this.

“so-called”

Does this mean the file should be specified when downloading the wheel from PyPI? It might be worth mentioning that the drawback of doing that is you can’t grab the package from other indexes when necessary or in a different format (e.g. wheel over zip when made available later).

(Stéphane Bidoul) #3

Not at all. This spec does not change the behavior of installers with regard to indexes. source_url gets recorded only when the original requirement was a URL, not when it was a regular name + version specifier.

(Nick Coghlan) #4

I don’t think that follows, as the new metadata file itself merely provides information, it doesn’t specify what tools are expected to do based on that information.

I think the more practical constraint is that in a lot of “name and version” cases, there won’t be a URL readily available in the installer, since it will have picked up the artifact from a local cache directory instead. Since we don’t need the extra metadata in those cases, and it may be hard for installers to provide it, it makes sense to restrict the new metadata to the “direct URL” case.

So the only change I’d personally suggest is to call the new metadata file direct_url.json. source_url is ambiguous as to whether “source” is being used in the “origin of a download operation” sense or the “source distribution” sense, so it seems best avoided (the PEP 426 usage was intended for use in sdists, which made it less ambiguous in that regard).

(Stéphane Bidoul) #5

Agreed, this makes sense. I’ll also attempt to come up with a better PEP title to reflect that.

(Stéphane Bidoul) #6

I changed the title to “Recording the origin of distributions installed from direct URL references”, which should reflect the content more accurately.

(Stéphane Bidoul) #7

I have integrated all remarks so far at https://github.com/sbidoul/peps/blob/source_url-sbi/pep-9999.rst .

(Chris Jerdonek) #8

I haven’t had time to provide detailed feedback on this proposal. But rather than wait longer, I’ll provide some feedback now – though not very detailed.

Since each VCS has its own options and peculiarities and ways of specifying and doing things, I think that – except for things like url and subdirectory that apply obviously to all VCS’s – more specific fields should be defined separately for each VCS.

For Git, instead of ref which should normally only be a name that starts with “refs/”, I think it would make sense to have a field called something like revision or requested-revision for the requested revision / “commit-ish” / “committish” (example values include things like “master”, “1.0”, “refs/pull/6447/head”, a full or partial sha, etc.), and a separate field called something like resolved-ref for what ref the requested revision resolved to, if any (e.g. “refs/heads/master”, “refs/tags/1.0”, “refs/pull/6447/head”, etc.). The latter tells you in particular whether the requested revision was a branch or tag or some other type of ref, based on whether it begins with “refs/heads/” or “refs/tags/”. pip already does this resolution. I haven’t thought through whether it would make sense, but maybe the spec should also spell out what happens if the reference is ambiguous (i.e. if the revision corresponds to more than one ref), and also be able to indicate if the requested revision doesn’t correspond to any ref.

To provide further evidence that VCS-specific fields make sense, a future possible field (but not for this PEP) that applies to Git but wouldn’t to other VCS’s is a list of what submodules were fetched prior to installing: https://github.com/pypa/pip/issues/6374

I also feel like commit-hash should be renamed to something like commit-id, and then say separately for each VCS what that field means / what value the field should take on. To me it seems like the more important and useful property of this field is that it be the way to unambiguously identify a commit for that VCS. This would let us support things like central SVN repositories where a revision number is sufficient to unambiguously identify a commit, and that’s the best you can do for that VCS (as there is no notion of a commit hash).

By the way, it looks like the phrase “Draft PEP” / “Proposed PEP” which I suggested before was dropped from the title of this thread.

(Paul Moore) #9

No time to do anything like a proper review, but I wanted to comment on the following point:

Rather than referring to the capabilities of existing installers, it would be better to link this proposal back to existing standards.

In this case the relevant standard is PEP 440, which you mention in terms of the direct reference spec, but which also makes the comments:

Distributions are identified by a public version identifier which supports all defined version comparison operations

(in “Version Scheme”) and

Some automated tools may permit the use of a direct reference as an alternative to a normal version specifier. A direct reference consists of the specifier @ and an explicit URL.

(in “Direct References”).

Combining these two, which essentially define exactly how a user can specify a project to an installation tool it makes sense to me to describe the context of this proposal without referring to specific tools like pip or their features, something along the lines of:

Following PEP 440, a distribution can be identified by a name and either a version, or a direct reference (add appropriate links or explanations here). The name and version are captured in the project metadata, but currently there is no way to obtain details of the URL used when the distribution was identified by a direct reference. This proposal defines additional metadata, to be added to the installed distribution by the installation front end, which records the direct reference URL for use by consumers which introspect the database of installed packages (see PEP 376).

With this framing, pip freeze simply becomes the motivating example of a consumer that needs that information stored in the database of installed packages.

(Stéphane Bidoul) #10

I’m fine with the wording you propose which is indeed a nice way to frame the proposal.

Would you prefer to see your paragraph in the abstract, and keep the motivation section using examples as it is now? Or rather shrink the motivation section to remove or tone down references to capabilities of existing installers and the freeze use case?

(Stéphane Bidoul) #11

ref is indeed meant to encode the reference/revision that was requested, while commit-hash is meant to be the resolved ref. I’ll see to clarify the wording in that respect (see proposal below).

The use of ref was inspired mainly

  • by PEP 440 direct references (and pip URL references) which do not distinguish between tags and branches in their URL format
  • Ruby Bundler which uses ref
  • Pipfile

So it is sufficient to encode all types of direct references known today.

About submodules, this can indeed be covered later by future specs which can extend direct_url.json.

I would not rename ref to requested-revision because, like url, it is implied by the spec that it is the requested one.

The spec does not mandate any specific use of the new metadata so I suppose this part can be left out of this PEP?

commit-id is indeed more generic and covers SVN.

Tools would then need to combine the knowledge of VCS type plus the presence of commit-id to decide if the reference refers to an immutable version of the code, not just the presence of commit-hash. That is fine with me.

So I could update the spec to replace commit-hash by resolved-commit-id and say that VCS that support hash based commit references MUST use it in that field.

Assuming the updated text above (resolved-commit-id), do you have specific use cases in mind for an additional resolved-ref field?

(Stéphane Bidoul) #12

@cjerdonek I updated the Specification section to talk about resolved-commit-id. Let me know your thoughts.

@pf_moore I used your text in the abstract. So far I left the Motivation section untouched, as concrete examples known today. Let me know if you think this section needs updating too (or shrinking as it may be too obvious?).

Enjoy pycon.

(Stéphane Bidoul) #13

Hi,

I would like to progress with implementation of pip#609.

Are there any additional comments on this specification, or suggestions on the best way to move forward?

(Paul Moore) #14

You need a sponsor to get this registered as an actual PEP, and then it needs to be reviewed, and ultimately signed off by an appropriate BDFL-Delegate (which would likely be me, as this is a packaging interoperability PEP).

You should keep the PEP text in the initial post here up to date with the master version - at the moment it seems like it’s drifted. And ideally, a formatted version posted here would be easier to read.

At the moment, you’ve had a couple of comments on the proposal, and nothing particularly negative, but there’s been no really positive support, either, so I’d question how useful people actually find this. The pip issue has been round since 2012, so it’s not exactly a showstopper…

Also, I’m a little unclear as to how much this will “lock in” VCS-specific URL formats into the interop specs - pip uses its own notation (git+https://...., plus #egg= fragments) and I’ve no idea how standard these are or whether other tools could want to use a different approach.

So IMO, there’s still a chunk of process that needs to be followed, as well as some outstanding discussion that needs to be had, before this is ready to be accepted.

(Stéphane Bidoul) #15

I think one reason it has not been implemented yet is that people can work around it by bending editable installs backwards so pip freeze works reliably with vcs urls, where they might not need --editable in their requirement files if pip#609 was implemented.

The other reason is implementing it requires updating the database of installed distributions, which in turn requires a PEP, which is a difficult process (especially for someone like me who is not that well connected in the python community to find sponsors, and just trying to contribute as upstream as possible in the hope to be useful to the widest community)… Comparatively, my previous contributions to pypa (pip, setuptools_scm) were a breeze :)

In the Specification section, I’ve been particularly careful to not introduce such lock in. I’ve examined PEP 440 and pipfile formats in addition to pip’s native format and made sure the json spec is generic and can be used to generate them all. The git+https://... format is not part of the spec, as I split the vcs type and url in different fields.

I’ll update the original post. My initial reasoning for not doing it was to keep the conversation consistent with the original post. [edit: done]

I think the only Python core devs (besides you) who have commented on the spec so far are @brettcannon and @ncoghlan. Apologies for the mention, Nick, but since you seemed rather positive on the draft, would you be willing to sponsor it? Or could another pip maintainer sponsor it given the narrow scope, despite not being a Python core dev? Maybe @pradyunsg, since you commented on pip#609 that it was still a valid issue?

(Chris Jerdonek) #16

I haven’t had time to reply to your response to my comments as you didn’t incorporate my suggestions. I would like to, and I will try to do so soon.

I’m also a core dev who commented on the draft. I’m also the pip maintainer who has been doing the most work on the area of the code affected by this PEP. I would even say it’s been the focus of my pip work. I’ve made probably dozens of commits – fixing bugs, features, adding tests, reviewing patches, and refactoring, which is still continuing.

(Stéphane Bidoul) #17

Ooops, sincere apologies, I don’t know how I did not notice you are a core dev.
So you are indeed a good candidate to sponsor this if you think it’s a good idea.

I did incorporate part of your suggestions in this commit.
I commented on the rest in post #11.

Thanks for your time on this matter, looking forward to reading your further comments.

(Chris Jerdonek) #18

I’m going to start out by replying only to the part about choosing a name different from ref – in part because of time, but also because this is a long message. Other portions like adding resolved-ref I will do in a separate message.

My original comment was more about asking not to use the word “ref” rather than to add the prefix “requested”. requested-revision was just one of the first alternatives that came to mind; revision was the other:

The reason not to use “ref” is that “ref” is a term that has a specific, different meaning in Git, and I think there are plenty of alternatives to choose from that don’t have this problem.

In Git, a ref is a string that begins with the prefix “refs/” and is a particular type of reference. From Git’s glossary:

ref - A name that begins with refs/ (e.g. refs/heads/master) that points to an object name or another ref (the latter is called a symbolic ref).

In Git at least, the most appropriate word for the concept being referred to by this PEP would be “commit-ish” or “committish.” Here is a link to Git’s glossary entry. Git’s documentation also tends to use “revision” and <rev> when speaking about ways to refer to a particular commit. For example, Git’s gitrevisions documentation has a section on “Specifying Revisions”.

Also, pip’s VCS code uses the variable name rev (for revision). This is why I think something like revision (or requested-revision, to distinguish from an actual revision) would be a lot better, and I don’t see any downside. committish would also be fine I think.

pip supports installing any committish, and not just revisions that resolve to refs.

As a baseline, here is an example of pip-installing from a branch (branch “azure-pipelines”, which resolves to ref “refs/heads/azure-pipelines”):

$ pip install git+https://github.com/python-attrs/attrs.git@azure-pipelines#egg=attrs
Collecting attrs from git+https://github.com/python-attrs/attrs.git@azure-pipelines#egg=attrs
  Cloning https://github.com/python-attrs/attrs.git (to revision azure-pipelines) to /private/var/folders/q9/j0_5hxt88v5592006s6dd3n80000gs/T/pip-install-q5aqzlw7/attrs
  Running command git clone -q https://github.com/python-attrs/attrs.git /private/var/folders/q9/j0_5hxt88v5592006s6dd3n80000gs/T/pip-install-q5aqzlw7/attrs
  Running command git checkout -b azure-pipelines --track origin/azure-pipelines
  Switched to a new branch 'azure-pipelines'
...
Successfully installed attrs-19.2.0.dev0

Here is installing from a tag (tag “17.2.0”, which resolves to ref “refs/tags/17.2.0”):

$ pip install git+https://github.com/python-attrs/attrs.git@17.2.0#egg=attrs
Collecting attrs from git+https://github.com/python-attrs/attrs.git@17.2.0#egg=attrs
  Cloning https://github.com/python-attrs/attrs.git (to revision 17.2.0) to /private/var/folders/q9/j0_5hxt88v5592006s6dd3n80000gs/T/pip-install-wcf_4dct/attrs
  Running command git clone -q https://github.com/python-attrs/attrs.git /private/var/folders/q9/j0_5hxt88v5592006s6dd3n80000gs/T/pip-install-wcf_4dct/attrs
  Running command git checkout -q 8f95ef7467f33efd3f7f0193a6b9bf714195eaf6
...
Successfully installed attrs-17.2.0

Here is installing from the committish “17.2.0~2”, which means two commits before the above tag and is not a ref and does not correspond to a ref (notice the log message, “WARNING: Did not find branch or tag ‘17.2.0~2’, assuming revision or ref”):

$ pip install "git+https://github.com/python-attrs/attrs.git@17.2.0~2#egg=attrs"
Collecting attrs from git+https://github.com/python-attrs/attrs.git@17.2.0~2#egg=attrs
  Cloning https://github.com/python-attrs/attrs.git (to revision 17.2.0~2) to /private/var/folders/q9/j0_5hxt88v5592006s6dd3n80000gs/T/pip-install-259feqcb/attrs
  Running command git clone -q https://github.com/python-attrs/attrs.git /private/var/folders/q9/j0_5hxt88v5592006s6dd3n80000gs/T/pip-install-259feqcb/attrs
  WARNING: Did not find branch or tag '17.2.0~2', assuming revision or ref.
  Running command git checkout -q '17.2.0~2'
...
Successfully installed attrs-17.2.0.dev0

Finally, here is an example of installing from the committish “d0806d9d2fa”, which is an abbreviated sha and is not a ref and does not correspond to a ref:

$ pip install git+https://github.com/python-attrs/attrs.git@d0806d9d2fa#egg=attrs
Collecting attrs from git+https://github.com/python-attrs/attrs.git@d0806d9d2fa#egg=attrs
  Cloning https://github.com/python-attrs/attrs.git (to revision d0806d9d2fa) to /private/var/folders/q9/j0_5hxt88v5592006s6dd3n80000gs/T/pip-install-lmge3z1d/attrs
  Running command git clone -q https://github.com/python-attrs/attrs.git /private/var/folders/q9/j0_5hxt88v5592006s6dd3n80000gs/T/pip-install-lmge3z1d/attrs
  WARNING: Did not find branch or tag 'd0806d9d2fa', assuming revision or ref.
  Running command git checkout -q d0806d9d2fa
...
Successfully installed attrs-19.2.0.dev0

pip can also even install from an actual ref.

Regarding Pipfile, I talked with @techalchemy about this at PyCon, and he said that if he could do it over again, he wouldn’t use the word “ref” for the same reasons I mentioned above – that these aren’t refs but rather something more general including things like branch names, tags, commit SHA’s, abbreviated commit SHA’s, etc.

PS - I’m open to considering sponsoring as I support the idea in concept. I just need to review what that entails first because as I recall that could impose some restrictions on my involvement later.

(Stéphane Bidoul) #19

@cjerdonek thanks for the detailed explanation. revision is fine with me. I updated the draft.

TIL pip supports installing from git refs. That’s a great improvement (is it recent?), very useful to install a git pull request.

(Chris Jerdonek) #20

Great, thanks!

It’s from a couple years ago and was first added in pip 10.0 (so somewhat recent but not too recent): https://github.com/pypa/pip/pull/4429
And indeed, the motivation was for checking out PR’s more easily. I think this feature might not be documented, which is one reason more people might not know about it. (It would be good to document this.)