I’ve just submitted PEP 722: Dependency specification for single-file scripts (Pull Request #3210 in python/peps on GitHub), to specify a format for including dependency specifications in single-file Python scripts.
This is basically a formalisation of (a subset of) the syntax already supported in pip-run and pipx (the pipx code isn’t released yet), so it’s available in existing tools. By having it in a formal spec, hopefully other tools can rely on and work with this data as well.
See the PEP for details, but it’s intended to be for single-file scripts only, and is not intended in any way to replace or compete with project metadata stored in pyproject.toml. Basically, if you have a pyproject.toml, or can create one, you shouldn’t need this.
The rendered version is available here
Edit: The above was a preview link. The PEP is now published and is available from the normal PEP location
(The details below are for the original version, and are now out of date).
End of edit
Full PEP text
PEP: 722
Title: Dependency specification for single-file scripts
Author: Paul Moore <email@example.com>
PEP-Delegate: TBD
Discussions-To: https://discuss.python.org/t/29905
Status: Draft
Type: Standards Track
Topic: Packaging
Content-Type: text/x-rst
Created: 19-Jul-2023
Post-History: `19-Jul-2023 <https://discuss.python.org/t/29905>`__

Abstract
========

This PEP specifies a format for including 3rd-party dependencies in a single-file Python script.

Motivation
==========

Nearly every non-trivial Python project will depend on one or more 3rd party libraries. These dependencies are typically recorded in the project metadata, in the file ``pyproject.toml``. This approach serves well for any code that is large enough to be stored in its own project directory, but it does not work well for single-file Python scripts (which are often kept in some sort of shared directory). This PEP offers a solution for that use case, by storing the dependencies in the script itself.

This proposal is (a subset of) behaviour that already exists in the ``pipx`` and ``pip-run`` tools, so it is simply formalising existing behaviour, rather than defining a brand new capability.

Rationale
=========

In general, the Python packaging ecosystem is focused on "projects", which are structured around a directory containing your code, some metadata and any supporting files such as tests, automation, etc.

However, beginners (and quite a few more experienced developers) will often *start* a project by just opening up an editor and writing a simple script. In some cases, the project might outgrow the "simple script" phase and get restructured into a more conventional project directory, but not all do, and even those that do still need to be usable in the "simple script" phase.

These days, the idea that "simple scripts can just use the stdlib" is becoming less and less practical, as the quantity, quality and usability of libraries on PyPI steadily increases.
For example, using the standard ``urllib`` library rather than something like ``requests`` or ``httpx`` makes correct code significantly harder to write. Having to consider "uses 3rd party libraries" as the break point for moving to a "full scale project" is impractical, so this PEP is designed to allow a project to use external libraries while still remaining as a simple, standalone script.

Of course, *declaring* your dependencies isn't sufficient by itself. You need to install them (probably in some sort of virtual environment) so that they are available when the script runs. This PEP does not cover environment management, as tools like `pip-run <https://pypi.org/project/pip-run/>`__ and `pipx <https://pypi.org/project/pipx/>`__ already offer that ability. But by standardising the means of declaring dependencies, this PEP allows scripts to remain tool-independent.

Specification
=============

Any Python script may contain a *dependency block*, which is a specially structured comment block. The format of the dependency block is as follows:

* A single comment line containing the (case sensitive) text "Requirements:"
* A series of comment lines containing :pep:`508` requirement specifiers.
* An empty comment or blank line.

To be recognised as a "comment line", the line must start with a ``#`` symbol. Leading and trailing whitespace on the comment line is ignored.

The dependency block may occur anywhere in the file. There MUST only be a single dependency block in the file - tools consuming dependency data MAY simply process the first dependency block found. This avoids the need for tools to process more data than is necessary. Stricter tools MAY, however, fail with an error if multiple dependency blocks are present.
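
As an illustration only, the rules above could be read by a tool roughly as follows. This is a minimal sketch, not a reference implementation. The names ``comment_text`` and ``read_dependency_block`` are hypothetical, and the sketch makes some assumptions the text does not spell out: the header must be exactly ``Requirements:`` once the ``#`` and surrounding whitespace are removed, any non-comment line ends the block, and the requirement strings are returned without :pep:`508` validation.

```python
from typing import List, Optional


def comment_text(line: str) -> Optional[str]:
    """Return the text of a comment line, or None if it is not a comment."""
    stripped = line.strip()
    if not stripped.startswith("#"):
        return None
    # Drop the leading '#' and any whitespace around the remaining text.
    return stripped[1:].strip()


def read_dependency_block(source: str) -> List[str]:
    """Return the requirement strings from the first dependency block."""
    lines = iter(source.splitlines())

    # Skip ahead to the (case sensitive) header line.
    for line in lines:
        if comment_text(line) == "Requirements:":
            break
    else:
        return []  # No dependency block in this script.

    requirements = []
    for line in lines:
        text = comment_text(line)
        # An empty comment, a blank line, or (by assumption) any
        # non-comment line ends the block.
        if not text:
            break
        requirements.append(text)
    return requirements


EXAMPLE_SCRIPT = """\
#!/usr/bin/env python

# Requirements:
#    requests
#    rich

import requests
"""

print(read_dependency_block(EXAMPLE_SCRIPT))
```

Only the first block is processed, which matches the "MAY simply process the first dependency block found" allowance; a stricter tool would keep scanning and report an error on a second header.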

Example
-------

The following is an example of a script with an embedded dependency block::

    #!/usr/bin/env python

    # In order to run, this script needs the following 3rd party libraries
    #
    # Requirements:
    #    requests
    #    rich

    import requests
    from rich.pretty import pprint

    resp = requests.get("https://peps.python.org/api/peps.json")
    data = resp.json()
    pprint([(k, v["title"]) for k, v in data.items()][:10])

Backwards Compatibility
=======================

As the dependency data is recorded in the form of a structured comment, this is compatible with any existing code.

Security Implications
=====================

If a script containing a dependency block is run using a tool that automatically installs dependencies, this could cause arbitrary code to be downloaded and installed in the user's environment.

This is only possible if an untrusted script is run, and such a script can already cause arbitrary damage, so no new risk is introduced by this PEP.

How to Teach This
=================

The format is simple, and should be understandable by anyone who can write Python scripts. In order to add dependencies, a user needs to:

1. Understand how to specify a dependency - they should already have encountered the format when installing their dependencies manually using a tool like pip.
2. Use a tool that recognises and processes dependency blocks.

This PEP does not cover teaching users about such tools. It is assumed that if they are popular, users will find out about them as with any other library or tool.

Note that the core Python interpreter does *not* interpret dependency blocks. This may be a point of confusion for beginners, who try to run ``python some_script.py`` and do not understand why it fails. It is considered the responsibility of the person sharing the script to include clear instructions on how to run it.

Reference Implementation
========================

This format is already supported `in pipx <https://github.com/pypa/pipx/pull/916>`__ and in `pip-run <https://pypi.org/project/pip-run/>`__.

Rejected Ideas
==============

Why not include other metadata?
-------------------------------

There is no obvious use case for other metadata, and if a project *does* need to specify anything more than some 3rd party dependencies, it has probably reached the point where it should be structured as a full-fledged project with a ``pyproject.toml`` file.

What about version?
-------------------

The one obvious exception is a script version number. The use cases for a version are, however, very different from those for dependencies, and it seems more reasonable to keep the two separate. There are already existing conventions for keeping a version number in a script (a ``__version__`` variable is a common approach) and these seem perfectly adequate.

Why not make the dependencies visible at runtime?
-------------------------------------------------

This would typically involve storing the dependencies as a (runtime) list variable with a conventional name, such as::

    __requires__ = [
        "requests",
        "click",
    ]

This has a number of problems compared to the proposed solution.

1. The consumer has to parse arbitrary Python code, which almost certainly means using the stdlib AST module, making it much harder for non-Python code to read the data, as well as making Python code that does so significantly more complex.
2. Python syntax changes every version. While the requirement data only uses a simple subset, the full file still needs to be parsed to *find* the requirement data.
3. This would reserve a specific global name (``__requires__``) in the above, potentially clashing with user code.
4. Users could assume that the value can be manipulated at runtime, and would get unexpected results if they tried to do so.
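
To make the first two points concrete, the following is a rough sketch of what a Python consumer of the rejected ``__requires__`` approach might have to do to read the list without executing the script. It is illustrative only: the function name ``read_requires`` is hypothetical, and the sketch assumes the value is a plain top-level assignment of a list of string literals.

```python
import ast
from typing import List


def read_requires(source: str) -> List[str]:
    """Find a top-level ``__requires__ = [...]`` assignment without running the code."""
    # The whole file must be valid syntax for the Python version doing the parsing.
    tree = ast.parse(source)
    for node in tree.body:
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id == "__requires__":
                # literal_eval only accepts literals, so a computed value
                # (for example a list built at runtime) raises ValueError.
                return list(ast.literal_eval(node.value))
    return []


EXAMPLE = '''
__requires__ = [
    "requests",
    "click",
]

import requests
'''

print(read_requires(EXAMPLE))
```

Even this short version depends on the full file parsing cleanly, and it has no equivalent outside Python short of writing a Python parser.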

Furthermore, there is no known use case where being able to read the list of requirements at runtime is needed.

It is worth noting, though, that the ``pip-run`` utility does implement (an extended form of) this approach. See `here <pip-run issue_>`_ for further discussion.

Should scripts be able to specify a package index?
--------------------------------------------------

The pip requirements file format allows a lot more flexibility than a simple list of requirements - it allows pip options, including specification of non-standard indexes. The requirements format is not standardised, though, and never will be in its current form, as it includes a lot of pip-specific functionality.

This proposal deliberately does not try to replicate the full feature set of a requirements file. It would be possible to implement "some" features, for example being able to add extra index locations. However, it is difficult to know where to draw the line, and not all consumers of this data may be passing the dependencies to pip (for example, a script vulnerability scanner).

If a script needs the full requirements file capabilities, it can be shipped with an accompanying requirements file. While this means the code can no longer be shipped as a single file, it has probably reached a point of complexity where "having everything in a single file" is no longer an appropriate goal anyway.

There is more discussion of this point in `the previously mentioned pip-run issue <pip-run issue_>`_.

What about local dependencies?
------------------------------

:pep:`508` does not allow local directories or files as dependency specifiers. This is deliberate, as such forms are not portable, and the reasoning applies equally to single file Python scripts that are being shared. For purely local use, however, it *is* possible that a script might want to depend on a local library.
While this specification does not allow this, it is not unreasonable for tools to loosen the specification to "anything that can be passed to pip as a requirement". In a practical sense, this is easier for tools to implement, as they can simply pass the requirements to pip and let pip do the validation.

To be compliant with this standard (and hence tool-independent) only :pep:`508` requirements may be used, though. A standard cannot reasonably defer part of its specification to an implementation-defined rule, like "whatever pip supports".

Why not use a more standard data format (e.g., TOML)?
-----------------------------------------------------

Simplicity. There is nothing in a list of requirements that can't be expressed in the form of plain text, with one requirement per line. Using a more capable format adds complexity in parsing and a higher learning curve for users, with no gain.

There are no obvious future enhancements to this format which might need a more complex format - as has already been noted, once a project gets complex, the next step is to transition to a ``pyproject.toml`` based structure, *not* to try to push the bounds of the single script format any further.

Open Issues
===========

None at this point.

Footnotes
=========

.. _pip-run issue: https://github.com/jaraco/pip-run/issues/44

Copyright
=========

This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.