Packaging console scripts with interpreter options

Hi,

This originally started in Packaging Python script with UTF-8 mode - #7 by jeanas.

It would be great if there were a way to distribute CLI applications on PyPI that can use UTF-8 mode regardless of what PYTHONUTF8 is set to in the user’s environment. Think of this as from __future__ import utf_8_mode, except that since UTF-8 mode can’t be turned on after launching Python, it has to be done on the CLI invocation (or from the environment). This would allow application developers to use UTF-8 mode before it becomes the general default.

I didn’t find any user-oriented packaging documentation showing how to do this. I suppose it’s not supported, but then, being a bit lost with the rapid pace of change in the packaging ecosystem, I’m not quite sure which issue trackers to raise this one on – pip, distlib, wheel, packaging, build, setuptools/flit/hatch/poetry, other??

I’m a bit confused as to whether pip wheel install scripts not respecting parameters after `#!python` · Issue #10661 · pypa/pip · GitHub is the same feature request – that issue asks installers to respect an existing shebang, but I am using the [project.scripts] table in pyproject.toml, where the format is module:main_function, and since the generated script calls main_function itself, I wouldn’t expect that to respect any shebang in the source file.

I think it’s a separate (although related) issue.

It’s difficult to know what the best answer is here. Adding metadata to the entry point in [project.scripts] has the problem that you can’t know what version of Python the user will be installing your project into, and interpreter command line flags can change between versions. Also, the entry point spec doesn’t allow for metadata like this, so it would need a new version of that spec, and that’s a non-trivial undertaking.

In many ways, I can see an argument that if you need this sort of control over how Python is invoked, you shouldn’t be using a simplistic feature like console scripts, but should be using a more complete and configurable “Python application bundler”. Unfortunately, console scripts are “good enough” for so many use cases that there’s been little interest in anything targeted at that middle ground. The other extreme is full bundlers like PyInstaller, which embed the whole Python interpreter and all your dependencies, but that’s overkill for a case like this.

I guess your application entry point could do something like re-invoke Python with the appropriate command line flags. That’s pretty clumsy, but would probably work as a practical way round the issue.
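For what it’s worth, a minimal sketch of that re-invocation idea might look like the following. The module name myapp.cli and both function names are hypothetical, and this assumes the application can be started with python -m; it is an illustration of the workaround, not something pip or the entry point spec provides.

```python
import os
import sys


def build_utf8_argv(argv, utf8_mode, executable, module):
    """Return the command line to re-run under UTF-8 mode, or None if it is already on.

    `module` is the (hypothetical) importable module that holds the CLI,
    so the re-invocation does not depend on the generated wrapper script.
    """
    if utf8_mode:
        return None
    # "-X utf8" enables UTF-8 mode (available since Python 3.7).
    return [executable, "-X", "utf8", "-m", module, *argv[1:]]


def main():
    """Entry point that would be referenced from the console script."""
    new_argv = build_utf8_argv(
        sys.argv, sys.flags.utf8_mode, sys.executable, "myapp.cli"
    )
    if new_argv is not None:
        # Replace the current process with a UTF-8 mode interpreter.
        os.execv(new_argv[0], new_argv)
    print("running with UTF-8 mode enabled")
```

The cost is a second interpreter start-up on every run. Also, os.execv behaves differently on Windows, so running the new command with subprocess.run and passing its return code to sys.exit may be the safer choice there.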


This is definitely something I’ve wanted as well, and we talked about it in relation to the (sadly) rejected PEP 690 Lazy Imports proposal. As Paul says, it’s complicated, but I do think that the entry point spec is the right place to try to make progress here, though I’m not sure what the path forward is.

If I understood correctly, the GitHub ticket is meant for this usage (not the entry points):

Console entry points are just a short-hand equivalent to a script like:

#!python
# -*- coding: utf-8 -*-
import re
import sys
from ENTRYPOINT_MODULE import ENTRYPOINT_VARIABLE
if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
    sys.exit(ENTRYPOINT_VARIABLE())

The wheel installer has some flexibility in the exact text of the script it synthesizes – the above is what pip currently uses – but the semantics are standard and you could include the fully written-out script in your wheel instead if you wanted.

It looks like PEP 427’s text is actually a bit underspecified for exactly how #!python lines are supposed to be handled – you could argue that writing #!python -Xutf8 is already spec-compliant. Though this is currently a moot point, since no wheel installers support it.

But anyway, if you want the shortest route to getting some kind of support for this, then it might be to get support for script shebang arguments into wheel installers and then ship an explicit script in your wheels. This would at least let you skip having to redesign entry points.

[Edit: okay from here I just learned that pip doesn’t generate .exe wrappers for scripts with #!python? I guess you’d have to fix this too. PEP 427 explicitly recommends that installers generate .exe wrappers for these scripts so this is arguably just a bug in pip.]

My comment that you linked to is wrong; at the time I apparently didn’t know (or I simply made a mistake) that pip generates exe wrappers for files in the scripts section of the wheel. There’s a relevant pip issue here which links to a PR here. That stalled because the conclusion was that the wheel spec needs clarifying, and once it is, the PR can be updated to match. But no-one has done anything about getting the wheel spec updated, so I can only assume no-one cares enough about old-style scripts (I don’t even know how many modern backends support including them in a wheel).

I don’t know whether it’s practical to bring support for script files up to a level that’s useful (for this issue in particular, or in general as an alternative to console script entry points) but it won’t happen unless someone champions the necessary changes.

Naive question(s)…

Is it something that should be possible? Is it acceptable that the author of the entry point decides what interpreter options are set or not? Shouldn’t it be left to the one doing the installation or the one launching the entry point?

For most of them, it definitely is. These are application-level options, and the console script entry point is the start of the application, so it’s fine. (Though personally, I would rather see pyz files be distributed as apps through some yet-to-be-popularised tool than encourage people to rely on wheels and console scripts.)
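As a rough illustration of the pyz route, the standard library’s zipapp module can already build such a file. In this sketch the directory layout, the hello_app module and the build_pyz helper are all made up for the example.

```python
import pathlib
import zipapp


def build_pyz(workdir):
    """Build a tiny .pyz application inside `workdir` and return its path."""
    src = pathlib.Path(workdir) / "src"
    src.mkdir()
    # A stand-in application module; a real project would copy its package here.
    (src / "hello_app.py").write_text(
        "def main():\n    print('hello from pyz')\n", encoding="utf-8"
    )
    target = pathlib.Path(workdir) / "hello.pyz"
    zipapp.create_archive(
        src,
        target,
        interpreter="/usr/bin/env python3",  # written out as the archive's shebang line
        main="hello_app:main",  # zipapp generates a __main__.py that calls this
    )
    return target
```

The interpreter value ends up as an ordinary shebang line, so putting options in it would run into the same platform-dependent argument handling discussed in this thread.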

But if there’s an option that may be controversial, the developer gets to make that choice. If their users don’t like it, they can go to them and ask for a change or an option.


Thank you all for the replies. At this point, while I can mostly understand what you’re talking about, commenting meaningfully is beyond my knowledge level, so I will let you all figure out what should come out of this discussion :)

Rather off-topic (at this point of the discussion), but the POSIX spec is also vastly underspecified around a shebang line with spaces, and #!python -Xutf8 is implemented as either “run command python with argument -Xutf8” or “run command python -Xutf8”, depending on the system you are on. This can be overcome (you can always just wrap the call in a shell script), of course; I just can’t help pointing out that you should never underestimate how untrodden this path is.


I suggested this previously for the optimization options (-O, -OO) on the setuptools issue tracker (which wasn’t really the right place), and then again during the lazy imports PEP discussion. I think it would also be useful for the safe path option (-P) that was added in 3.11, if that’s not already how entry points behave (I have never tested this).

IIRC passing an executable + single argument like #!/path/to/python -Xutf8 is portable; the issue is that on some systems, if there are multiple arguments after the executable then they get glommed together into one argument, so e.g. #!/path/to/python -Xutf8 -Wsome:filter runs the command /path/to/python with a single argument which is -Xutf8 -Wsome:filter.

I’m fairly sure there’s at least one system that doesn’t support anything other than a bare executable, and which treats #!/path/to/python -Xutf8 as requesting use of the executable "/path/to/python -Xutf8" (with a space in the filename). It might be pretty obscure, or even obsolete by now. You can get a long way these days by ignoring the outliers. But the point is that there’s no standard we can defer to, so anything that we define will have to come with a bunch of “only works if your system’s shebang implementation supports the following features” qualifications.
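To make the three behaviours concrete, here is a small toy model in Python. The function and the mode names are invented for illustration; it only mimics the interpretations described in this thread and is not how any particular kernel is implemented.

```python
def split_shebang(line, mode):
    """Model how a system might turn a shebang line into (executable, args).

    mode is one of (names are hypothetical):
      "single-arg" - everything after the executable is one argument
      "split"      - the remainder is split on whitespace
      "whole-line" - the entire remainder is taken as the executable path
    """
    if not line.startswith("#!"):
        raise ValueError("not a shebang line")
    rest = line[2:].strip()
    if not rest:
        raise ValueError("empty shebang line")
    if mode == "whole-line":
        return rest, []
    parts = rest.split(None, 1)
    executable = parts[0]
    tail = parts[1].strip() if len(parts) > 1 else ""
    if not tail:
        return executable, []
    if mode == "single-arg":
        return executable, [tail]
    if mode == "split":
        return executable, tail.split()
    raise ValueError(f"unknown mode: {mode!r}")


line = "#!/path/to/python -Xutf8 -Wsome:filter"
for mode in ("single-arg", "split", "whole-line"):
    print(mode, split_shebang(line, mode))
```

With a single option such as #!/path/to/python -Xutf8, the first two modes agree, which is why that form is the one most likely to be portable; only the third mode still breaks.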

I doubt we’ll get issues on any system where we can’t already say “sorry, Python isn’t supported on your system”. But conversely, I bet we will get the occasional bug report where we’ll need to take that stance…

Ultimately, practicality will beat purity. If we can design a useful feature around a level of shebang handling that most platforms (and all supported platforms) have, we probably should. But if we can find a solution that doesn’t rely on undocumented, inconsistently behaving platform features, that’s better.

I suppose an example is Ansible. It recently grew its own bespoke
shebang “parser” which it uses to try to guess what Python
interpreter to use when calling a module, but it entirely discards
the arguments, making it no longer convenient to have Ansible
modules which double as directly executable Python scripts in many
cases.

This could be a good resource: