About .exe wrappers created by frontends when installing wheels on Windows?

I’m working on improving Mercurial packaging and on using modern Python packaging methods.

Currently, we have an issue on Windows with pipx and uv because Mercurial uses a script (hg · branch/default · mercurial / mercurial-devel · GitLab) rather than a [project.scripts] entry point, and such scripts are not well supported.

Moreover, Mercurial’s setup.py contains some code to build a .exe wrapper from a file exewrapper.c. IIUC, the .exe wrapper should not be built when the wheel is created (by the PEP 517 backend) but at install time (by the frontend). Is that correct? If so, we would not need the special exewrapper.c and could let pip, pipx, or uv do that for us.

However, there are two things which seem to be important for Mercurial and that I don’t know how to do with a console_scripts entry point:

  1. PYTHONLEGACYWINDOWSSTDIO=1 has to be activated (mercurial/exewrapper.c · branch/default · mercurial / mercurial-devel · GitLab). I don’t know if we could do that from Python in the entry point function.

  2. hg.exe contains a Windows manifest (setup.py · branch/default · mercurial / mercurial-devel · GitLab) and I don’t see how to get that with a console_scripts entry point.

However, I have to admit that I don’t know anything about Windows manifests. The first point (PYTHONLEGACYWINDOWSSTDIO) seems the more critical one.

There should be nothing wrong with building your own entry point as opposed to using console_scripts - it’s really just a convenience that it exists, not a requirement. Apps with more advanced needs such as yours deserve to have their own executable, and distributing them in the project-1.0.data/scripts directory should be supported.

I’m not 100% clear what warning you’re getting, or what uv and pipx are choosing not to do, but if you’re bundling an executable that does the right thing in the right part of your wheel, then I’d push the issue back to them.

As for those two things you need, the manifest in particular is going to ensure that long paths (>260 chars) are enabled (if the OS has them enabled, which I still hope will be the default one day). I suspect that’ll be pretty important to you, and there’s no way to do it other than with the manifest.
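(If you want to check whether the OS-wide switch is on, it’s a registry value; something like this should work:)

import winreg

# The manifest only takes effect when this OS-wide switch is set to 1.
key_path = r"SYSTEM\CurrentControlSet\Control\FileSystem"
with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
    value, _ = winreg.QueryValueEx(key, "LongPathsEnabled")
print("Long paths enabled OS-wide:", bool(value))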

The legacy stdio can be emulated closely enough by overriding sys.std* at startup, something like this (please validate thoroughly, you’ll have better coverage than I do). I’m 99% sure there’s nothing else that flag does, though it’s been a long while since I thought about it. Note that mbcs is only a best guess - I’m not sure we have an API anymore that’s reliably defined as “get the console encoding”, since the idea is to use the Unicode APIs.

import io
import sys

# Re-wrap the raw stream file descriptors with the console code page ("mbcs"),
# approximating what PYTHONLEGACYWINDOWSSTDIO=1 does at interpreter startup.
sys.stdin = io.TextIOWrapper(io.FileIO(sys.stdin.fileno(), closefd=False), encoding="mbcs")
sys.stdout = io.TextIOWrapper(io.FileIO(sys.stdout.fileno(), "wb", closefd=False), encoding="mbcs", errors="replace")
sys.stderr = io.TextIOWrapper(io.FileIO(sys.stderr.fileno(), "wb", closefd=False), encoding="mbcs", errors="replace")
# Don't replace sys.__std*__, or if you do, stash them somewhere so they don't deallocate and close the streams

Thanks a lot for your very useful answer.

I quickly checked the Python code for legacy stdio and it seems to work fine!

Does this mean that every Python application that needs long paths (>260 characters) on Windows has to provide a .exe with a manifest? So tools like Black do not support such paths? It seems a bit surprising.

Is there no way to tell tools like pip/pipx/uv to include a manifest in the .exe files they create?

The manifest seems like it’s generic, and if so there’s no reason it couldn’t be added to the standard wrapper exe. For pip/pipx, the wrappers come from distlib, so that’s where the change would need to be made. I don’t know where uv gets its wrappers from.

As to why it’s not there already, probably just no-one needed it until now.

If they’re loading Python’s DLL directly, yes. If they’re just finding and launching the regular python.exe then they’ll only need it if Python itself is installed in a long path (otherwise it can’t find it to launch it), but that’s pretty uncommon.

Yeah, there shouldn’t be any harm. I do recall looking into some strange issue a month or so ago that turned out to be caused by the presence of a manifest, but I can’t think of the details right now. I’m sure adding it into distlib would find out pretty quickly if that’s a common issue.

… although, as you point out above this, it’s not needed because the standard wrapper runs python.exe as a subprocess, which already has the necessary manifest. My mistake, I’d missed this detail.

Thanks a lot. All this is very interesting for me, as I am discovering Windows and .exe wrappers.

To summarize the answers:

Having our own entry point is fine but it might not be necessary since

  1. The effect of PYTHONLEGACYWINDOWSSTDIO can be obtained from Python (see the sketch below)

  2. Long paths should work when using a console_scripts entry point, since the generic wrapper .exe created by distlib (used by pip and pipx) launches python.exe in a subprocess, and python.exe has the correct manifest

We will be able to try that for Mercurial.
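For instance, assuming an entry point declared as hg = "hgmain:main" in [project.scripts] (names purely illustrative), the function could apply Steve’s stream wrapping before dispatching:

import io
import sys

def main():
    # Reproduce the effect of PYTHONLEGACYWINDOWSSTDIO before any I/O happens.
    sys.stdin = io.TextIOWrapper(io.FileIO(sys.stdin.fileno(), closefd=False), encoding="mbcs")
    sys.stdout = io.TextIOWrapper(io.FileIO(sys.stdout.fileno(), "wb", closefd=False), encoding="mbcs", errors="replace")
    sys.stderr = io.TextIOWrapper(io.FileIO(sys.stderr.fileno(), "wb", closefd=False), encoding="mbcs", errors="replace")

    # Hand over to Mercurial; the module and function here are illustrative.
    from mercurial import dispatch
    dispatch.run()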

Side note

I still don’t understand how these .exe wrappers work on Windows and how they are used by Python. I see that they can be created without any compiler installed, which is counter-intuitive to me. I don’t understand whether they contain the absolute path to the Python executable to be used. Can they just be copied to another location? Is the .exe wrapper created by pipx in $HOME/.local/bin just a copy of the .exe wrapper in the venv?

Can we somehow read the information in a .exe wrapper to know what is called (similar to head $(which black) on Unix)?

It seems to me that venv also uses a kind of .exe wrapper for python.exe in a virtual environment (the two files do not have the same size).

I’d like to understand these things in particular because of a performance issue I encounter on a Windows computer, where I observe a delay of approximately 0.5 seconds before starting applications with .exe wrappers (see Slow startup on Windows for virtual environments - #17 by paugier).
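For reference, here is a rough way to measure such a delay (the paths are examples, not my exact setup):

import subprocess
import time

# Compare startup cost of the interpreter itself vs. a wrapper .exe.
for exe in (r".venv\Scripts\python.exe", r".venv\Scripts\hg.exe"):
    t0 = time.perf_counter()
    subprocess.run([exe, "--version"], capture_output=True)
    print(exe, f"{time.perf_counter() - t0:.2f} s")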


The distlib ones are modified on install to include a script file at the end, which the executable knows how to find and launch. This includes the absolute path to the venv’s python.exe.

The venv’s own python.exe isn’t modified, but knows how to locate a nearby pyvenv.cfg and use that to launch the correct python.exe.
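Conceptually, that lookup is something like this simplified sketch (the real logic lives in C and handles more cases):

from pathlib import Path

def find_base_python(exe: Path):
    # pyvenv.cfg sits next to python.exe or one directory up; its "home" key
    # points at the directory containing the base interpreter.
    for cfg in (exe.parent / "pyvenv.cfg", exe.parent.parent / "pyvenv.cfg"):
        if cfg.is_file():
            for line in cfg.read_text("utf-8").splitlines():
                key, _, value = line.partition("=")
                if key.strip() == "home":
                    return Path(value.strip()) / "python.exe"
    return None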

Process launch is unfortunately very expensive on Windows, and can be made drastically (5-10x) worse by certain antivirus software. We avoid the worst of these in Python by not modifying the executables, but because distlib does modify theirs it’s likely that they’ll trigger a deep virus scan each time they’re launched (the various exe packers usually have the same issue).

With a properly embedded Python, you can avoid all of these for your app (especially if you include enough native implementation for the critical part of the fastest path). But if you’re relying on users installing your app into their own runtime, there’s really no option but to launch a couple of executables.

The “wrapper” is a precompiled binary shipped with distlib. The source code is here. When an entry point executable is created, it is made up of the wrapper, then a shebang line, then a zipfile containing Python code as a __main__.py file. The wrapper searches its own executable for the shebang, and launches the Python interpreter in the shebang with the executable as its argument. Python can run zipfiles, and zipfiles can have arbitrary data prepended, so the end result is that Python runs the code in the zipfile. That code imports the entry point function defined by the metadata, and runs it.
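To make that layout concrete, here is roughly how such a file could be assembled (the launcher filename, paths, and entry point are all illustrative):

import io
import zipfile

# The precompiled launcher that ships with distlib (filename illustrative).
with open("t64.exe", "rb") as f:
    launcher = f.read()

# Shebang pointing at the venv's interpreter (example path).
shebang = b"#!C:\\myvenv\\Scripts\\python.exe\r\n"

# __main__.py imports the entry point function from the metadata and calls it
# ("mypkg.cli:main" is a made-up entry point).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("__main__.py", "from mypkg.cli import main\nmain()\n")

# The installed script is simply launcher + shebang + zip concatenated.
with open("myscript.exe", "wb") as f:
    f.write(launcher + shebang + buf.getvalue())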

Yes, it’s the absolute path. That means that by design, entry point executables can be copied without moving the venv that contains the application code. And yes, that’s what pipx does to put the executables in ~/.local/bin. (On my PC, it actually uses symlinks rather than copies, but the effect is the same).

Sort of, yes. You can search for the shebang and read it, but it’s not at the start and the file is binary, so it’s a bit fiddly. The following should work, but it could do with some tidying up:

import sys

with open(sys.argv[1], "rb") as f:
    data = f.read()

# The shebang is appended after the launcher's PE image, so search backwards
# from the end of the file for the last b'#!' marker.
_, _, shebang = data.rpartition(b'#!')
path, _, _ = shebang.partition(b'\n')

# strip() guards against a trailing CR if the shebang line ends with CRLF.
print(path.strip().decode("utf-8"))
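Saved as, say, read_shebang.py (the name is arbitrary) and run with the path to a wrapper .exe as its argument, it prints the interpreter path embedded in the wrapper.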

The venv redirector (as you say, it’s not the same as a wrapper) is very similar in how it works, but does a different job. The details aren’t documented (and so subject to change) but basically it launches the “base” Python interpreter in a way that tells it it’s a venv.

Starting multiple processes could be a performance issue - it’s certainly not without cost. But I’m not aware of any other way of launching a Python script that doesn’t have worse downsides (i.e., they don’t work everywhere that an exe does).

I don’t think anyone has done any work on ensuring the entry point wrapper has minimal overhead (although it’s pretty trivial, so there’s probably not much performance to be gained). It’s generally not been seen as a problem in practice.

(Steve beat me to a better response on performance, but I’ll add to this particular comment)

For an application, this is indeed the best way to get high performance and a seamless experience. However, it’s not something that the packaging ecosystem supports directly, as packaging tools are focused on building and installing libraries, not applications.

On reflection, I guess your exewrapper.c code is intended as that sort of “direct execution” approach, but rather than embed your own copy of Python, you hunt out the Python DLL for the environment Python is installed in. That’s another approach, but in some ways it’s the worst of both worlds. You don’t get the tool support that entry point scripts give you, but you still have to deal with the fragility of relying on a Python environment that’s not in your control.


There is a trend towards tools that install Python applications from wheels (pipx, uv tool, pixi global, …). On Linux, the UX is starting to be very good. I was hoping it could be the same on Windows, which would simplify things for maintainers.

Even on Windows, it is starting to be a nice alternative which “works” as expected. The only issue so far is the startup delay related to security software checking the applications at each call (which does not seem reasonable).

I also observe this startup delay with python.exe in a virtual environment, which is quite bad (but the venv is created from a conda-installed Python, so I need to check whether it is different with a vanilla Python installed from the Microsoft Store).

It would be great to find a nice solution to this delay issue.

Currently, I only have access to a Windows computer that I can’t really control: I can’t install anything from the Microsoft Store, nor install Visual Studio, nor temporarily disable the antivirus. Quite a common situation, but not convenient for experimenting!

I should be able to do more on a Windows computer next week, so I’m going to come back to this subject to see whether the situation can be improved from “upstream”.

IIUC, it is because the wrapper is modified on install that the virus scan is triggered each time it is launched. Would there be other alternatives, or a way to tell the antivirus “check me once really thoroughly and then consider that I’m fine”? Would it do the same if the wrapper was compiled locally?

The best way to avoid the delay is to use a signed executable (with a trusted certificate). Of course, once it’s signed, modifying it will make the signature invalid, which is even worse than not having one, but in general once an executable is signed the AV will trust it (and possibly cache it).

Compiling locally doesn’t help, unfortunately. There was one attempt to make this work (look up Mark of the Web or MOTW if you’re interested in the history), but it turns out that malicious code can simply copy itself and then be treated as if it were compiled locally :wink: There’s nothing in the OS that can reliably track the source, which is why code signatures are the best we have.

Yeah, some of us don’t like this trend that much :smiley: There have been discussions about making .pyz files more easily redistributable, but that’s likely not going to be the solution either.
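(A .pyz is a zip application runnable directly by Python; for context, the stdlib zipapp module builds them:)

import zipapp

# Bundle a directory containing __main__.py into a single runnable archive
# ("myapp/" is a placeholder).
zipapp.create_archive("myapp/", "myapp.pyz", interpreter="/usr/bin/env python3")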

Ultimately, PyPI is popular because it’s the least amount of work for the developer (and the most amount of work for the user, in many cases[1]). If you want to release a performant and native-feeling application for multiple platforms, you need to do work for each of those platforms. Just putting a set of binaries on someone else’s distribution platform is much faster.


  1. Some tools are entirely appropriate to be distributed via PyPI. ↩︎


I can remember the long discussions about exactly why it had to be a .exe for the entry point wrappers. What I am wondering, though, is whether it has to be a modified .exe file with the shebang and zip literally inside the .exe.

In principle, could the package installer not put the shebang and zip in a separate file alongside the .exe, like Scripts/ruff.exe and Scripts/ruff.zip?

At runtime the .exe could use its own file path and name to locate the .zip file. The .exe code would be sort of like:

from pathlib import Path

# Pseudocode for what the wrapper would do; run_zip_file_with_shebang is a
# made-up placeholder, and __file__ stands in for the executable's own path.
exepath = Path(__file__)
zippath = exepath.parent / (exepath.stem + '.zip')

run_zip_file_with_shebang(zippath)

Then the actual .exe that is installed would always have the same contents and could be signed for AV. Or would it create just as much of an AV problem because AV would scan the zip file?

This kind of .exe dispatches on its own path, so it would not be movable - or at least, if you wanted to copy it somewhere, you would also have to copy the associated zip file.


It used to look for an adjacent script file and launch that - no ZIP required. I don’t remember why it was changed, probably either to make it easier to move them around (no script file to lose) or to prevent people modifying the scripts and wondering why they broke.

AV will scan ZIP files and .py files, though they’re likely going to be faster (fewer signatures to compare) or skipped/deferred (e.g. on Dev Drive they’ll be delayed and you’ll be told later if it was malicious).


It could, yes. That’s what the old setuptools wrappers did - foo.exe with a foo-script.py alongside it. So it’s certainly possible. But (for whatever reason, I’m not 100% sure why) that approach wasn’t particularly popular, and the appended-zip approach replaced it.

In general, I think that trying to placate AV is a lost cause. And I’ll repeat what I said before - I’ve not seen reports of this sort of slowdown, so I’m not convinced it’s a general problem with AV, maybe it’s just the software @paugier is using that has this issue.

I’m pretty sure that was one of the problems with the setuptools-style solution.

I’ve seen enough data to know that the problem exists and disproportionately impacts the demographics who don’t report issues to OSS projects. There’s also a general awareness that AV is the cause, and so people are less likely to report it (unless you happen to work on the OS responsible for letting the AV interrupt your work :wink: ).

Annoyingly, the AV I work under (which I assume is fairly standard for Windows enterprise users) adds a solid 2-5 seconds delay for any new unsigned executable. It’s an incredible pain when you’re developing software, since each time you rebuild it’s a new unsigned executable, but I’m also never going to report it anywhere except my IT department (or sometimes if I happen to find the right engineer, but that’s more like “complaining with hope” than reporting![1])


  1. Fun side story - I did this once in a VP review and accidentally kicked off a multi-year effort to improve things, which ended up leading to Dev Drives being added, as well as a number of OS bugs being fixed and Visual Studio’s installation getting significantly faster :smiley: ↩︎


Cool, thanks for the confirmation. I still think that trying to placate AV isn’t a productive use of open source maintainers’ time. From what you say, the only solution is to sign every executable, and as far as I know, the vast majority of OSS developers won’t have a signing key in the first place.

If someone wants to work on an alternative solution, then that’s obviously fine. But I doubt an “executable plus associated script file” solution would get much traction. I know I’d argue against pip adopting such an approach.


Probably. It does sound, in this particular case, like the AV is poorly implemented if it adds 0.5 seconds of overhead every time the same executable is run. There should be some way to cache something so that at least a second run takes much less than 0.5 seconds.

I was imagining that since the executable doesn’t need to be modified, it could always just be copied from somewhere. Then only one person would ever need to build that executable, and there might be some particularly trustworthy Microsoft-and-Python type person who could make the executable and sign it with their key.


As I said, I think there are enough downsides to the “exe plus Python script” approach that it’s not going to work. If there was a way for a signed executable to re-sign itself when data was added to it, that might work - but I’ve no idea if that’s possible.

For me, the key features we need are:

  1. A single executable that can be copied, symlinked, etc., and still work.
  2. We don’t require every application developer publishing Python applications to own a signing key.

How do languages like Rust and Go handle this? They produce executables, and presumably the average open source Rust/Go project doesn’t sign those executables. Why is Python any different here?

The only difference I can see is that for a developer who has a signing key and wants to sign their Python application, entry point scripts aren’t a good match, because the executable is constructed at install time rather than at build time. In that case, properly embedding Python in the application, as Steve described here, is probably a much better approach anyway.
