Seriously though, pip (and Conda for that matter) is handicapped by how low-level it operates, and it’s extremely difficult for it to handle various operations correctly, such as partially upgrading an environment and auto-removing dependencies when uninstalling a package.
Thinking about this the other way around, maybe the best approach a competitor can take is to not care about dependencies at all, and advertise itself as a fast tool to populate a package into an environment. Tools like Pipenv and Poetry can use it instead of pip (since they already have dependency resolution anyway), which makes it easier to gain a user base and identify gaps.
If such a tool existed, we would probably choose to use it in Nixpkgs to install wheels instead of pip. No need to handle deps or download anything. Just install into a designated directory.
By exactly the same, does it also replicate some of pip’s suboptimal behaviour? E.g. pip does not actually check whether the wheel content matches RECORD as the spec mandates, IIRC.
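For reference, a rough sketch of what such a RECORD check could look like. The function names `record_digest` and `verify_wheel` are made up for illustration and are not pip's or anyone's actual API; the sketch also assumes every file except RECORD itself carries a hash, so signature files such as RECORD.jws would need special-casing.

```python
import base64
import csv
import hashlib
import io
import zipfile


def record_digest(data: bytes, algorithm: str = "sha256") -> str:
    """Return a RECORD-style hash: '<algorithm>=' plus urlsafe base64 without padding."""
    digest = hashlib.new(algorithm, data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"{algorithm}={encoded}"


def verify_wheel(wheel: zipfile.ZipFile, record_path: str) -> list:
    """Return a list of problems; an empty list means archive and RECORD agree."""
    # RECORD is a CSV file with rows of (path, hash, size).
    text = wheel.read(record_path).decode("utf-8")
    recorded = {}
    for row in csv.reader(io.StringIO(text)):
        if row:
            recorded[row[0]] = row[1] if len(row) > 1 else ""

    names = {info.filename for info in wheel.infolist() if not info.is_dir()}
    problems = []
    for name in sorted(names - {record_path}):
        expected = recorded.get(name)
        if expected is None:
            problems.append(f"{name}: not listed in RECORD")
        elif not expected:
            problems.append(f"{name}: no hash in RECORD")
        else:
            algorithm = expected.split("=", 1)[0]
            if record_digest(wheel.read(name), algorithm) != expected:
                problems.append(f"{name}: hash mismatch")
    # Entries that RECORD lists but the archive does not contain.
    for name in sorted(set(recorded) - names):
        problems.append(f"{name}: listed in RECORD but missing from archive")
    return problems
```

This reads each member fully into memory, which is fine for a check like this but not what a streaming installer would do.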
Could be helpful to rip it out as a separate library; this seems more general-purpose than a setuptools extension, which is what I think of the wheel package as.
A general purpose library to go from wheel -> installed package would be very useful and we have the name installer on PyPI, which would be pretty perfect for this library IMO.
I’m happy to do the work of moving this code / pip’s code into there, wrapping it into a proper reusable library, and making pip start using that. Any pointers beyond the ones above to help me get started on this?
I wonder if it’s useful to make the API sans-IO. Probably not since most IO things are filesystem calls, which work synchronously anyway, but there could be some design tricks to make the API easier to work with in an async context.
A project called installer sounds to me like it will be able to install almost anything, not just wheels. This is definitely out of scope now, but it’d be a good idea to properly scope the project before deciding on a name.
As far as standards are concerned, the only other format is PEP 517 sdists, and the sdist->wheel transition would definitely not fit in, or would at least add enough complexity to defeat the “a fast tool to populate a package into an environment” goal here.
Re: sans-I/O, IMO it would be appropriate, with a common-case I/O utility provided on top of it.
Seems like we definitely have different ideas about the project. I was thinking it would fit. The building part can be its own project (with installer depending on it), but the project name sounds to me like it should have such an API. But I can be convinced on this.
I mean, it does fit in… but I don’t want us to start with that scope/goal.
I do think “start with a smaller scope and grow” would work better for us… plus we can solve the shared install logic implementation problem much more easily than the shared common build logic implementation problem.
What I’m saying is that there’s no reason for installer to not be able to depend on packagebuilder once they both exist; but I’d like to be cautious till then.
A sans-I/O approach would also make it possible to support installing both from a .whl file and from a wheel exploded on disk.
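One way to read that idea, as a minimal sketch: the core logic only sees an abstract source of (path, stream) pairs, and the .whl and exploded-directory cases are thin adapters. All names here (`WheelSource`, `ZipWheelSource`, `DirectoryWheelSource`, `collect_hashes`) are hypothetical, and `collect_hashes` merely stands in for whatever the real core logic would be.

```python
import hashlib
import os
import zipfile
from typing import BinaryIO, Dict, Iterator, Protocol, Tuple


class WheelSource(Protocol):
    """Anything that can yield (archive-style path, readable binary stream)."""

    def iter_files(self) -> Iterator[Tuple[str, BinaryIO]]: ...


class ZipWheelSource:
    """Adapter for a .whl file opened as a ZipFile."""

    def __init__(self, archive: zipfile.ZipFile) -> None:
        self._archive = archive

    def iter_files(self):
        for info in self._archive.infolist():
            if info.is_dir():
                continue
            # The stream is closed when the consumer asks for the next item.
            with self._archive.open(info) as stream:
                yield info.filename, stream


class DirectoryWheelSource:
    """Adapter for a wheel already unpacked ("exploded") on disk."""

    def __init__(self, root: str) -> None:
        self._root = root

    def iter_files(self):
        for dirpath, dirnames, filenames in os.walk(self._root):
            dirnames.sort()  # deterministic traversal order
            for filename in sorted(filenames):
                full = os.path.join(dirpath, filename)
                relative = os.path.relpath(full, self._root).replace(os.sep, "/")
                with open(full, "rb") as stream:
                    yield relative, stream


def collect_hashes(source: WheelSource) -> Dict[str, str]:
    """Example of core logic that does not care where the bytes come from."""
    return {
        path: hashlib.sha256(stream.read()).hexdigest()
        for path, stream in source.iter_files()
    }
```

With this split, the core function can be tested against an in-memory source and never touches the filesystem itself.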
And I’m obviously up for helping out.
So, what are the next steps? Create a repository, agree on initial scope, and figure out the API? If so, should the repo be created in the PyPA org on GitHub, based on who is volunteering to help out?
All these sound reasonable to me. I think everyone agrees with the initial scope (to install a wheel). My concern was more about the name of the package, but that can be discussed until there is actually something to release.
PEP 427 already outlines the rough steps to install a wheel. Metadata readers are already mostly implemented by importlib-metadata, so what’s left, from what I can tell, is a WHEEL parser, a RECORD writer, a script writer, and a nice interface to streamline the usage (that last part sounds difficult already).
Another thing came to my mind reading the installation steps. The last step specifically talks about the uninstaller, which is not mentioned anywhere else in the document. Is this something we should take a look at, and potentially include in this tool as well? This “smart enough” part seems quite vague and under-discussed.
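The WHEEL parser at least is small, since the file uses the same key/value header format as the other metadata files. A minimal sketch, where `parse_wheel_metadata` and the returned dictionary keys are hypothetical names chosen for illustration:

```python
from email.parser import Parser


def parse_wheel_metadata(text: str) -> dict:
    """Parse the contents of a .dist-info/WHEEL file into a plain dict."""
    message = Parser().parsestr(text)
    version = message["Wheel-Version"]
    if version is None:
        raise ValueError("WHEEL file has no Wheel-Version field")
    return {
        "wheel_version": tuple(int(part) for part in version.split(".")),
        "generator": message.get("Generator"),
        # Root-Is-Purelib decides whether files at the archive root go to
        # purelib or platlib.
        "root_is_purelib": message.get("Root-Is-Purelib", "").strip().lower() == "true",
        # Tag may appear several times, one line per compatibility tag.
        "tags": message.get_all("Tag", []),
    }
```

An installer would typically refuse to continue if the major part of `wheel_version` is newer than what it understands.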
1. use random access to read WHEEL, RECORD and prepare for hash checking
2. generate series of (paths inside the wheel, and filelike to get the data) - include the ZipInfo for necessary metadata like +x bits and “isdir”
3. automatically check wheel integrity/consistency here (hook on the readable stream for each archive member, raise error if .close() and hash mismatch)
4. split paths between {package}.data/category/ /rest/of/path or ‘root of archive’ /rest/of/path for files not in the data directory
5. map from {package}.data/category or ‘’, to category name one of PURELIB, PLATLIB, SCRIPTS, … at this stage we can no longer tell the difference between files at ‘’ or {package}.data/purelib if Root-Is-Purelib
6. map from category name to installation target directory
7. join target directory with /rest/of/path
8. stream file contents to disk
9. rewrite legacy scripts etc.
10. RECORD
11. build pyc’s? the ‘smart enough to uninstall’ step just means any files you generate as a result of installing the wheel, also go into RECORD
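The path-splitting and mapping steps above are easy to get subtly wrong, so here is a rough sketch of them. The function names and the `scheme` argument are assumptions for illustration; `scheme` is simply a mapping from category name to target directory.

```python
import posixpath


def split_category(path: str, data_dir: str, root_is_purelib: bool):
    """Return (category, rest_of_path) for a path inside the wheel.

    `data_dir` is the '{package}.data' directory name. Files outside it sit at
    the archive root and are treated as purelib or platlib depending on
    Root-Is-Purelib, after which they can no longer be told apart from files
    under '{package}.data/purelib'.
    """
    prefix = data_dir + "/"
    if path.startswith(prefix):
        category, _, rest = path[len(prefix):].partition("/")
        return category, rest
    return ("purelib" if root_is_purelib else "platlib"), path


def target_path(path: str, data_dir: str, root_is_purelib: bool, scheme: dict) -> str:
    """Map an archive path to its installation path using a category scheme."""
    category, rest = split_category(path, data_dir, root_is_purelib)
    if category not in scheme:
        raise KeyError(f"unknown install category {category!r} for {path}")
    return posixpath.join(scheme[category], rest)
```

In practice `sysconfig.get_paths()` supplies directories for purelib, platlib, scripts and data, but it calls the header directory `include` rather than `headers`, so that key would need translating.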
If steps can be combined or optimized away then that should happen. If it is streaming, the installer should be prepared to roll back after an error, say, if the last file doesn’t match its hash.
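A minimal sketch of that rollback behaviour, assuming entries arrive as (relative path, bytes, expected SHA-256 hex digest) tuples. The function name and the tuple shape are invented for illustration. It only removes files written during this call; it does not restore files it overwrote or remove directories it created.

```python
import hashlib
import os


def install_with_rollback(entries, target_dir: str) -> list:
    """Write each entry under target_dir; undo all writes if any entry fails."""
    written = []
    try:
        for rel_path, data, expected_sha256 in entries:
            dest = os.path.join(target_dir, *rel_path.split("/"))
            os.makedirs(os.path.dirname(dest), exist_ok=True)
            # Track the path before writing so a partial file is cleaned up too.
            written.append(dest)
            with open(dest, "wb") as stream:
                stream.write(data)
            if hashlib.sha256(data).hexdigest() != expected_sha256:
                raise ValueError(f"hash mismatch for {rel_path}")
    except BaseException:
        for dest in reversed(written):
            if os.path.exists(dest):
                os.remove(dest)
        raise
    return written
```

Writing to a temporary location and renaming at the end would avoid most of the cleanup, at the cost of needing the staging area to be on the same filesystem.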
We want to change step #2 to improve compression so it would be helpful for that to be independent.
Since we’re defining scope here, we should enable signature validation as well, probably just as a hook inside Daniel’s step 1 while reading RECORD (because once we know that file is trusted, we can trust the hashes included in it). Give it the whole metadata directory and let it fail if it doesn’t like something.
It needs to be a hook though, with access to the rest of the wheel contents, as different platforms/users will have different needs here. I expect PyPI to require wheels be unsigned for now.
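To make the shape of such a hook concrete, here is a hypothetical sketch. Nothing here is an existing API: `read_trusted_record`, the hook signature, and `require_signature_file` are all made up, and the example hook only checks that a RECORD.jws file is present rather than validating any signature.

```python
import zipfile
from typing import Callable, Mapping, Optional

# A hook receives the metadata directory contents plus the open wheel, and
# raises an exception if it does not like something.
VerifyHook = Callable[[Mapping[str, bytes], zipfile.ZipFile], None]


def read_trusted_record(
    wheel: zipfile.ZipFile,
    dist_info: str,
    verify: Optional[VerifyHook] = None,
) -> bytes:
    """Read RECORD, giving the hook a chance to reject the wheel first."""
    prefix = dist_info + "/"
    metadata = {
        name[len(prefix):]: wheel.read(name)
        for name in wheel.namelist()
        if name.startswith(prefix) and not name.endswith("/")
    }
    if verify is not None:
        verify(metadata, wheel)  # expected to raise on failure
    # Only after the hook passes are the hashes in RECORD treated as trusted.
    return metadata["RECORD"]


def require_signature_file(metadata: Mapping[str, bytes], wheel: zipfile.ZipFile) -> None:
    """Example policy hook: refuse wheels that ship no RECORD.jws at all."""
    if "RECORD.jws" not in metadata:
        raise ValueError("wheel is not signed")
```

Passing `verify=None` keeps the default behaviour of accepting unsigned wheels, and platform-specific policies plug in without the core knowing about them.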