Why build wheels for pure Python projects?

mgorny · September 8, 2023, 6:04am

I think this is the key question, so I’m going to focus on it. While the use of two (or more) formats is largely historical, there is still value in having both.

Python packages vary a lot. Some are purely a set of .py files with static metadata. Some have some kind of dynamic metadata, e.g. a version read from git tags or from Python files. Some include “plain” C or Rust extensions. Some actually generate .py or C files using other tools. Some have non-Python dependencies. There are a lot of use cases to cover.
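To make “dynamic metadata” a bit more concrete, here is a minimal sketch of what reading the version from a Python file can involve. The helper name `read_version` and the example module text are made up for illustration; real build backends each have their own mechanism, and this is not how any particular one does it.

```python
import ast


def read_version(source: str) -> str:
    """Return the value assigned to a top-level __version__ in `source`.

    Hypothetical helper: parses the file instead of importing it, so no
    package code is executed.
    """
    for node in ast.parse(source).body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == "__version__":
                    # literal_eval only accepts literals such as "1.2.3"
                    return ast.literal_eval(node.value)
    raise ValueError("no top-level __version__ assignment found")


# Stand-in for the contents of a package's __init__.py (assumed, not real).
example_module = '"""Hypothetical package."""\n__version__ = "1.2.3"\n'
print(read_version(example_module))
```

The point is only that something has to run at build time to work the version out, whereas a wheel already carries the result.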

The current standards do their best to cover as many of them as possible, but there are drawbacks. PEP 517 makes it possible to provide a semi-consistent API for building a lot of different projects, effectively covering a lot of use cases. However, supporting these use cases requires a lot more complexity than your average “pure Python wheel” needs, so naturally having and distributing two formats makes things faster. A wheel has everything static, so it can be processed and installed almost immediately. An sdist is “dynamic”: installing it requires invoking the build system, which may have additional dependencies and can be slow.
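As a rough illustration of “everything static”: a wheel is a zip archive, and its metadata sits in a plain text file under `*.dist-info/`, so it can be read without running any project code. The sketch below builds a tiny wheel-like archive in memory (the project name `demo-pkg` and its contents are invented, and a real wheel would also contain a `RECORD` file) and then reads the name and version back.

```python
import io
import zipfile
from email.parser import Parser


def build_demo_wheel() -> bytes:
    """Create a tiny in-memory archive laid out like a pure Python wheel.

    Simplified on purpose: a real wheel also needs a RECORD file.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("demo_pkg/__init__.py", "__version__ = '1.0'\n")
        zf.writestr(
            "demo_pkg-1.0.dist-info/METADATA",
            "Metadata-Version: 2.1\nName: demo-pkg\nVersion: 1.0\n",
        )
        zf.writestr(
            "demo_pkg-1.0.dist-info/WHEEL",
            "Wheel-Version: 1.0\nGenerator: hand-written\n"
            "Root-Is-Purelib: true\nTag: py3-none-any\n",
        )
    return buf.getvalue()


def read_wheel_metadata(wheel_bytes: bytes) -> dict:
    """Read name and version straight from the archive; nothing is executed."""
    with zipfile.ZipFile(io.BytesIO(wheel_bytes)) as zf:
        name = next(n for n in zf.namelist() if n.endswith(".dist-info/METADATA"))
        # METADATA uses an email-header-like key/value format.
        msg = Parser().parsestr(zf.read(name).decode("utf-8"))
    return {"name": msg["Name"], "version": msg["Version"]}


print(read_wheel_metadata(build_demo_wheel()))
```

Getting the same information out of an sdist can mean installing the build requirements and calling the PEP 517 backend, which is where the extra time goes.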

I suppose you could try to devise a format that integrates both, but that would add complexity with no really clear advantage. Furthermore, covering different use cases with one file would inevitably make that file much larger. Just imagine that some C extensions link to large static libraries: the sdist is relatively small, but the wheels are huge. The opposite could also happen: if the sdist contains tests, it can be much larger than the wheels. Shoving everything into every wheel (and you need many wheels for many different targets) would cause a lot of duplication.