Discouraging use of epoch segments in versions

Since we were talking about simplifying the version specification recently, I would like to bring up another proposal: discouraging (non-zero) epoch segments in versions. “Discouraging” as in stating in the documentation that their use is discouraged, since we can’t really remove the support.

As a quick recap, epochs are basically a way of “rebooting” versions. For example, if a project starts using CalVer and then decides it was a bad idea after all, it can increment the epoch and start over, so that 1!1.0 evaluates as a newer version than 2026.1. To the best of my knowledge, epochs are used extremely rarely right now. According to the “Reimplementing PEP 440” post, back in 2022 versions with an epoch accounted for 0.0002% of downloads from PyPI. I asked @konstin about it about two weeks ago, and he confirmed that running the query on a 1% tablesample of the current data gives the same result.
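To make the ordering concrete, here is a minimal sketch of how an epoch dominates the comparison. It is a simplification written for this thread, not the reference implementation: the helper `sort_key` and its regular expression are hypothetical, only the epoch and the numeric release segment are handled (no pre-, post-, dev-release or local segments), and trailing zeros are not padded the way the specification requires.

```python
import re

# Simplified pattern: optional "N!" epoch followed by dot-separated integers.
# Pre-, post-, dev-release and local segments are deliberately not supported.
_SIMPLE_VERSION = re.compile(r"^(?:(\d+)!)?(\d+(?:\.\d+)*)$")


def sort_key(version: str) -> tuple[int, tuple[int, ...]]:
    """Return (epoch, release) so that the epoch is compared first."""
    match = _SIMPLE_VERSION.match(version)
    if match is None:
        raise ValueError(f"unsupported version: {version!r}")
    epoch = int(match.group(1) or 0)  # a missing epoch means epoch 0
    release = tuple(int(part) for part in match.group(2).split("."))
    return epoch, release


versions = ["2026.1", "1!1.0", "2025.12", "1!0.9"]
ordered = sorted(versions, key=sort_key)
print(ordered)  # ['2025.12', '2026.1', '1!0.9', '1!1.0']
```

Every version with epoch 1 sorts after every version with the implicit epoch 0, regardless of how large the CalVer release numbers are.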

Admittedly, “rarely used” doesn’t mean “not useful”, especially if the feature addresses a very specific use case. However, it does imply that epochs don’t get much visibility among users, and that they probably don’t get the same level of test coverage as more regular versions. And given how many times I’ve seen packages making wrong assumptions (such as “version is dot-separated numbers”) and crashing when you happened to install an RC of their dependency, I dare say suddenly introducing an epoch in a high-profile package is bound to cause some issues. Searching further, I’ve stumbled upon “Python version epochs are broken”, though to be honest I’m not convinced that the author wasn’t testing undefined behavior in the first place.
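As an illustration of the kind of wrong assumption meant here, the following sketch shows a hypothetical `naive_parse` helper (not taken from any real package) that treats a version as dot-separated numbers. It works for plain releases and fails for both a release candidate and a version with an epoch.

```python
def naive_parse(version: str) -> tuple[int, ...]:
    """Hypothetical parser that assumes 'dot-separated numbers' only."""
    return tuple(int(part) for part in version.split("."))


def try_parse(version: str) -> str:
    """Report whether the naive parser copes with the given version."""
    try:
        return f"ok: {naive_parse(version)}"
    except ValueError as exc:
        return f"crash: {exc}"


# A plain release, a release candidate, and a version with an epoch.
for candidate in ["1.2.3", "1.2.0rc1", "1!1.0"]:
    print(candidate, "->", try_parse(candidate))
```

Only the first candidate parses; the other two raise `ValueError` inside `int()`, which in real code would surface as a crash at import or install time.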

In the upcoming PEP 817, we’ve decided not to support versions with an epoch other than zero for “ABI dependencies”, since the complexity outweighed the benefit.

With my personal Gentoo hat on, epochs are a true pain to downstreams. Gentoo versions do not support epochs, and should a project start using them, we’d effectively have to implement some kind of perpetual workaround for it. To my understanding, Debian does support epochs but prefers avoiding them. I’m sure there are other distributions that don’t have an equivalent concept.

If you have to reboot versioning, no solution is really good. However, in my opinion epochs seem more promising than they really are, and may actually cause more trouble than the previous version scheme did. In the end, keeping your versions incremental, even if it means jumping up to 3000 like Salt did, is a cleaner solution and should be recommended over using epochs, if only for the sake of simplicity and portability.

For this reason, I want to propose adding a note to the version specification that epochs are discouraged as they are likely to cause issues with other (admittedly buggy/limited) software, and that using strictly incremental versions is preferable.


Author of the post you linked here! Thanks for referencing it as part of this discussion.

I agree with your perspective on this. I think in my post I was mostly trying to address that CalVer enthusiasts often point to epochs as the path back to SemVer if you wished to do so. I was almost certainly poking around in undefined behavior, but you find those edges pretty quickly when trying to figure out how it works. The definition and implementation of epochs are pretty flawed and I personally don’t think it’s really a viable option.

I’ve written extensively about the challenges of CalVer, but one piece of advice I have for folks moving to CalVer is to go with 26.1.0 rather than 2026.1.0 because at least you can roll forward to something like 70.0.0 rather than 3000.0.0 if you do want to go back.
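A small sketch of why the two-digit year leaves more room, assuming plain numeric releases compared as integer tuples (the `release_tuple` helper is hypothetical and ignores everything except dot-separated numbers):

```python
def release_tuple(version: str) -> tuple[int, ...]:
    """Split a plain numeric version into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


short_calver = release_tuple("26.1.0")
long_calver = release_tuple("2026.1.0")

# Restarting at 1.0.0 without an epoch would sort as older in both schemes.
print(release_tuple("1.0.0") < short_calver)     # True
# Rolling forward to 70.0.0 is enough after a two-digit-year CalVer...
print(release_tuple("70.0.0") > short_calver)    # True
# ...but not after a four-digit-year CalVer, which needs something like 3000.
print(release_tuple("70.0.0") > long_calver)     # False
print(release_tuple("3000.0.0") > long_calver)   # True
```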


Download rate is probably a good proxy for how often installers have to deal with epochs, but it might be worth noting that the proportion of files with epochs is about two orders of magnitude higher (0.018%):


warehouse=> SELECT
    AVG((filename LIKE '%!%')::int) AS ratio_with_epoch,
    AVG((filename NOT LIKE '%!%')::int) AS ratio_without_epoch
FROM file_registry;
    ratio_with_epoch    |  ratio_without_epoch
------------------------+------------------------
 0.00018466395265386526 | 0.99981533604734613474
(1 row)

I think this helps your case though: filenames with epochs are downloaded at a lower proportional rate than those without, so they are both an edge case and under-used by consumers.
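For reference, the “two orders of magnitude” comparison is simple arithmetic on the two figures quoted in this thread: the file ratio from the query above and the 0.0002% download share from the 2022 post. A quick check, with variable names chosen only for this sketch:

```python
import math

file_ratio = 0.00018466395265386526   # share of files with an epoch (query result)
download_share_percent = 0.0002       # share of downloads with an epoch, in percent

file_share_percent = file_ratio * 100
factor = file_share_percent / download_share_percent

print(f"{file_share_percent:.3f}%")      # 0.018%
print(round(factor))                     # 92
print(round(math.log10(factor), 1))      # 2.0
```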


I’m +1 on adding a note to the spec that epoch versions are discouraged, because they aren’t well supported outside Python packaging (for example, by OS and third-party distros), and because they have historically seen so little use that even Python packaging tools may not support them well.


In Fedora, we do support epochs in RPMs, but I am pretty sure all our software that deals with Python package metadata ↔ RPM metadata would blow up. I support this proposal.

EDIT: Apparently, the software handles epochs. But I would not say it’s well tested in reality. pyreq2rpm/pyreq2rpm/pyreq2rpm.py at 2bd45ed81dfcdeb2b3d64c6e3eb46c1d1f53ec69 · gordonmessmer/pyreq2rpm · GitHub


rpm has had epochs “forever”, but the current documentation pretty strongly discourages their use. The distro packaging problem of course adds an extra layer: distros have to present a rational ordering within their own universe, even if some upstreams may not stick to a consistent versioning scheme in theirs.

Oh, nice data. Would you be able by any chance to get the number/percentage of projects that use them?

Sure:

warehouse=> WITH project_stats AS (
    SELECT
        r.project_id,
        -- Returns true if ANY file in the project has '!', false otherwise
        BOOL_OR(rf.filename LIKE '%!%') AS has_epoch
    FROM release_files rf
    JOIN releases r ON rf.release_id = r.id
    GROUP BY r.project_id
)
SELECT
    AVG(has_epoch::int) AS ratio_with_epoch,
    AVG((NOT has_epoch)::int) AS ratio_without_epoch
FROM project_stats;
      ratio_with_epoch      |  ratio_without_epoch
----------------------------+------------------------
 0.000077499020498490921851 | 0.99992250097950150908
(1 row)

warehouse=> select count(*) from projects;
 count
--------
 728066
(1 row)

“To my understanding, Debian does support epochs but prefers avoiding them.”

Debian’s package specification does have an epoch field, but these are generally “downstream” epochs and don’t reflect upstream version epochs. That is to say, if a project introduced a PEP-440 epoch in its upstream version string, the package in Debian would probably also need an epoch (depending on the reason it was used), but may not have the same epoch value (for example, if the package maintainer had previously introduced an epoch to solve some unrelated downstream versioning problem).


Thanks. So that’s… approximately 56 projects, is that correct? More than I expected.

Fewer:

warehouse=> SELECT COUNT(DISTINCT r.project_id)
FROM release_files rf
JOIN releases r ON rf.release_id = r.id
WHERE rf.filename LIKE '%!%';
 count
-------
    54
(1 row)
Here's the list of projects in case anyone's interested (I appear to be 3 short).
antlerinator
cabinet
cfn-review-bot
cg-atv2-python-insert
cg-feedback-helpers
cg-flake8-reporter
cg-pytest-reporter
composition
dap-types
efilter
foursquare
grafana-foundation-sdk
hexdoc
hodor
isodatetime
javascript
javascript-fix
launcher-menus
metomi-isodatetime
ohai
ppsi
pspman
psprint
pycloudlib
pyproject
remindmail
rg-javascript
scipion-ed
scipion-ed-dials
SetSolver1
shared-atomic-enterprise
simplepycons
spalloc
spalloc-server
SpiNNaker-DataSpecification
SpiNNaker-PACMAN
SpiNNakerGraphFrontEnd
SpiNNakerTestBase
SpiNNFrontEndCommon
SpiNNMachine
SpiNNMan
SpiNNStorageHandlers
SpiNNUtilities
sPyNNaker
sPyNNaker-visualisers
sPyNNaker7
sPyNNaker8
sPyNNakerExternalDevicesPlugin
tree-sitter-talon
unpacking
websocket-javascript

It shows that some epoch-using packages are part of a family of packages that all use epochs.
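As a side note, the gap between the estimate of roughly 56 and the exact count of 54 can be checked with a bit of arithmetic. The sketch below only uses numbers quoted in the thread. The explanation it suggests is an assumption rather than something confirmed here: the ratio query joins through `release_files`, so its denominator is probably the number of projects with at least one uploaded file, not the full `projects` count.

```python
ratio_with_epoch = 0.000077499020498490921851  # from the per-project ratio query
total_projects = 728066                         # from SELECT count(*) FROM projects
exact_count = 54                                # from the COUNT(DISTINCT ...) query

# Estimate obtained by multiplying the ratio by the total number of projects.
estimate = ratio_with_epoch * total_projects
print(round(estimate))  # 56

# Denominator implied by the exact count. Assumption: this is the number of
# projects that have at least one release file.
implied_denominator = exact_count / ratio_with_epoch
print(round(implied_denominator))
```

The implied denominator comes out a little under 700,000, which would be consistent with some projects having no uploaded files.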


Thanks, this is interesting. I’ve taken a quick look through most of the listed packages, and I’d conclude that:

  1. The most common use case for epochs is indeed switching back from CalVer to some kind of short x.y.z versioning (possibly SemVer). There was also a case of switching from YYYYMMDD CalVer to YYYY.MM.DD CalVer.
  2. A few projects have been using epochs to restart X.Y.Z versioning. Most of the time, it was used only once, but a few have been resetting versions multiple times. The relevant X values were always low.
  3. There’s been a project that used epochs to switch from X.Y.Z to CalVer (i.e. unnecessarily, since CalVer evaluated as newer anyway).
  4. There’s been a bunch of projects that used a single non-zero epoch. I think in some cases this may have been because they were supposed to match the versions (with epoch) of some other project.
  5. A few projects have been using epochs as a kind of version separator, i.e. effectively having E!X.Y.Z versions. In one case, it seems that multiple 0!X.Y.Z, 1!X.Y.Z, etc. wheels were published simultaneously.
  6. Finally, there are a few cases where people have simply been playing with versions or possibly testing epoch behavior.

It looks like we can reasonably deprecate them, and discourage their use without being too disruptive to the ecosystem.

That feels like a small enough thing that it doesn’t need a PEP, and a PR to packaging.python.org w/ a link to this thread would be sufficient to update the specification here.

There’s maybe a wider question of actively dropping/removing support – that would need a PEP but seems unnecessary for the motivation in OP here. :sweat_smile:


Filed Discourage use of version epochs by mgorny · Pull Request #1994 · pypa/packaging.python.org · GitHub.
