PEP 641: Using an underscore in the version portion of Python 3.10 compatibility tags

PEP: 641

Title: Using an underscore in the version portion of Python 3.10 compatibility tags

Author: Brett Cannon brett@python.org,

Steve Dower steve.dower@python.org,

Barry Warsaw barry@python.org

BDFL-Delegate:

Discussions-To:

Status: Draft

Type: Standards Track

Content-Type: text/x-rst

Created: 2020-10-20

Python-Version: 3.10

Post-History: 2020-10-20

Resolution:

Abstract

========

Using the tag system outlined in :pep:`425` (primarily used for wheel

file names), each release of Python specifies compatibility tags

(e.g. cp39, py39 for CPython 3.9). For CPython 3.10, this PEP

proposes using 3_10 as the version portion of the tags

(instead of 310).

Motivation

==========

Up to this point, the version portion of compatibility tags used in

e.g. wheel file names has been a straight concatenation of the major

and minor versions of Python, both for the CPython interpreter tag and

the generic, interpreter-agnostic interpreter tag (e.g. cp39 and

py39, respectively). This also applies to the ABI tag

(e.g. cp39). Thanks to both the major and minor versions being

single digits, it has been unambiguous which digit in e.g. 39

represented.

But starting with Python 3.10, ambiguity comes up as 310 does not

clearly delineate whether the Python version is 3.10, 31.0, or

310 as the major-only version of Python. Thus using 3_10 to

separate major/minor portions as allowed by :pep:`425` disambiguates

the Python version being supported.

Rationale

=========

:pep:`425` restricts the characters allowed in a tag, which rules out

any other proposed separator; the only options are 3_10 or 310.

Specification

=============

The SOABI configure variable and

sysconfig.get_config_var('py_version_nodot') will be updated to

use 3_10 appropriately.
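
As a rough illustration of the two candidate spellings, the sketch below builds both from a version pair and then prints what the running interpreter reports. The helper names `nodot_concatenated` and `nodot_underscore` are made up for this example and are not part of CPython or `sysconfig`; the value of `py_version_nodot` depends on the interpreter you run it with.

```python
import sysconfig


def nodot_concatenated(major, minor):
    # Hypothetical helper: the historical spelling, e.g. "39" or "310".
    return f"{major}{minor}"


def nodot_underscore(major, minor):
    # Hypothetical helper: the spelling proposed here, e.g. "3_10".
    return f"{major}_{minor}"


print(nodot_concatenated(3, 10))  # 310
print(nodot_underscore(3, 10))    # 3_10

# Whatever the running interpreter actually reports.
print(sysconfig.get_config_var("py_version_nodot"))
```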

Backwards Compatibility

=======================

Tools relying on the ‘packaging’ project [2]_ already expect a

version specification of 3_10 for Python 3.10. Keeping the version

specifier as 310 would require backing that change out and

updating dependent projects (e.g. pip).

Switching to 3_10 will impact any tools that implicitly rely on

the convention that the minor version is a single digit. However,

these are broken regardless of any change here.

For tools assuming the major version is only the first digit, they

will require updating if we switch to 3_10.

In non-locale ASCII, _ sorts after any digit, so a lexicographic sort

of wheel file names will still match a sort by Python version.
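
A small sketch of the sorting claim, using plain string comparison on bare interpreter tags rather than full wheel file names (an assumption made to keep the example short):

```python
# "_" is 0x5F in ASCII, which is greater than any digit (0x30-0x39).
assert ord("_") > ord("9")

concatenated = ["cp39", "cp310", "cp38"]
underscored = ["cp39", "cp3_10", "cp38"]

# With plain concatenation, 3.10 sorts before 3.8.
print(sorted(concatenated))  # ['cp310', 'cp38', 'cp39']

# With the underscore, 3.10 sorts after 3.9.
print(sorted(underscored))   # ['cp38', 'cp39', 'cp3_10']
```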

Since PEP 515 (Python 3.6), underscores in numeric literals are ignored.

This means that int("3_10") and int("310") produce the same result,

and ordering based on conversion to an integer will be preserved.

However, this is still a bad way to sort tags, and the point is raised

here simply to show that this proposal does not make things worse.
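
The integer conversion point can be checked directly. This is only a sketch of the behaviour described above, restricted to 3.x version strings:

```python
# int() accepts underscores between digits (PEP 515, Python 3.6+).
assert int("3_10") == int("310") == 310

# Sorting 3.x version strings by their integer value keeps 3.10 last.
tags = ["39", "3_10", "38"]
print(sorted(tags, key=int))  # ['38', '39', '3_10']
```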

Security Implications

=====================

There are no known security concerns.

How to Teach This

=================

As use of the interpreter tag is mostly machine-based and this PEP

disambiguates, there should not be any special teaching consideration

required.

Reference Implementation

========================

A pull request [1]_ already exists adding support to CPython 3.10.

Support for reading wheel files with this proposed PEP is already

implemented.

Rejected Ideas

==============

Not making the change

---------------------


Keeping the tag as 310 was considered. The argument was that it is

less work and would not break any existing tooling. In the end, the

disambiguation was judged to be worth having.

Open Issues

===========

How far should we take this?

----------------------------


Other places where the major and minor version are used could be

updated to use an underscore as well (e.g. .pyc files, the import

path to the zip file for the stdlib). It is not known how useful it

would be to make this pervasive.

References

==========

.. [1] Reference implementation

(https://github.com/python/cpython/pull/20333)

.. [2] The ‘packaging’ project

(https://pypi.org/project/packaging/)

Copyright

=========

This document is placed in the public domain or under the

CC0-1.0-Universal license, whichever is more permissive.

Local Variables:

mode: indented-text

indent-tabs-mode: nil

sentence-end-double-space: t

fill-column: 70

coding: utf-8

End:


For reference, the PR is at bpo-40747: Make py_version_nodot 3_10 not 310 (PEP 641)

It updates all(?) instances of 310 that may be ambiguous to either 3.10 or 3_10 as appropriate, and so is more significant than as required for this PEP (which is still more significant than the PR for packaging would be to switch everyone to 310).

The whole PR will be blocked on this PEP, since if we decide to reject it because 310 isn’t too ambiguous then there’s no reason to update all the other instances.

If someone wants to make the case that cp310 is ambiguous but (e.g.) python310.zip is not, then we can consider merging only the packaging tag change without updating everything else.


FYI the SC made Pablo the PEP delegate.

What if we always use exactly two digits for the minor version number? “310”, “401”, “3100”. We are less likely to reach 3.100 (in 2111?) than 31.0.

That’s also an option, but it’s swapping one problem for another. For this suggestion you still have broken code that made a bad assumption, but now you have to make sure you zfill every time. :man_shrugging:

The difference is not large.

'%d%02d' % (major, minor)

vs

'%d_%d' % (major, minor)

And parsing:

major = int(version[:-2])
minor = int(version[-2:])

vs

major, minor = version.split('_')
major = int(major)
minor = int(minor)

Also, the advantage of two digits is that integer ordering stays correct: int('4_1') < int('3_10'), which is the wrong order, but int('310') < int('401').
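
Putting the fragments above together as a runnable sketch. The function names are made up for illustration, and the padded scheme assumes the minor version never exceeds two digits:

```python
def format_padded(major, minor):
    # Two-digit minor, e.g. (3, 10) -> "310", (4, 1) -> "401".
    return "%d%02d" % (major, minor)


def parse_padded(version):
    # The last two characters are the minor version.
    return int(version[:-2]), int(version[-2:])


def format_underscore(major, minor):
    # Underscore separator, e.g. (3, 10) -> "3_10".
    return "%d_%d" % (major, minor)


def parse_underscore(version):
    major, minor = version.split("_")
    return int(major), int(minor)


# Both schemes round-trip.
for v in [(3, 9), (3, 10), (4, 1)]:
    assert parse_padded(format_padded(*v)) == v
    assert parse_underscore(format_underscore(*v)) == v

# Only the padded scheme keeps integer ordering across major versions.
versions = [(3, 10), (4, 1)]
print([int(format_padded(*v)) for v in versions])      # [310, 401]
print([int(format_underscore(*v)) for v in versions])  # [310, 41]
```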

It updates all(?) instances of 310 that may be ambiguous to either 3.10 or 3_10 as appropriate

How do you determine what’s appropriate in which case?

Both the PEP and the PR description only mention changing XY to X_Y, but the PR actually does more.
Should the changes to . be discussed in the PEP, or should there be a separate PEP for them? Should the PEP say that all instances of XY should be replaced, if that is the goal?

Basically by going to a dot everywhere except where normalisation rules apply. “version_nodot” clearly should not have a dot, for example.

I’m not opposed to updating the PEP to specify other changes, but I don’t think it’s necessary. File system layout isn’t specified in a PEP (but it does get locked at the start of beta), while wheel tags are.

However, in a PEP that merely talks about a change of compatibility tags of binary distributions, it seems rather surprising if suddenly other things are changed as well. E.g. if pycache filenames (magic tags, PEP 3147) change or if extension module filenames change (PEP 3149), I believe this should be mentioned explicitly and be rationalized properly, if the plan is to change them.

As a personal take, I don’t think the current rationale justifies the breakage wrt assumptions about pycache/extension modules filenames. It is not clear to me whether that’s even the case, maybe the filenames are to remain unchanged (i.e. continue to use 310)? I don’t think the specification is clear enough about that. One can go through the code and grep it for py_version_nodot, but I’d prefer to see the important schemes that change listed explicitly.

In Fedora, we define and use the %{python3_version_nodots} macro that currently evals to 39/310 (via sys.version_info + format()). I plan to do an analysis to see all the use cases for it and share that here next week.


Here are my statistics about the usage of the %{python3_version_nodots} macro in Fedora packages. Most importantly, it is not used that much, only ~60 packages use it. That means even manual corrections of usage are possible.

Bytecode cache suffix

Most prominent usage is in bytecode cache paths. ~25 packages use something like this:

%{python3_sitearch}/__pycache__/dmidecode.cpython-%{python3_version_nodots}*.pyc

Which will end up being either:

/usr/lib64/python3.10/site-packages/__pycache__/dmidecode.cpython-310*.pyc

Or:

/usr/lib64/python3.10/site-packages/__pycache__/dmidecode.cpython-3_10*.pyc
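
To see which form a given interpreter uses for bytecode cache names, one can ask it directly. A minimal sketch, assuming an interpreter that defines a cache tag (CPython does); `dmidecode.py` is just the file name from the example above and does not need to exist:

```python
import importlib.util
import sys

# The tag embedded in .pyc file names, e.g. "cpython-39".
cache_tag = sys.implementation.cache_tag

# Where the bytecode for dmidecode.py would be cached.
pyc_path = importlib.util.cache_from_source("dmidecode.py")

print(cache_tag)
print(pyc_path)
```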

Extension modules suffix

The second use, in ~15 packages, is extension module paths:

%{python3_sitearch}/dmidecodemod.cpython-%{python3_version_nodots}*.so
/usr/lib64/python3.10/site-packages/dmidecodemod.cpython-310*.so
/usr/lib64/python3.10/site-packages/dmidecodemod.cpython-3_10*.so
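
Similarly, the extension module suffix can be read from the running interpreter instead of being assembled from a version macro. This is only a sketch; the exact strings differ per platform and build:

```python
import importlib.machinery
import sysconfig

# The preferred suffix for compiled extension modules on this build.
ext_suffix = sysconfig.get_config_var("EXT_SUFFIX")

# Every suffix the import system will accept for extension modules.
all_suffixes = importlib.machinery.EXTENSION_SUFFIXES

print(ext_suffix)
print(all_suffixes)
```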

Boost

Then we have Boost. I suspect this might be a Fedora specific patch that our boost installs to %{_libdir}/libboost_python%{python3_version_nodots}.so and %{_libdir}/libboost_numpy%{python3_version_nodots}.so. I will talk to the Boost maintainer about this, however, the paths could either be:

/usr/lib64/libboost_python310.so
/usr/lib64/libboost_numpy310.so

Or:

/usr/lib64/libboost_python3_10.so
/usr/lib64/libboost_numpy3_10.so

I suspect we might need to use a dot instead:

/usr/lib64/libboost_python3.10.so
/usr/lib64/libboost_numpy3.10.so

However, 8 packages would need to be adapted if we change the scheme, because they use things like this:

-D RDK_BOOST_PYTHON3_NAME=python%{python3_version_nodots}
BOOST_PYTHON=boost_python%{python3_version_nodots}
sed -i 's=SET(BOOST_PYTHON_NAMES=& boost_python%{python3_version_nodots}=' ...
export BOOST_PYTHON_LIB=boost_python%{python3_version_nodots}
--with-boost-python=boost_python%{python3_version_nodots}
-DBOOST_PYTHON_LIB_NAME=boost_python%{python3_version_nodots}
sed -i 's|boost_python3|boost_python%{python3_version_nodots}|' setup.py
-DBoostPython_LIBRARIES="%{_python3_lib};%{_libdir}/libboost_python%{python3_version_nodots}.so"

Tox

Then, there is toxenv: 5 packages use something like this:

TOXENV=py%{python3_version_nodots} tox --sitepackages

Which would either be TOXENV=py310 or TOXENV=py3_10. Honestly, as a tox user, I would consider py3_10 very nonintuitive. Tox might however need to support both, if it is indeed decided to approve this PEP as is.

In addition to the 5 packages, we also define a %tox macro that uses py%{python3_version_nodots} behind the scenes. But we can easily adapt the macro in one location.

No idea

Then we have 2 packages. luxcorerender uses:

%cmake ... -DPYTHON_V=%{python3_version_nodots}

This might or might not be used for boost.

vtk has:

%{_libdir}/*Python%{python3_version_nodots}D.so.*
%{_libdir}/*QtPython%{python3_version_nodots}D.so.*
%{_libdir}/mpich/lib/*Python%{python3_version_nodots}D.so.*
%{_libdir}/mpich/lib/*QtPython%{python3_version_nodots}D.so.*
%{_libdir}/openmpi/lib/*QtPython%{python3_version_nodots}D.so.*

I.e.:

/usr/lib64/*Python310D.so.*
/usr/lib64/*QtPython310D.so.*
/usr/lib64/mpich/lib/*Python310D.so.*
/usr/lib64/mpich/lib/*QtPython310D.so.*
/usr/lib64/openmpi/lib/*QtPython310D.so.*

Or:

/usr/lib64/*Python3_10D.so.*
/usr/lib64/*QtPython3_10D.so.*
/usr/lib64/mpich/lib/*Python3_10D.so.*
/usr/lib64/mpich/lib/*QtPython3_10D.so.*
/usr/lib64/openmpi/lib/*QtPython3_10D.so.*

Wrong usage in package names (EPEL only)

Finally, 3 packages try to use the macro in package names, as in:

BuildRequires:  python%{python3_version_nodots}-sphinx

This is wrong, but in all three cases it is in a %if EPEL ≤ 7 section, hence not used in Fedora at all.

Problems not found

I was afraid to see some integer comparisons like this one:

%if %{python3_version_nodots} > 38
...
%endif

But there are currently none.

Conclusion

  • the situation in Fedora packages is far less critical than I was afraid of.
  • I would like to see clarified which of the following will or won’t be affected:
    • bytecode cache suffix,
    • extension modules suffix,
    • toxenv.
  • I’ll talk to the Boost maintainer.

Seems like this naming is coming from mpi/boost upstreams after all.


Brett, could you please include the alternate idea (always use 2 digits for the minor version) in the PEP, with a comparison against the original idea?

I do not insist that this idea is better, but it has some advantages, and we usually include alternate ideas proposed during discussion in PEPs.


I mean, we can’t backdate that idea anyway, so it’s really a 4.x idea (or a 4.xx idea :wink:). At that point, anything can change at all, so there’s no need to reject the idea now. Accepting the idea now is the equivalent of rejecting the whole PEP and continuing as we are.


I just wanted to say explicitly that I consider not breaking the existing tooling and developers’ expectations more important than the disambiguation. I expect more people to be confused that after 37, 38 and 39 we have 3_10, than being confused about whether 310 means 31.0 or 3.10.

I’d suggest changing the tag expectations in wheel/packaging/pip to cp310 instead.

However, either way, I’d like to ask to move this forward before a3, because it is currently very impractical to test anything with the alpha releases of Python 3.10 when we cannot build wheels.


So you basically want to say that going forward all minor versions will be two digits with 0 for padding? I can, but it is more of an involved process change as PEP 425 doesn’t really specify this as legal.

That’s up to @pablogsal accepting the PEP and then merging the requisite PR from @steve.dower in time.


Correction, we do have such usage:

rpm-specs/python-biopython.spec
16:%if 0%{?python3_version_nodots} == 38
252:%if 0%{?python3_version_nodots} == 38

rpm-specs/future.spec
216:%if 0%{?python3_version_nodots} > 37
219:%if 0%{?python3_version_nodots} <= 37

rpm-specs/python-jaraco-packaging.spec
44:%if 0%{?python3_version_nodots} < 38

rpm-specs/petsc4py.spec
414:%if 0%{?python3_version_nodots} > 37
419:%if 0%{?python3_version_nodots} < 38
451:%if 0%{?python3_version_nodots} > 37
456:%if 0%{?python3_version_nodots} < 38

rpm-specs/python-qutepart.spec
81:%if 0%{?python3_version_nodots} > 38

Except I was not grepping for the %{?python3_version_nodots} form.
If we change the macro definition to 3_10, this will break.
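
For comparison, here is how the same kind of check looks when versions are compared as tuples rather than as a single number. This is Python, not RPM macro syntax, and `parse_nodot` is a hypothetical helper that assumes a one-digit major version whenever there is no underscore:

```python
import sys


def parse_nodot(tag):
    """Turn '38', '310' or '3_10' into a (major, minor) tuple."""
    if "_" in tag:
        major, minor = tag.split("_")
    else:
        # Assumption: without a separator, the major version is one digit.
        major, minor = tag[0], tag[1:]
    return int(major), int(minor)


# Tuple comparison gives the same answer for either spelling of 3.10.
print(parse_nodot("310") > parse_nodot("38"))   # True
print(parse_nodot("3_10") > parse_nodot("38"))  # True

# Inside Python itself there is no need to parse a tag at all.
print(sys.version_info[:2] > (3, 8))
```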


https://github.com/python/peps/pull/1715 has the suggestion from @storchaka for a required double digit minor version as an open issue. I would say that I think it would require buy-in from PyPy to be tenable (/cc @mattip).


I don’t expect a double-digit minor version to cause any problems for PyPy.


From the PEP:

[double-digit versions without separator] would also require interpreters that
currently have a single digit minor version – e.g. PyPy 7.3 – to
change from pp73 to pp703 or make the switch from their next
minor release onward (e.g. 7.4 or 8.0). Otherwise this would make this
rule exclusive to the cp interpreter type which would make it more
confusing for people.

Is that true? Is it necessary (for humans or machines) to be able to parse the version out of a wheel tag?
I don’t see the issue with CPython using cp310 while PyPy uses pp73 or pp7_3 and my toy implementation uses slepys3_0x2ef. The only thing the tags need is to be different across versions, so the version scheme can be left to the implementation.
Not that I have anything against other implementations using double-digit versions, or indeed underscores. I just don’t see it as a requirement.
