Finishing the Great Renaming

Long ago in Python 1, there was a “Great Renaming” – adding Py prefixes to all names in the C API, so that the names don’t clash with other code, especially if Python is embedded.
Guido’s 2009 blog post about it has aged very well; read it for more details and rationale (and a nice plan to maintain backwards-compatibility).

But the renaming wasn’t done for the whole API: many macros, typedefs and struct tags are still unprefixed. Sometimes new ones are added, either by mistake or for consistency with existing ones (see #118771 for recent ones).
Sometimes a clash is found in real-world code, reported and fixed (e.g. #118207); but I worry that some users will silently work around clashes rather than report them, especially if they’re slow to adopt a new Python version.

I’d like to have some plan about what to do here.
Obviously, we need to be very careful about backwards compatibility; I expect we’ll want to document intentional exceptions, and avoid new ones, more than remove existing API.

I have some opinions, but I’d be interested in brainstorming and hearing what you think :)

Here are some things we could do:

  • Add checks to avoid adding more unprefixed names (extending make smelly to macros & types)
  • Document our naming convention, and any intentional exceptions to it
  • Provide prefixed alternatives using e.g. typedef or #define, and encourage linters & foreign-language wrappers to use those
  • Add an opt-in that disables the unprefixed names, so people writing new code don’t need to worry about clashes (for example, PEP 743)
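To make the last two ideas concrete, here is a minimal sketch of how a prefixed alternative and an opt-in could fit together. Everything in it is an assumption for illustration: `PyDemo_visitproc` and `PYDEMO_NO_UNPREFIXED_NAMES` are made-up names, the "header" is a stand-in written inline, and `void *` replaces the real object type. It is not how CPython or PEP 743 spells any of this.

```c
#include <assert.h>
#include <stddef.h>

/* User code opts in before "including" the header.
   PYDEMO_NO_UNPREFIXED_NAMES is a hypothetical macro for this sketch. */
#define PYDEMO_NO_UNPREFIXED_NAMES 1

/* ---- stand-in for a Python header (not real CPython code) ---- */
/* The prefixed name is the primary definition. */
typedef int (*PyDemo_visitproc)(void *object, void *arg);

/* The unprefixed alias stays available unless the user opted out. */
#ifndef PYDEMO_NO_UNPREFIXED_NAMES
typedef PyDemo_visitproc visitproc;
#endif
/* ---- end of stand-in header ---- */

/* User code: an unrelated type that happens to be called `visitproc`.
   Without the opt-in above, this typedef would fail to compile. */
typedef struct { int calls; } visitproc;

/* A callback with the signature the prefixed typedef expects. */
int count_visit(void *object, void *arg)
{
    (void)object;
    assert(arg != NULL);
    ((visitproc *)arg)->calls++;
    return 0;
}

/* Calls the callback twice through the prefixed typedef and returns
   how many calls were recorded. */
int demo_visit_twice(void)
{
    visitproc state = {0};
    PyDemo_visitproc callback = count_visit;
    callback(NULL, &state);
    callback(NULL, &state);
    return state.calls;
}
```

Existing code that never defines the opt-in macro keeps seeing the old name, so nothing breaks for it; only code that asks for the clean namespace loses the alias.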

And here’s the list of unprefixed names I found.
We’ll probably need different strategies for different cases:

Typedefs for function pointers:

getter, setter and destructor are probably the most generic
names Python defines.

getter (descrobject.h)
setter (descrobject.h)
getbufferproc (pybuffer.h)
wrapperfunc (cpython/descrobject.h)
wrapperfunc_kwds (cpython/descrobject.h)
gcvisitobjects_t (objimpl.h)
atexit_datacallbackfunc (cpython/pylifecycle.h)
sendfunc (cpython/object.h)

Most function typedefs are for type slots in object.h:

allocfunc
binaryfunc
descrgetfunc
descrsetfunc
destructor
freefunc
getattrfunc
getattrofunc
getiterfunc
hashfunc
initproc
inquiry
iternextfunc
lenfunc
newfunc
objobjargproc
objobjproc
printfunc
releasebufferproc
reprfunc
richcmpfunc
setattrfunc
setattrofunc
ssizeargfunc
ssizeobjargproc
ssizessizeargfunc
ssizessizeobjargproc
ternaryfunc
traverseproc
unaryfunc
vectorcallfunc
visitproc

Struct tags & typedefs

Various structs:

setentry (cpython/setobject.h)
struct setentry (cpython/setobject.h)
struct wrapperbase (cpython/descrobject.h)
PerfMapState (sysmodule.h)
struct PerfMapState (sysmodule.h)

Simple typedefs:

digit (cpython/longintrepr.h)
sdigit (cpython/longintrepr.h)
twodigits (cpython/longintrepr.h)
stwodigits (cpython/longintrepr.h)

Names starting with _py:

_py_make_codeunit (cpython/code.h)
_py_set_opcode (cpython/code.h)
struct _pycontextobject (cpython/context.h)
struct _pycontexttokenobject (cpython/context.h)
struct _pycontextvarobject (cpython/context.h)

Other underscore-prefixed struct tags:

struct _dictvalues (cpython/dictobject.h)
struct _err_stackitem (cpython/pystate.h)
struct _frame (pytypedefs.h)
struct _frozen (cpython/import.h)
struct _heaptypeobject (cpython/object.h)
struct _inittab (cpython/import.h)
struct _is (pytypedefs.h)
struct _line_offsets (cpython/code.h)
struct _longobject (cpython/longintrepr.h)
struct _object (object.h)
struct _odictobject (cpython/odictobject.h)
struct _opaque (cpython/code.h)
struct _specialization_cache (cpython/object.h)
struct _stack_chunk (cpython/pystate.h)
struct _traceback (cpython/traceback.h)
struct _ts (cpython/pystate.h)
struct _typeobject (cpython/object.h)

Undefined macro

This macro is #undef’ed after use. It should be safe to add the _Py prefix right now; I’ll do that if there are no objections.

NATIVE_TSS_KEY_T (cpython/pythread.h)
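For readers unfamiliar with the pattern, here is a small sketch of a header-local helper macro that is defined, used once, and then removed with `#undef`. The names (`DEMO_PRIV_NATIVE_KEY_T`, `struct demo_tss`, `NATIVE_KEY_T`) are invented for the example and the "header" is written inline; this is not the actual content of pythread.h.

```c
#include <assert.h>
#include <stddef.h>

/* Application code that already uses a short, generic macro name. */
#define NATIVE_KEY_T int

/* ---- stand-in for a library header (hypothetical) ---- */
/* The helper carries a prefix, so it cannot collide with NATIVE_KEY_T. */
#define DEMO_PRIV_NATIVE_KEY_T unsigned long
struct demo_tss {
    int is_initialized;
    DEMO_PRIV_NATIVE_KEY_T key;
};
#undef DEMO_PRIV_NATIVE_KEY_T
/* ---- end of stand-in header ---- */

/* The helper is gone once the header is done with it. */
#ifdef DEMO_PRIV_NATIVE_KEY_T
#  error "helper macro leaked out of the header"
#endif

/* The struct field still has the type the helper expanded to. */
static_assert(sizeof(((struct demo_tss *)0)->key) == sizeof(unsigned long),
              "key has the type chosen by the helper macro");

/* The application's own macro is still intact after the header. */
NATIVE_KEY_T demo_app_key(void)
{
    return 7;
}

/* Size of the key field, for callers that want to inspect it. */
size_t demo_key_size(void)
{
    struct demo_tss t = {0, 0};
    return sizeof t.key;
}
```

Had the header used the unprefixed name instead, its `#define` would have conflicted with the application's definition and the `#undef` would then have removed it. Since nothing outside the header can rely on a macro that no longer exists after inclusion, renaming it should not affect users.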

Macro families with their own prefixes

CO_ASYNC_GENERATOR (cpython/code.h)
CO_COROUTINE (cpython/code.h)
CO_FUTURE_ABSOLUTE_IMPORT (cpython/code.h)
CO_FUTURE_ANNOTATIONS (cpython/code.h)
CO_FUTURE_BARRY_AS_BDFL (cpython/code.h)
CO_FUTURE_DIVISION (cpython/code.h)
CO_FUTURE_GENERATOR_STOP (cpython/code.h)
CO_FUTURE_PRINT_FUNCTION (cpython/code.h)
CO_FUTURE_UNICODE_LITERALS (cpython/code.h)
CO_FUTURE_WITH_STATEMENT (cpython/code.h)
CO_GENERATOR (cpython/code.h)
CO_ITERABLE_COROUTINE (cpython/code.h)
CO_MAXBLOCKS (cpython/code.h)
CO_NESTED (cpython/code.h)
CO_NEWLOCALS (cpython/code.h)
CO_NO_MONITORING_EVENTS (new in 3.13)
CO_OPTIMIZED (cpython/code.h)
CO_VARARGS (cpython/code.h)
CO_VARKEYWORDS (cpython/code.h)

FUTURE_ABSOLUTE_IMPORT (cpython/compile.h)
FUTURE_ANNOTATIONS (cpython/compile.h)
FUTURE_BARRY_AS_BDFL (cpython/compile.h)
FUTURE_DIVISION (cpython/compile.h)
FUTURE_GENERATORS (cpython/compile.h)
FUTURE_GENERATOR_STOP (cpython/compile.h)
FUTURE_NESTED_SCOPES (cpython/compile.h)
FUTURE_PRINT_FUNCTION (cpython/compile.h)
FUTURE_UNICODE_LITERALS (cpython/compile.h)
FUTURE_WITH_STATEMENT (cpython/compile.h)

FVC_ASCII (ceval.h)
FVC_MASK (ceval.h)
FVC_NONE (ceval.h)
FVC_REPR (ceval.h)
FVC_STR (ceval.h)
FVS_HAVE_SPEC (ceval.h)
FVS_MASK (ceval.h)

METH_CLASS
METH_COEXIST
METH_FASTCALL
METH_KEYWORDS
METH_METHOD
METH_NOARGS
METH_O
METH_STACKLESS
METH_STATIC
METH_VARARGS

SSTATE_INTERNED_IMMORTAL (cpython/unicodeobject.h)
SSTATE_INTERNED_IMMORTAL_STATIC (cpython/unicodeobject.h)
SSTATE_INTERNED_MORTAL (cpython/unicodeobject.h)
SSTATE_NOT_INTERNED (cpython/unicodeobject.h)

Assorted value macros

MAX_CO_EXTRA_USERS (pystate.h)
TYPE_MAX_WATCHERS (cpython/object.h)
WAIT_LOCK (pythread.h)
NOWAIT_LOCK (pythread.h)

Various configure macros (pyconfig.h, pyport.h)

On my system, I get several free-form macros, some of which are unused:

DOUBLE_IS_LITTLE_ENDIAN_IEEE754
ENABLE_IPV6
MAJOR_IN_SYSMACROS
MVWDELCH_IS_EXPRESSION
PTHREAD_KEY_T_IS_COMPATIBLE_WITH_INT
RETSIGTYPE
SIGNED_RIGHT_SHIFT_ZERO_FILLS
STDC_HEADERS
SYS_SELECT_WITH_SYS_TIME

There are many ALIGNOF_* and SIZEOF_* macros; some of these are unused in CPython,
others (but not all) could be replaced by C11 features:

ALIGNOF_LONG
ALIGNOF_MAX_ALIGN_T
ALIGNOF_SIZE_T
SIZEOF_DOUBLE
SIZEOF_FLOAT
SIZEOF_FPOS_T
SIZEOF_INT
SIZEOF_LONG
SIZEOF_LONG_DOUBLE
SIZEOF_LONG_LONG
SIZEOF_OFF_T
SIZEOF_PID_T
SIZEOF_PTHREAD_KEY_T
SIZEOF_PTHREAD_T
SIZEOF_PY_HASH_T
SIZEOF_PY_UHASH_T
SIZEOF_SHORT
SIZEOF_SIZE_T
SIZEOF_TIME_T
SIZEOF_UINTPTR_T
SIZEOF_VOID_P
SIZEOF_WCHAR_T
SIZEOF__BOOL
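As a rough illustration of the "but not all" caveat: in ordinary C11 code, `sizeof`, `alignof` and `static_assert` cover what many of these macros provide, but the preprocessor cannot evaluate `sizeof`, so `#if` conditions still need a macro from somewhere (here `<limits.h>`). The `demo_` and `DEMO_` names are made up for this sketch.

```c
#include <assert.h>     /* static_assert (C11) */
#include <limits.h>     /* CHAR_BIT, LONG_MAX */
#include <stdalign.h>   /* alignof (C11) */
#include <stddef.h>     /* size_t */

/* Compile-time checks in C code need no configure-generated macro. */
static_assert(sizeof(long) * CHAR_BIT >= 32,
              "long must hold at least 32 bits");

/* sizeof and alignof are usable wherever a constant expression is. */
size_t demo_long_size(void)
{
    return sizeof(long);
}

size_t demo_long_alignment(void)
{
    return alignof(long);
}

/* The preprocessor cannot evaluate sizeof, so a conditional like
   "#if SIZEOF_LONG > 4" has no direct C11 replacement. A standard
   limit macro can sometimes stand in for it. */
#if LONG_MAX > 0x7FFFFFFFL
#  define DEMO_LONG_IS_WIDE 1
#else
#  define DEMO_LONG_IS_WIDE 0
#endif
```

Types without a corresponding limit macro in the standard headers (for example `pid_t` or `pthread_t`) are the ones where a configure-time SIZEOF_* value is hardest to replace in `#if` conditions.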

Several WITH_* used for Python features:

WITH_DECIMAL_CONTEXTVAR
WITH_DOC_STRINGS
WITH_FREELISTS
WITH_PYMALLOC
WITH_THREAD

and around 400 HAVE_* macros:

HAVE_ACCEPT
HAVE_ACCEPT4
HAVE_ACOSH
HAVE_ADDRINFO
HAVE_ALARM
HAVE_ALLOCA_H
HAVE_ASINH
HAVE_ASM_TYPES_H
HAVE_ATANH
HAVE_BIND
HAVE_BIND_TEXTDOMAIN_CODESET
HAVE_BLUETOOTH_BLUETOOTH_H
HAVE_BUILTIN_ATOMIC
HAVE_CHMOD
HAVE_CHOWN
HAVE_CHROOT
HAVE_CLOCK
HAVE_CLOCK_GETRES
HAVE_CLOCK_GETTIME
HAVE_CLOCK_NANOSLEEP
HAVE_CLOCK_SETTIME
HAVE_CLOSE_RANGE
HAVE_COMPUTED_GOTOS
HAVE_CONFSTR
HAVE_CONNECT
HAVE_COPY_FILE_RANGE
HAVE_CTERMID
HAVE_CURSES_FILTER
HAVE_CURSES_H
HAVE_CURSES_HAS_KEY
HAVE_CURSES_IMMEDOK
HAVE_CURSES_IS_PAD
HAVE_CURSES_IS_TERM_RESIZED
HAVE_CURSES_RESIZETERM
HAVE_CURSES_RESIZE_TERM
HAVE_CURSES_SYNCOK
HAVE_CURSES_TYPEAHEAD
HAVE_CURSES_USE_ENV
HAVE_CURSES_WCHGAT
HAVE_DECL_RTLD_DEEPBIND
HAVE_DECL_RTLD_GLOBAL
HAVE_DECL_RTLD_LAZY
HAVE_DECL_RTLD_LOCAL
HAVE_DECL_RTLD_MEMBER
HAVE_DECL_RTLD_NODELETE
HAVE_DECL_RTLD_NOLOAD
HAVE_DECL_RTLD_NOW
HAVE_DEVICE_MACROS
HAVE_DEV_PTMX
HAVE_DIRENT_D_TYPE
HAVE_DIRENT_H
HAVE_DIRFD
HAVE_DLFCN_H
HAVE_DLOPEN
HAVE_DUP
HAVE_DUP2
HAVE_DUP3
HAVE_DYNAMIC_LOADING
HAVE_ENDIAN_H
HAVE_EPOLL
HAVE_EPOLL_CREATE1
HAVE_ERF
HAVE_ERFC
HAVE_ERRNO_H
HAVE_EVENTFD
HAVE_EXECV
HAVE_EXPLICIT_BZERO
HAVE_EXPM1
HAVE_FACCESSAT
HAVE_FCHDIR
HAVE_FCHMOD
HAVE_FCHMODAT
HAVE_FCHOWN
HAVE_FCHOWNAT
HAVE_FCNTL_H
HAVE_FDATASYNC
HAVE_FDOPENDIR
HAVE_FEXECVE
HAVE_FFI_CLOSURE_ALLOC
HAVE_FFI_PREP_CIF_VAR
HAVE_FFI_PREP_CLOSURE_LOC
HAVE_FLOCK
HAVE_FORK
HAVE_FORKPTY
HAVE_FPATHCONF
HAVE_FSEEKO
HAVE_FSTATAT
HAVE_FSTATVFS
HAVE_FSYNC
HAVE_FTELLO
HAVE_FTIME
HAVE_FTRUNCATE
HAVE_FUTIMENS
HAVE_FUTIMES
HAVE_FUTIMESAT
HAVE_GAI_STRERROR
HAVE_GCC_ASM_FOR_X64
HAVE_GCC_ASM_FOR_X87
HAVE_GCC_UINT128_T
HAVE_GDBM_H
HAVE_GDBM_NDBM_H
HAVE_GETADDRINFO
HAVE_GETC_UNLOCKED
HAVE_GETEGID
HAVE_GETENTROPY
HAVE_GETEUID
HAVE_GETGID
HAVE_GETGRGID
HAVE_GETGRGID_R
HAVE_GETGRNAM_R
HAVE_GETGROUPLIST
HAVE_GETGROUPS
HAVE_GETHOSTBYADDR
HAVE_GETHOSTBYNAME
HAVE_GETHOSTBYNAME_R
HAVE_GETHOSTBYNAME_R_6_ARG
HAVE_GETHOSTNAME
HAVE_GETITIMER
HAVE_GETLOADAVG
HAVE_GETLOGIN
HAVE_GETNAMEINFO
HAVE_GETPAGESIZE
HAVE_GETPEERNAME
HAVE_GETPGID
HAVE_GETPGRP
HAVE_GETPID
HAVE_GETPPID
HAVE_GETPRIORITY
HAVE_GETPROTOBYNAME
HAVE_GETPWENT
HAVE_GETPWNAM_R
HAVE_GETPWUID
HAVE_GETPWUID_R
HAVE_GETRANDOM
HAVE_GETRANDOM_SYSCALL
HAVE_GETRESGID
HAVE_GETRESUID
HAVE_GETRUSAGE
HAVE_GETSERVBYNAME
HAVE_GETSERVBYPORT
HAVE_GETSID
HAVE_GETSOCKNAME
HAVE_GETSPENT
HAVE_GETSPNAM
HAVE_GETUID
HAVE_GETWD
HAVE_GRP_H
HAVE_HSTRERROR
HAVE_HTOLE64
HAVE_IF_NAMEINDEX
HAVE_INET_ATON
HAVE_INET_NTOA
HAVE_INET_PTON
HAVE_INITGROUPS
HAVE_INTTYPES_H
HAVE_KILL
HAVE_KILLPG
HAVE_LANGINFO_H
HAVE_LCHOWN
HAVE_LIBB2
HAVE_LIBDL
HAVE_LIBINTL_H
HAVE_LIBSQLITE3
HAVE_LINK
HAVE_LINKAT
HAVE_LINUX_AUXVEC_H
HAVE_LINUX_CAN_BCM_H
HAVE_LINUX_CAN_H
HAVE_LINUX_CAN_J1939_H
HAVE_LINUX_CAN_RAW_FD_FRAMES
HAVE_LINUX_CAN_RAW_H
HAVE_LINUX_CAN_RAW_JOIN_FILTERS
HAVE_LINUX_FS_H
HAVE_LINUX_LIMITS_H
HAVE_LINUX_MEMFD_H
HAVE_LINUX_NETLINK_H
HAVE_LINUX_QRTR_H
HAVE_LINUX_RANDOM_H
HAVE_LINUX_SOUNDCARD_H
HAVE_LINUX_TIPC_H
HAVE_LINUX_VM_SOCKETS_H
HAVE_LINUX_WAIT_H
HAVE_LISTEN
HAVE_LOCKF
HAVE_LOG1P
HAVE_LOG2
HAVE_LOGIN_TTY
HAVE_LONG_DOUBLE
HAVE_LONG_LONG
HAVE_LSTAT
HAVE_LUTIMES
HAVE_MADVISE
HAVE_MAKEDEV
HAVE_MBRTOWC
HAVE_MEMFD_CREATE
HAVE_MEMRCHR
HAVE_MKDIRAT
HAVE_MKFIFO
HAVE_MKFIFOAT
HAVE_MKNOD
HAVE_MKNODAT
HAVE_MKTIME
HAVE_MMAP
HAVE_MREMAP
HAVE_NANOSLEEP
HAVE_NCURSESW
HAVE_NCURSES_H
HAVE_NDBM_H
HAVE_NETDB_H
HAVE_NETINET_IN_H
HAVE_NETPACKET_PACKET_H
HAVE_NET_ETHERNET_H
HAVE_NET_IF_H
HAVE_NICE
HAVE_OPENAT
HAVE_OPENDIR
HAVE_OPENPTY
HAVE_PANEL_H
HAVE_PATHCONF
HAVE_PAUSE
HAVE_PIPE
HAVE_PIPE2
HAVE_POLL
HAVE_POLL_H
HAVE_POSIX_FADVISE
HAVE_POSIX_FALLOCATE
HAVE_POSIX_SPAWN
HAVE_POSIX_SPAWNP
HAVE_PREAD
HAVE_PREADV
HAVE_PREADV2
HAVE_PRLIMIT
HAVE_PROTOTYPES
HAVE_PTHREAD_CONDATTR_SETCLOCK
HAVE_PTHREAD_GETCPUCLOCKID
HAVE_PTHREAD_H
HAVE_PTHREAD_KILL
HAVE_PTHREAD_SIGMASK
HAVE_PTY_H
HAVE_PWRITE
HAVE_PWRITEV
HAVE_PWRITEV2
HAVE_READLINK
HAVE_READLINKAT
HAVE_READV
HAVE_REALPATH
HAVE_RECVFROM
HAVE_RENAMEAT
HAVE_RL_APPEND_HISTORY
HAVE_RL_CATCH_SIGNAL
HAVE_RL_COMPDISP_FUNC_T
HAVE_RL_COMPLETION_APPEND_CHARACTER
HAVE_RL_COMPLETION_DISPLAY_MATCHES_HOOK
HAVE_RL_COMPLETION_MATCHES
HAVE_RL_COMPLETION_SUPPRESS_APPEND
HAVE_RL_PRE_INPUT_HOOK
HAVE_RL_RESIZE_TERMINAL
HAVE_SCHED_GET_PRIORITY_MAX
HAVE_SCHED_H
HAVE_SCHED_RR_GET_INTERVAL
HAVE_SCHED_SETAFFINITY
HAVE_SCHED_SETPARAM
HAVE_SCHED_SETSCHEDULER
HAVE_SEM_CLOCKWAIT
HAVE_SEM_GETVALUE
HAVE_SEM_OPEN
HAVE_SEM_TIMEDWAIT
HAVE_SEM_UNLINK
HAVE_SENDFILE
HAVE_SENDTO
HAVE_SETEGID
HAVE_SETEUID
HAVE_SETGID
HAVE_SETGROUPS
HAVE_SETHOSTNAME
HAVE_SETITIMER
HAVE_SETJMP_H
HAVE_SETLOCALE
HAVE_SETNS
HAVE_SETPGID
HAVE_SETPGRP
HAVE_SETPRIORITY
HAVE_SETREGID
HAVE_SETRESGID
HAVE_SETRESUID
HAVE_SETREUID
HAVE_SETSID
HAVE_SETSOCKOPT
HAVE_SETUID
HAVE_SETVBUF
HAVE_SHADOW_H
HAVE_SHM_OPEN
HAVE_SHM_UNLINK
HAVE_SHUTDOWN
HAVE_SIGACTION
HAVE_SIGALTSTACK
HAVE_SIGFILLSET
HAVE_SIGINFO_T_SI_BAND
HAVE_SIGINTERRUPT
HAVE_SIGNAL_H
HAVE_SIGPENDING
HAVE_SIGRELSE
HAVE_SIGTIMEDWAIT
HAVE_SIGWAIT
HAVE_SIGWAITINFO
HAVE_SNPRINTF
HAVE_SOCKADDR_ALG
HAVE_SOCKADDR_STORAGE
HAVE_SOCKET
HAVE_SOCKETPAIR
HAVE_SPAWN_H
HAVE_SPLICE
HAVE_SSIZE_T
HAVE_STATVFS
HAVE_STAT_TV_NSEC
HAVE_STDINT_H
HAVE_STDIO_H
HAVE_STDLIB_H
HAVE_STD_ATOMIC
HAVE_STRFTIME
HAVE_STRINGS_H
HAVE_STRING_H
HAVE_STRLCPY
HAVE_STRSIGNAL
HAVE_STRUCT_PASSWD_PW_GECOS
HAVE_STRUCT_PASSWD_PW_PASSWD
HAVE_STRUCT_STAT_ST_BLKSIZE
HAVE_STRUCT_STAT_ST_BLOCKS
HAVE_STRUCT_STAT_ST_RDEV
HAVE_STRUCT_TM_TM_ZONE
HAVE_SYMLINK
HAVE_SYMLINKAT
HAVE_SYNC
HAVE_SYSCONF
HAVE_SYSEXITS_H
HAVE_SYSLOG_H
HAVE_SYSTEM
HAVE_SYS_AUXV_H
HAVE_SYS_EPOLL_H
HAVE_SYS_EVENTFD_H
HAVE_SYS_FILE_H
HAVE_SYS_IOCTL_H
HAVE_SYS_MMAN_H
HAVE_SYS_PARAM_H
HAVE_SYS_POLL_H
HAVE_SYS_RANDOM_H
HAVE_SYS_RESOURCE_H
HAVE_SYS_SELECT_H
HAVE_SYS_SENDFILE_H
HAVE_SYS_SOCKET_H
HAVE_SYS_SOUNDCARD_H
HAVE_SYS_STATVFS_H
HAVE_SYS_STAT_H
HAVE_SYS_SYSCALL_H
HAVE_SYS_SYSMACROS_H
HAVE_SYS_TIMES_H
HAVE_SYS_TIME_H
HAVE_SYS_TYPES_H
HAVE_SYS_UIO_H
HAVE_SYS_UN_H
HAVE_SYS_UTSNAME_H
HAVE_SYS_WAIT_H
HAVE_SYS_XATTR_H
HAVE_TCGETPGRP
HAVE_TCSETPGRP
HAVE_TEMPNAM
HAVE_TERMIOS_H
HAVE_TERM_H
HAVE_TIMEGM
HAVE_TIMES
HAVE_TMPFILE
HAVE_TMPNAM
HAVE_TMPNAM_R
HAVE_TM_ZONE
HAVE_TRUNCATE
HAVE_TTYNAME
HAVE_UMASK
HAVE_UNAME
HAVE_UNISTD_H
HAVE_UNLINKAT
HAVE_UNSHARE
HAVE_USABLE_WCHAR_T
HAVE_UTIMENSAT
HAVE_UTIMES
HAVE_UTIME_H
HAVE_UTMP_H
HAVE_UUID_GENERATE_TIME_SAFE
HAVE_UUID_H
HAVE_VFORK
HAVE_WAIT
HAVE_WAIT3
HAVE_WAIT4
HAVE_WAITID
HAVE_WAITPID
HAVE_WCHAR_H
HAVE_WCSCOLL
HAVE_WCSFTIME
HAVE_WCSXFRM
HAVE_WMEMCMP
HAVE_WORKING_TZSET
HAVE_WRITEV
HAVE_ZLIB_COPY

It sounds like a good plan. I suggest starting by adding Py names and keeping the old names as aliases for them. Later we can decide how to deal with the old aliases.

I agree that we need to do something here. I’d also like to make sure that developers understand why we add Py prefixes (or _Py – where I write Py below I include this). Without such understanding we will see unnecessary Py prefixes which could confuse readers of the source code.

Python’s original grand renaming was almost exclusively concerned with link-level clashes, since those are virtually impossible to work around when they exist. If two unrelated libraries both define a global function named getitem there’s no (portable) way to use those libraries in the same program, even if they are used by different parts of the program, even if they are technically internal to each library.

The grand renaming didn’t concern itself much with names that only exist at compile time – notably macros, typedefs and struct tags, as Petr mentions – because the conditions required for those to cause a conflict only exist if a Python header file is included by some file that also includes a header file belonging to some other library or framework. Of course, this is much more likely today than it was in 1995, so it’s more than time to do something about it.

I still worry a bit about the requirement for Py prefixes being mistakenly applied in contexts where it isn’t helpful. We already have a smattering of Py-prefixed static functions in C files. I find those confusing because, as a reader encountering one of their uses, I am led to believe that the definition might live elsewhere (as might other uses).
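A small sketch of the distinction, with invented names (`clamp_index`, `PyDemo_GetItem`) that are not part of any real API: a `static` helper has internal linkage and cannot clash at link time, so a plain name tells the reader it is local to the file, while the exported function carries the prefix.

```c
#include <assert.h>

/* File-local helper: `static` gives it internal linkage, so no other
   translation unit can see it or clash with it. No prefix needed. */
static int clamp_index(int index, int length)
{
    if (index < 0) {
        return 0;
    }
    if (index >= length) {
        return length - 1;
    }
    return index;
}

/* Exported function: external linkage, so the name is visible to the
   linker and gets a prefix. `PyDemo_GetItem` is a hypothetical name. */
int PyDemo_GetItem(const int *items, int length, int index)
{
    assert(items != 0 && length > 0);
    return items[clamp_index(index, length)];
}
```

With this split, a reader who sees `clamp_index(...)` can assume the definition is in the same file, and one who sees a prefixed call knows to look in a header.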

There are also some surprising issues around non-linker symbols defined in internal headers. It appears those sometimes end up being included by applications or libraries/frameworks that prefer to break the rules in order to get performance (or sometimes just access to internal semantics), and there they occasionally conflict with symbols defined elsewhere. I prefer that we deal with those on a case-by-case basis rather than insisting that every internal typedef (etc.) must also have a Py prefix.

But apart from those concerns, I agree we should do this!
