Python integers have unlimited range (bounded only by available memory), whereas C has several integer types with limited ranges. There are several ways to convert a Python integer to a C integer and back:
- Dedicated C API functions like `PyLong_AsLong()` and `PyLong_FromLong()`.
- `PyArg_Parse()` with a corresponding format unit like `'l'`, and `Py_BuildValue()` with a similar format unit.
- `PyMemberDef` with a corresponding type like `Py_T_LONG`.
These sets are not equivalent, especially for unsigned integers.
- Most of the C API functions, except `PyNumber_AsSsize_t()`, have the `PyLong_` prefix. There are usually three variants for conversion to a C integer:
  - `PyLong_AsLong()` converts integers in the range `LONG_MIN` to `LONG_MAX` to signed `long`.
  - `PyLong_AsUnsignedLong()` converts integers in the range 0 to `ULONG_MAX` to `unsigned long`.
  - `PyLong_AsUnsignedLongMask()` accepts arbitrary integers and converts them to `unsigned long` modulo `ULONG_MAX+1`.
- `PyArg_Parse()` has variants of format units for signed and unsigned types. For example, `'l'` works like `PyLong_AsLong()` and `'k'` works like `PyLong_AsUnsignedLongMask()`. There is no variant for `PyLong_AsUnsignedLong()`; the only way to convert to `unsigned long` with a range check is to use a custom converter.
- The `PyMemberDef` API also has variants for signed and unsigned types. `Py_T_LONG` is equivalent to `PyLong_AsLong()`, but `Py_T_ULONG`, which converts to `unsigned long`, is trickier. It accepts Python integers in the range `LONG_MIN` to `ULONG_MAX`. This is larger than the range of `unsigned long`, so negative integers in the range `LONG_MIN` to -1 are converted modulo `ULONG_MAX+1`.
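To make the three range policies easier to compare, here is a minimal plain-C model of them. It does not call the Python C API: `int64_t` stands in for the Python integer and `uint32_t` stands in for `unsigned long`, and the function names are hypothetical, chosen only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins (assumptions for this sketch): int64_t plays the role of the
   Python integer, uint32_t plays the role of the C target type. */

/* Strict policy, modelled on PyLong_AsUnsignedLong(): only 0..UINT32_MAX. */
bool to_u32_strict(int64_t value, uint32_t *out)
{
    if (value < 0 || value > (int64_t)UINT32_MAX) {
        return false;               /* the real API reports an error here */
    }
    *out = (uint32_t)value;
    return true;
}

/* Masking policy, modelled on PyLong_AsUnsignedLongMask(): never fails. */
uint32_t to_u32_mask(int64_t value)
{
    /* Conversion to an unsigned type reduces the value modulo 2^32. */
    return (uint32_t)value;
}

/* Member-style policy, modelled on Py_T_ULONG as described above:
   accepts INT32_MIN..UINT32_MAX and wraps the negative part. */
bool to_u32_member(int64_t value, uint32_t *out)
{
    if (value < (int64_t)INT32_MIN || value > (int64_t)UINT32_MAX) {
        return false;
    }
    *out = (uint32_t)value;         /* negative values wrap modulo 2^32 */
    return true;
}
```

With these definitions, -1 is rejected by the strict policy, while the masking and member-style policies both turn it into 4294967295. A value of 4294967296 is rejected by the strict and member-style policies and silently becomes 0 under masking.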
Why is the API for unsigned types so strange? I think there are several reasons:
- It is not clear whether some types like `uid_t` or `dev_t` are implemented as signed or unsigned types (it varies between OSes).
- Even if some types are unsigned and support values larger than the maximum of the corresponding signed type (like `uid_t` or `dev_t` on some OSes), some negative values can still be used as special markers for an unknown or unavailable value, so you can see `(uid_t)-1` or `(size_t)-1` in C code. It is better to accept the Python integer -1 as a special value than to require 4294967295 or 18446744073709551615.
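The `(uid_t)-1` idiom can be illustrated with a small sketch. The type `example_id_t`, the macro, and the helper below are hypothetical stand-ins for an unsigned ID type. The real `uid_t` may be signed or have a different width, depending on the OS.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an unsigned ID type such as uid_t on some OSes. */
typedef uint32_t example_id_t;

/* The "unknown" marker, spelled the way C code usually writes it. */
#define EXAMPLE_ID_UNKNOWN ((example_id_t)-1)

/* Returns true if the ID is the special "unknown" marker. */
bool example_id_is_unknown(example_id_t id)
{
    return id == EXAMPLE_ID_UNKNOWN;
}
```

For a 32-bit unsigned type the marker equals 4294967295, and for a 64-bit one `(uint64_t)-1` equals 18446744073709551615. These are the values a Python caller would have to spell out if -1 were not accepted.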
There are also differences in support for int-like objects with an `__index__()` method, but that is a separate painful issue.
Because of the differences between these three sets, it is difficult to write code that supports the same range for a function argument as for a value in an attribute setter. It is also difficult to change code from using `PyArg_Parse()` to manual parsing with the C API and vice versa.

How can we unify these APIs? An API like `PyLong_AsUnsignedLongMask()` is the most lenient, but it allows integer overflow errors to go unnoticed. Should we limit its range as in `Py_T_ULONG`? Or maybe limit it even more, allowing only -1 as a negative value? There is a specialized private C API, like `_Py_Uid_Converter()`, which only accepts -1 as a negative value. In some cases any negative value is invalid (when we specify a length, etc.) and all positive values that fit the target type are valid, so the stricter `PyLong_AsUnsignedLong()` has its value too. Should we add corresponding strict codes to `PyArg_Parse()` and `PyMemberDef`?
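As one possible reading of the "allow only -1" option, here is a plain-C sketch of such a policy. It is not the implementation of `_Py_Uid_Converter()` and does not use the Python C API: `int64_t` stands in for the Python integer, `uint32_t` for the target type, and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Accepts 0..UINT32_MAX, plus -1 as the single permitted negative value.
   int64_t is an assumed stand-in for the Python integer. */
bool to_u32_allow_minus_one(int64_t value, uint32_t *out)
{
    if (value == -1) {
        *out = (uint32_t)-1;        /* the sentinel, equal to UINT32_MAX */
        return true;
    }
    if (value < 0 || value > (int64_t)UINT32_MAX) {
        return false;               /* any other out-of-range value is an error */
    }
    *out = (uint32_t)value;
    return true;
}
```

Under this policy -1 and 4294967295 produce the same C value, while -2 and 4294967296 are rejected rather than wrapped.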
I am going to add wrappers for some C structs and need support for types like `uint32_t` and `off_t`, so I need to resolve these questions for the existing types before adding support for new ones.