Add constants to struct module

The struct module has various format characters, such as b for signed char and > for big-endian.
Developers need to use strings like '>bhQII?' to represent the packing/unpacking format.
Such strings are hard to build, error prone, and unreadable:

  • Hard to build: No one remembers all these values by heart. It is very frustrating to open and read the documentation every time one builds such a string.
  • Error prone: It is so easy to confuse and use B instead of b, or > instead of <.
  • Unreadable: Reading the format string is not intuitive at all.

It would be nice to define each character as a constant value, something like this:

...
BIG_ENDIAN = '>'
LITTLE_ENDIAN = '<'
...
SIGNED_CHAR = 'b'
UNSIGNED_CHAR = 'B'
...

This will let us build readable format strings easily:

format_str = f'{struct.BIG_ENDIAN}{struct.UNSIGNED_CHAR}{struct.SIGNED_LONG_LONG}'  # == '>Bq'
#  (To illustrate the point: I checked this against the documentation page and corrected myself 7 times to make sure I got it right)

It is much less error prone, much more readable, and easy to build and maintain, especially with IDE auto-completion.
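A minimal runnable sketch of the idea. The constants are defined locally here because the struct module does not provide these names today; they are hypothetical and only mirror the proposal above.

```python
import struct

# Hypothetical constants: these names are NOT part of the struct module.
BIG_ENDIAN = '>'
UNSIGNED_CHAR = 'B'
SIGNED_LONG_LONG = 'q'

# Build the format string from the named constants.
format_str = f'{BIG_ENDIAN}{UNSIGNED_CHAR}{SIGNED_LONG_LONG}'

# Round-trip two sample values through pack() and unpack().
packed = struct.pack(format_str, 255, -1)
values = struct.unpack(format_str, packed)
```

With a big-endian prefix struct uses standard sizes and no padding, so this format takes 1 + 8 bytes.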

What’s your opinion?

If you have the constants, why not put them into a tuple instead of a string?

While the existing method isn’t particularly readable, I don’t find the proposed alternative any better. Rather than being too short, it’s too long, and anything non-trivial will likely start needing to be split over multiple lines. I see the struct module as a low level tool, and as such, terseness is usually a benefit. Consider regular expressions as an example of something similar in a different context.

Personally, I’d use the existing string with an explanatory comment, or write a “format builder” function specific to my application if I needed to write more than one format string.
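To make the two options concrete, here is a rough sketch. The field names and the build_format helper are made up for illustration; a real application would pick names that match its own data.

```python
import struct

# Option 1: keep the terse string and explain it in a comment.
HEADER_FMT = '>bhQII?'  # big-endian: int8, int16, uint64, uint32, uint32, bool

# Option 2: a small application-specific builder (hypothetical helper).
_CODES = {
    'int8': 'b', 'uint8': 'B',
    'int16': 'h', 'uint16': 'H',
    'int32': 'i', 'uint32': 'I',
    'int64': 'q', 'uint64': 'Q',
    'bool': '?',
}

def build_format(*fields, byte_order='>'):
    """Join the format characters for the named fields after the byte-order prefix."""
    return byte_order + ''.join(_CODES[name] for name in fields)

built = build_format('int8', 'int16', 'uint64', 'uint32', 'uint32', 'bool')
```

Both spellings produce the same format string, so the choice is purely about how the code reads at the call site.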

Are you sure about the internal implementation of pack() / unpack()?
If they just iterate over the format, using a tuple sounds good too.

I had the opportunity to use the struct library last week so I feel the OP’s pain. Your comment is exactly what I was thinking though, this sounds like a job for a format builder library. PyPI is a great place for such libraries to prove themselves before the standard library is considered.

I don’t know, but you are already proposing a change. Converting a tuple of constants (should be Enum values, I suppose) to a string at some step would be trivial.
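Something like the following is what I have in mind. It is only a sketch: the Fmt enum and the to_format helper are invented names, not anything struct offers.

```python
import struct
from enum import Enum

class Fmt(Enum):
    # Hypothetical enum; the values are the real struct format characters.
    BIG_ENDIAN = '>'
    UNSIGNED_CHAR = 'B'
    SIGNED_LONG_LONG = 'q'

def to_format(parts):
    """Convert a tuple of Fmt members into a struct format string."""
    return ''.join(part.value for part in parts)

fmt = to_format((Fmt.BIG_ENDIAN, Fmt.UNSIGNED_CHAR, Fmt.SIGNED_LONG_LONG))
packed = struct.pack(fmt, 1, 2)
```

The conversion happens once, before pack() or unpack() is called, so nothing inside struct itself would have to change.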

Personally I think the existing format is perfectly fine. The struct module isn’t exactly something where you can get away without carefully reading the documentation, anyway.

To make struct more readable, should we think along the lines of ctypes?

I use the struct module a lot at my job. In my experience, the pain you feel when starting with struct goes away in a couple of weeks.
Also, as Paul Moore already mentioned, I build almost all of my formats with some sort of format builder.

If you need something more expressive, you might consider using the Construct package.

Instead of making struct more like ctypes, you can just use ctypes.

The ctypes module is perfectly usable for applications that do not involve loading DLLs or calling C functions, e.g. speaking some binary protocol over a socket.
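As a rough illustration, here is a made-up 8-byte message header described with ctypes alone. The Header class and its field names are hypothetical. The fields are chosen so that no padding is needed, since ctypes structures follow native alignment rules, unlike struct with a > prefix.

```python
import ctypes
import struct

class Header(ctypes.BigEndianStructure):
    # Hypothetical wire header: two 16-bit fields followed by one 32-bit field.
    _fields_ = [
        ('kind', ctypes.c_uint16),
        ('flags', ctypes.c_uint16),
        ('length', ctypes.c_uint32),
    ]

# Serialize: a ctypes structure exposes its memory through the buffer protocol.
raw = bytes(Header(kind=1, flags=2, length=3))

# Deserialize: copy the received bytes into a new structure instance.
decoded = Header.from_buffer_copy(raw)

# The same bytes expressed as a struct format string, for comparison.
same_as_struct = struct.pack('>HHI', 1, 2, 3)
```

The fields are then available by name (decoded.kind, decoded.length) instead of by position in an unpacked tuple.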

And with my proposed annotation-based sugar (see adjacent thread) the syntax is much nicer.