Error tables in the ssl module

davidben · April 3, 2023, 4:33pm

Hi CPython folks,

If OpenSSL were to have the ERR_reason_symbol_name and ERR_reason_symbol_name APIs described in Getting the symbol name of an error library or reason · Issue #19848 · openssl/openssl · GitHub, would that meet your needs around make_ssl_data.py and friends?

CPython currently has these _ssl_data*.h files in the SSL module that replicate the error tables in OpenSSL. This seems not great. These tables need to duplicated for a couple OpenSSL versions, and you have to update them whenever OpenSSL adds new errors. Moreover, as ABI-compatible OpenSSL releases may add new errors, updating OpenSSL from, say, 3.0 to 3.1 without updating Python will be missing some errors.

From what I can tell, this data is only used to turn ASN1_R_HEADER_TOO_LONG into "HEADER_TOO_LONG". (I’m running into link limits, but grep for err_codes_to_names.) That string is then reported in the reason attribute on the exception object. I filed above issue in OpenSSL. Before just implementing that API, I’d like to confirm this would indeed meet your needs and be an improvement to you all. Of course, any new API will only exist in newer OpenSSLs, but I’m thinking the story can be something like:

Under an OPENSSL_VERSION_NUMBER >= ... ifdef, just call the new APIs and punt all the old bits.
For older OpenSSLs, maintain _ssl_data*.h files with the existing script.

While you’ll still have the _ssl_data*.h files for older OpenSSLs, hopefully you won’t have to update them much anymore, since any error codes introduced in newer OpenSSLs would come with the new API. And it means Pythons compiled against OpenSSL 3.X won’t be missing any new errors added in OpenSSL 3.X+1.

Thoughts?

David

PS: I should explain my motivations here. I work on BoringSSL at Google. We carry some patches to build CPython against BoringSSL. These files are some of heavier patches for us, as a table of every error in OpenSSL and their numeric values is a bit overfitting. I’d like to find a cleaner way to do what you all are trying to here. But, as BoringSSL targets more limited use caes, we don’t like to burden the open source ecosystem with BoringSSL ifdefs. Instead, we usually only upstream patches that we think improve the project’s use of OpenSSL itself. So, while I’m personally looking at this because of BoringSSL, please evaluate this based on what you’d like out of OpenSSL.

(Doing this in BoringSSL wouldn’t even require new APIs anyway. Our ERR_reason_error_string already returns "HEADER_TOO_LONG", instead of "header too long". But I’d like to find a solution that’s useful for you all too.)

steve.dower · April 3, 2023, 6:49pm

Looking at the data file, I’d definitely be in favour of being able to get the same data directly from the library. We technically expose the table as a data structure, which is harder to deal with, but it doesn’t seem to be public API so we can probably just drop them.

We’re light on for ssl module maintainers right now, so I’m not sure how definitive an answer you’ll get. But it’s a +1 from me, and I’m sure someone will do the work to adopt the new API on our side.

alex_Gaynor · April 3, 2023, 9:56pm

As a mostly-absentee maintainer of the SSL module, I’m also in favor of this. While it’s exposed in _ssl, it doesn’t seem to be re-exported from ssl.

David also has gh-100062: Remove error code tables from _ssl and err_names_to_codes by davidben · Pull Request #100063 · python/cpython · GitHub up to un-export it, which I think we should merge (I was hoping one of the more active SSL module maintainers would chime in :-))

steve.dower · April 3, 2023, 10:45pm

Merged now (I had already checked that it isn’t exported from ssl, so I feel it was a responsible merge, despite happening quite quickly)