Hello Python Discourse!
I have a question regarding the creation of text email messages that don’t end with a line terminator (linesep) using the content manager interface. I hope this is the right place to ask since I am not sure if this is a feature or an issue…
As an example, say I want to encode this string:
$ python3.12
Python 3.12.3 (main, Apr 9 2024, 08:09:14) [Clang 15.0.0 (clang-1500.3.9.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> test_body = 'hello\n\nNo EOL here'
With the legacy interface, I can do this:
>>> import email.message, email.charset
>>> UTF8_QP = email.charset.Charset('utf-8')
>>> UTF8_QP.body_encoding = email.charset.QP
>>> msg = email.message.EmailMessage()
>>> msg.set_payload(test_body, UTF8_QP)
>>> str(msg)
'MIME-Version: 1.0\nContent-Type: text/plain; charset="utf-8"\nContent-Transfer-Encoding: quoted-printable\n\nhello\n\nNo EOL here'
With the content manager, however, a linesep is always appended to the end:
>>> msg = email.message.EmailMessage()
>>> msg.set_content(test_body, charset='utf-8', cte='quoted-printable')
>>> str(msg)
'Content-Type: text/plain; charset="utf-8"\nContent-Transfer-Encoding: quoted-printable\nMIME-Version: 1.0\n\nhello\n\nNo EOL here\n'
The same goes for all other transfer encodings too, and I think I know what’s happening:
- Both set_payload() and set_content() in the examples above ultimately call email.quoprimime.body_encode(), which explicitly handles the case of missing EOL.
- The difference is that the default text content setter of email.contentmanager.raw_data_manager doesn’t pass the original string, but rather a reassembled string:

      def _encode_text(string, charset, cte, policy):
          lines = string.encode(charset).splitlines()
          linesep = policy.linesep.encode('ascii')
          ### Highlighted code here:
          def embedded_body(lines): return linesep.join(lines) + linesep
          #                                                      ^^^^^^^
          def normal_body(lines): return b'\n'.join(lines) + b'\n'
          #                                                  ^^^^^
          ### A linesep is always appended...
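To make the two points above concrete, here is a minimal sketch that reproduces the difference in isolation. It only mimics the split-and-rejoin step from `_encode_text()` and is not the library code itself; the variable names are my own.

```python
import email.quoprimime

test_body = 'hello\n\nNo EOL here'

# 1) body_encode() keeps the trailing-EOL state of its input as-is.
qp_without_eol = email.quoprimime.body_encode(test_body)
qp_with_eol = email.quoprimime.body_encode(test_body + '\n')
print(repr(qp_without_eol))
print(repr(qp_with_eol))

# 2) splitlines() yields the same list whether or not the input ended
#    with a newline, so that information is lost before the join.
lines_without_eol = test_body.encode('utf-8').splitlines()
lines_with_eol = (test_body + '\n').encode('utf-8').splitlines()
print(lines_without_eol == lines_with_eol)

# Same shape as normal_body() in the excerpt above: join, then append.
rebuilt = b'\n'.join(lines_without_eol) + b'\n'
print(repr(rebuilt))
```

So by the time the reassembled string reaches `body_encode()`, it already ends with a newline, and the missing-EOL handling there never gets a chance to apply.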
Is this the intended behavior? It’s a tiny edge case, but seeing how carefully quoprimime handles a trailing linesep, I wanted to confirm whether the content manager is meant to always append one.
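For anyone who wants to reproduce this, here is a small script that round-trips the same body through `set_content()` and `get_content()` for each transfer encoding. This is a sketch based on what I see on 3.12.3 with the default policy; the `results` dict is just my own bookkeeping.

```python
import email.message

test_body = 'hello\n\nNo EOL here'

results = {}
for cte in ('7bit', '8bit', 'quoted-printable', 'base64'):
    msg = email.message.EmailMessage()
    msg.set_content(test_body, charset='utf-8', cte=cte)
    # get_content() decodes the transfer encoding and the charset again.
    results[cte] = msg.get_content()
    print(f'{cte:17} {results[cte]!r}')

# Did the body survive the round trip unchanged for any encoding?
print(any(decoded == test_body for decoded in results.values()))
```

On my machine every decoded body comes back as the original string plus one trailing `'\n'`, so the round trip is not lossless for text without a final EOL.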
As far as I know this behavior isn’t documented anywhere (although there are many tests that make use of this current behavior), but I could be missing things… I’d like to know what people think about this! ^^