Handling of trailing linesep by set_content()

Hello Python Discourse!

I have a question regarding the creation of text email messages that don’t end with a line terminator (linesep) using the content manager interface. I hope this is the right place to ask since I am not sure if this is a feature or an issue…

As an example, say I want to encode this string:

$ python3.12
Python 3.12.3 (main, Apr  9 2024, 08:09:14) [Clang 15.0.0 (clang-1500.3.9.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> test_body = 'hello\n\nNo EOL here'

With the legacy interface, I can do this:

>>> import email.message, email.charset
>>> UTF8_QP = email.charset.Charset('utf-8')
>>> UTF8_QP.body_encoding = email.charset.QP
>>> msg = email.message.EmailMessage()
>>> msg.set_payload(test_body, UTF8_QP)
>>> str(msg)
'MIME-Version: 1.0\nContent-Type: text/plain; charset="utf-8"\nContent-Transfer-Encoding: quoted-printable\n\nhello\n\nNo EOL here'

With the content manager, however, a linesep is always appended to the end:

>>> msg = email.message.EmailMessage()
>>> msg.set_content(test_body, charset='utf-8', cte='quoted-printable')
>>> str(msg)
'Content-Type: text/plain; charset="utf-8"\nContent-Transfer-Encoding: quoted-printable\nMIME-Version: 1.0\n\nhello\n\nNo EOL here\n'
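To check this beyond quoted-printable, here is a small sketch that sets the same body with each transfer encoding and decodes it again with get_content(). It assumes the default policy (linesep of '\n'); the names `decoded` and `cte` are just for illustration.

```python
import email.message

test_body = 'hello\n\nNo EOL here'

# Round-trip the body through set_content()/get_content() for each CTE.
decoded = {}
for cte in ('7bit', '8bit', 'quoted-printable', 'base64'):
    msg = email.message.EmailMessage()
    msg.set_content(test_body, charset='utf-8', cte=cte)
    decoded[cte] = msg.get_content()

for cte, text in decoded.items():
    # Every decoded body comes back with a trailing newline added.
    print(f'{cte:17} {text!r}')
```

If my reading is right, every entry gains a trailing '\n' that was not in test_body, including base64, where the newline ends up inside the encoded data.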

The same goes for all the other transfer encodings, and I think I know what’s happening…

  • Both set_payload() and set_content() in the examples above ultimately call email.quoprimime.body_encode(), which explicitly handles the case of a missing trailing EOL.

  • The difference is that the default text content setter of email.contentmanager.raw_data_manager doesn’t pass the original string through; it passes a string reassembled from the split lines:

    def _encode_text(string, charset, cte, policy):
        lines = string.encode(charset).splitlines()
        linesep = policy.linesep.encode('ascii')
        ### Highlighted code here:
        def embedded_body(lines):
            return linesep.join(lines) + linesep
        #                                ^^^^^^^
        def normal_body(lines):
            return b'\n'.join(lines) + b'\n'
        #                              ^^^^^
        ### A linesep is always appended...
    
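To isolate the two steps, the sketch below calls email.quoprimime.body_encode() once on the original string and once on a string rebuilt the way normal_body() does it. The reassembly is my own imitation of the excerpt above (with '\n' as the linesep), not the library code itself.

```python
import email.quoprimime

body = 'hello\n\nNo EOL here'

# body_encode() only restores a final newline if the input ended with one.
direct = email.quoprimime.body_encode(body)

# Imitate the reassembly from _encode_text: split, join, append a linesep.
lines = body.encode('utf-8').splitlines()
reassembled = (b'\n'.join(lines) + b'\n').decode('latin-1')
via_reassembly = email.quoprimime.body_encode(reassembled)

print(repr(direct))
print(repr(via_reassembly))
```

So the encoder itself leaves the ending alone, and the extra linesep comes from the join step that runs before it.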

Is this the intended behavior? It’s a tiny edge case, but seeing how carefully quoprimime handles the trailing linesep, I wanted to confirm whether always appending one is deliberate.

As far as I know this behavior isn’t documented anywhere (although many tests rely on the current behavior), but I could be missing something… I’d like to know what people think about this! ^^