How to read exactly N bytes from stdin in a Native Messaging host?

MDN publishes this Native Messaging host written in Python: webextensions-examples/ping_pong.py, in the mdn/webextensions-examples repository on GitHub.

The Native Messaging protocol specifies the host is capable of receiving up to 1MB from the client (browser extension).

The Python Native Messaging host processes, and echoes back, messages up to the size produced by this call from the browser:

port.postMessage(new Array(174762))

where the resulting JSON length is computed as

873811

using

JSON.stringify(new Array(174762)).length

When I pass

port.postMessage(new Array(174763))

where the resulting JSON length is computed as

873816

the Python Native Messaging host exits.

Using C, C++, JavaScript (QuickJS; Node.js; Deno) I am able to pass

port.postMessage(new Array(200900))

where resulting JSON length is

1004501

and get the message echoed back.
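
The lengths quoted above can be double-checked outside the browser. JSON.stringify serializes each empty slot of new Array(n) as null, so the string consists of n copies of null, n - 1 commas and two brackets, which is 5n + 1 characters. Below is a minimal Python sketch of that arithmetic; the helper name stringified_length is only for illustration, and the compact separators are needed because json.dumps otherwise inserts a space after each comma.

```python
import json


def stringified_length(n):
    """Length of JSON.stringify(new Array(n)) for n >= 1."""
    # n times "null" (4 chars), n - 1 commas, plus "[" and "]".
    return 4 * n + (n - 1) + 2


for n in (174762, 174763, 200900):
    # Compact separators mirror the output of JSON.stringify.
    compact = json.dumps([None] * n, separators=(",", ":"))
    assert len(compact) == stringified_length(n)
    print(n, stringified_length(n))

# 174762 873811
# 174763 873816
# 200900 1004501
```

All three payloads are below 1 MB (1048576 bytes), so the size limit alone does not explain why the Python host exits.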

How can I modify the script below, particularly how read() is used in the getMessage function, to be certain to read exactly N bytes from stdin, in a loop perhaps, while leaving everything else essentially the same?

#!/usr/bin/env -S python3 -u
# https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/Native_messaging
# https://github.com/mdn/webextensions-examples/pull/157
# Note that running python with the `-u` flag is required on Windows,
# in order to ensure that stdin and stdout are opened in binary, rather
# than text, mode.

import sys
import json
import struct

try:
    # Python 3.x version
    # Read a message from stdin and decode it.
    def getMessage():
        rawLength = sys.stdin.buffer.read(4)
        if len(rawLength) == 0:
            sys.exit(0)
        messageLength = struct.unpack('@I', rawLength)[0]
        message = sys.stdin.buffer.read(messageLength).decode('utf-8')
        return json.loads(message)

    # Encode a message for transmission,
    # given its content.
    def encodeMessage(messageContent):
        encodedContent = json.dumps(messageContent).encode('utf-8')
        encodedLength = struct.pack('@I', len(encodedContent))
        return {'length': encodedLength, 'content': encodedContent}

    # Send an encoded message to stdout
    def sendMessage(encodedMessage):
        sys.stdout.buffer.write(encodedMessage['length'])
        sys.stdout.buffer.write(encodedMessage['content'])
        sys.stdout.buffer.flush()

    while True:
        receivedMessage = getMessage()
        sendMessage(encodeMessage(receivedMessage))
except Exception as e:
    sys.stdout.buffer.flush()
    sys.stdin.buffer.flush()
    sys.exit(0)

Since a single socket read could easily be short (and can also be longer than expected - messages can be grouped together), you’ll need a read buffer. But you’re working with stdin, not a plain socket, so it’s possible you can actually use the buffer that already exists. In general, though, the way to read precisely N bytes is like this:

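import sys

# Assumption for this example: read from stdin. Any binary stream with read() works.
sock = sys.stdin.buffer
buffer = b""

def read(n):
    """Return exactly n bytes, or None if the stream ends first."""
    global buffer  # the leftover bytes are kept between calls
    while len(buffer) < n:
        # Ask only for the bytes still missing; a read may return fewer.
        data = sock.read(n - len(buffer))
        if not data:
            return None  # failed read, other end gone
        buffer += data
    ret, buffer = buffer[:n], buffer[n:]
    return ret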

This is suitable for most messaging formats. You might be able to shortcut some parts of this, but the loop is probably necessary regardless.
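
As a rough illustration of how such a loop could slot into a host like the one in the question, here is a sketch that keeps the 4-byte length prefix and the JSON decoding but reads through a helper. The names read_exactly, get_message and ShortReads are made up for this example; ShortReads is only a stand-in stream that hands out a limited number of bytes per call, so the loop can be exercised without a browser.

```python
import io
import json
import struct


def read_exactly(stream, n):
    """Return exactly n bytes from stream, or fewer if it ends early."""
    chunks = []
    remaining = n
    while remaining > 0:
        data = stream.read(remaining)  # may return fewer bytes than asked
        if not data:
            break  # end of stream: the other side went away
        chunks.append(data)
        remaining -= len(data)
    return b"".join(chunks)


def get_message(stream):
    """Read one length-prefixed JSON message; return None at end of stream."""
    raw_length = read_exactly(stream, 4)
    if len(raw_length) < 4:
        return None
    message_length = struct.unpack("@I", raw_length)[0]
    payload = read_exactly(stream, message_length)
    if len(payload) < message_length:
        return None  # truncated message
    return json.loads(payload.decode("utf-8"))


class ShortReads:
    """Hypothetical test stream that returns at most `limit` bytes per read."""

    def __init__(self, data, limit):
        self._inner = io.BytesIO(data)
        self._limit = limit

    def read(self, n):
        return self._inner.read(min(n, self._limit))


# Frame the payload that failed in the question and read it back in pieces.
body = json.dumps([None] * 174763, separators=(",", ":")).encode("utf-8")
framed = struct.pack("@I", len(body)) + body
echoed = get_message(ShortReads(framed, 65536))
```

In the host itself the call would presumably be get_message(sys.stdin.buffer), with a None result treated as the signal to exit.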

I'm not sure why the message is not echoed to the client when I use the loop. This single read does echo the message back:

        buffer = b''
        buffer += sys.stdin.buffer.read(messageLength)

but the loop that reads in smaller pieces does not.