Python code errors

Hi,

I’m writing an automated test using Python. I got the code sample below from UL. Can someone help me to convert the code below to Python code? I’m a beginner and have no idea.

#!/usr/bin/env python
import socket
import struct
import time
TCP_IP = ‘127.0.0.1’
TCP_PORT = 4500
BUFFER_SIZE = 1024

MESSAGE = “{\”message_type\” : \”Select Profile\”, \”version\” :
\”1.0\”, \”payload\” : { \”profile\” : \”Visa\”, \”description\” :
\”Example project\” } }”

# Create message

MLI = struct.pack(“!I”, len(MESSAGE))
MLI_MESSAGE = MLI + MESSAGE

# Send message

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((TCP_IP, TCP_PORT))
s.send(MLI_MESSAGE)
print “Sent data: ‘”, MESSAGE, “’”

# Receive MLI from response (you might want to add some timeout handling
# as well)
respMLI = s.recv(BUFFER_SIZE)
respMLI = struct.unpack(“!I”, respMLI)[0]

# Receive actual message
total_msg = []
response = ‘’
while 1:
if len(response) >= respMLI:
break
try:
data = s.recv(BUFFER_SIZE)
if data:
total_msg.append(data)
except:
pass
response = ‘’.join(total_msg)
s.close()

I’m writing an automated test using Python. I got the code sample below from UL. Can someone help me to convert the code below to Python code? I’m a beginner and have no idea.

What’s “UL”?

Well first up, it is Python code already. But it is a pair of
standalone examples: one sends a single packet and the other receives
a single packet. You probably want to convert each to a small function.

There are some other remarks about the code which I will make inline
below.

#!/usr/bin/env python
import socket
import struct
import time
TCP_IP = ‘127.0.0.1’
TCP_PORT = 4500
BUFFER_SIZE = 1024

If you’re new to Python, these upper case names are a convention for
“constants” - tuning values used elsewhere in the code. Python doesn’t
have real “constants” (you can rebind any name), but we write things
like this, which are not expected to change and which are used like
constants would be in other languages, in upper case.

MESSAGE = “{\”message_type\” : \”Select Profile\”, \”version\” :
\”1.0\”, \”payload\” : { \”profile\” : \”Visa\”, \”description\” :
\”Example project\” } }”

If you have cut/pasted this from some example, it has been mangled
(perhaps in the original). This code has “smart quotes”, distinct
opening and closing quote marks. Python does not do that and nor does
JSON. Python only uses plain ASCII single quote/apostrophe marks or plain
ASCII double quote marks, and JSON only uses plain ASCII double quote
marks. So the above needs to be:

MESSAGE = "{\"message_type\" : \"Select Profile\", \"version\" : \"1.0\", \"payload\" : { \"profile\" : \"Visa\", \"description\" : \"Example project\" } }"

It may not be very visible, but I have replaced all the smart quotes
with ASCII double quotes.

Also, the above is hard to read. In Python, and in several other
languages, we normally try to enclose a string in different quote marks
from the quote marks embedded in the string. So since JSON uses lots of
ASCII double quote marks for its strings, we would normally write it
enclosed in ASCII single quote marks:

MESSAGE = '{"message_type" : "Select Profile", "version" : "1.0", "payload" : { "profile" : "Visa", "description" : "Example project" } }'

entirely because then we do not need to “escape” the double quote marks
inside the string.

The value is the same, we have just written it more readably.
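As a quick check of that claim, here is a small sketch comparing the two spellings. It also shows an option which is not in the example you were given: building the text from a Python dict with the standard json module, which avoids hand-written quoting entirely. The variable names are just for illustration.

```python
import json

# The same JSON text written two ways: escaped double quotes inside
# double quotes, and plain double quotes inside single quotes.
escaped = "{\"message_type\" : \"Select Profile\", \"version\" : \"1.0\"}"
single_quoted = '{"message_type" : "Select Profile", "version" : "1.0"}'
print(escaped == single_quoted)  # the two strings have the same value

# Alternative: describe the message as a dict and let json.dumps()
# produce correctly quoted JSON text.
payload = {
    "message_type": "Select Profile",
    "version": "1.0",
    "payload": {"profile": "Visa", "description": "Example project"},
}
message = json.dumps(payload)
print(message)
```

The text from json.dumps() has slightly different spacing from the hand-written version, which should not matter to anything that parses JSON properly.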

# Create message

MLI = struct.pack(“!I”, len(MESSAGE))
MLI_MESSAGE = MLI + MESSAGE

So, this combines your message with a binary transcription of the length
of the message. See the “struct” module documentation for details:

https://docs.python.org/3/library/struct.html#module-struct

In short, the “!I” indicates an unsigned 4 byte integer in network byte
order (big endian, meaning that the most significant bytes come first).
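To see what that looks like in practice, here is a tiny sketch you can run at the Python prompt. The numbers 5 and 258 are arbitrary values chosen to make the byte layout visible.

```python
import struct

# "!I" always occupies 4 bytes, whatever the value.
print(struct.calcsize("!I"))

# The number 5 becomes three zero bytes followed by a byte of 5:
# the most significant bytes come first.
packed = struct.pack("!I", 5)
print(packed)

# 258 is 0x0102, so the last two bytes are 1 and 2.
big = struct.pack("!I", 258)
print(big)
```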

Now for the annoying part. The code you have is for Python 2, and modern
Python is Python 3 - Python 2 is end of life.

The thing which gives this away is that the final packet made by the
code above just joins the MLI (message length indicator) to the MESSAGE
string. In Python 2 strings were effectively byte strings, good only for
8 bit character sets and with no indication of what that character set
may have been - there was a whole separate type for “Unicode” strings,
which encompass all human characters.

In Python 3 the string type is Unicode (sequences of Unicode code
points, effectively characters), and there is a separate bytes
type for bytes (sequences of 8-bit values which can be stored in a
single byte each). This avoids a whole suite of dangerous ambiguity
between “is this binary data” versus “is this text”?

What does this mean? Well, you are constructing a binary packet of data
from a JSON string. JSON strings are inherently Unicode, so writing
your packet MESSAGE as a string is the correct thing to do. However,
sending a packet sends bytes over the network. So the string must be
converted to bytes.

The JSON specifications say that this is done using the “UTF-8” encoding
scheme, which is also the default encoding scheme which Python uses if
you convert a string to bytes using its .encode() method. So this is a
reasonable thing to do:

MESSAGE = '{"message_type" : "Select Profile", "version" : "1.0", "payload" : { "profile" : "Visa", "description" : "Example project" } }'
MESSAGE_bs = MESSAGE.encode()
# Create message
MLI = struct.pack("!I", len(MESSAGE_bs))
MLI_MESSAGE = MLI + MESSAGE_bs

Here, MESSAGE_bs is a bytes object containing the JSON string as UTF-8
bytes. We send those bytes in the data packet, so the MLI must measure
the length of MESSAGE_bs, not MESSAGE. And you can join bytes to bytes,
so the final line is just MLI + MESSAGE_bs, much like the original.
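The difference between the two lengths only shows up once the text contains non-ASCII characters. A minimal sketch, using a made-up description containing an accented letter (written as the escape \u00e9 so that it survives cut/paste):

```python
# A JSON text with one non-ASCII character, e-acute (U+00E9).
text = '{"description" : "caf\u00e9"}'
text_bs = text.encode()  # UTF-8 is the default encoding

# The string is counted in characters, the bytes object in bytes.
# The accented letter takes 2 bytes in UTF-8, so the byte count is
# one larger than the character count.
print(len(text), len(text_bs))

# Decoding the bytes gives back the original string.
print(text_bs.decode() == text)
```

For the all-ASCII message in your example the two lengths happen to be equal, which is why measuring the wrong one would go unnoticed until someone puts an accented name in a field.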

Combining all this into a function is the next thing you would do, like
this:

def json_packet(json_message):
    ''' Accept a JSON string to send as a packet.
        Return a bytes object containing the packet.
    '''
    bs = json_message.encode()
    mli = struct.pack("!I", len(bs))
    return mli + bs

See that the variable names are in lower case - this is normal for
Python variables. So the above is a function which accepts a JSON string
and returns the packet you would send over the network. It does not send
the packet.
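Because the function only builds the packet, you can try it out without any network at all. The sketch below repeats the function so that it runs on its own, then takes a packet apart again to check that the length prefix matches the body. The short test message is made up for the demonstration.

```python
import struct

def json_packet(json_message):
    ''' Accept a JSON string to send as a packet.
        Return a bytes object containing the packet.
    '''
    bs = json_message.encode()
    mli = struct.pack("!I", len(bs))
    return mli + bs

packet = json_packet('{"version" : "1.0"}')

# Split the packet back into its 4 byte length prefix and its body.
header, body = packet[:4], packet[4:]
declared_length = struct.unpack("!I", header)[0]
print(declared_length == len(body))
print(body.decode())
```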

# Send message

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((TCP_IP, TCP_PORT))
s.send(MLI_MESSAGE)
print “Sent data: ‘”, MESSAGE, “’”

This constructs a network connection and sends a packet. I would write this as:

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((TCP_IP, TCP_PORT))
s.send(json_packet(MESSAGE))

using the json_packet() function defined above.
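One further refinement you might consider, which is not in the original example: s.send() is allowed to send only part of the data and returns how many bytes it actually sent, whereas s.sendall() keeps going until everything has been sent. Here is a sketch of a small sending function built on that. The name send_json is my own invention, and the demonstration uses socket.socketpair() (two already-connected local sockets) as a stand-in for the real server so that it runs without anything listening on port 4500.

```python
import socket
import struct

def json_packet(json_message):
    ''' Return a bytes packet: 4 byte length prefix plus UTF-8 JSON text. '''
    bs = json_message.encode()
    return struct.pack("!I", len(bs)) + bs

def send_json(sock, json_message):
    ''' Send one length-prefixed JSON message over a connected socket.
        Return the number of bytes in the packet.
    '''
    packet = json_packet(json_message)
    sock.sendall(packet)  # unlike send(), this sends the whole packet
    return len(packet)

# Demonstration with a local pair of connected sockets instead of a server.
client, server = socket.socketpair()
with client, server:
    sent = send_json(client, '{"version" : "1.0"}')
    received = server.recv(1024)
print(sent, len(received))
```

Against the real server you would pass in the socket "s" from the code above after calling s.connect().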

The following code then receives a response from the server.
Incorrectly, to my eye.

# Receive MLI from response (you might want to add some timeout handling
# as well)
respMLI = s.recv(BUFFER_SIZE)
respMLI = struct.unpack(“!I”, respMLI)[0]

Here we fetch up to BUFFER_SIZE bytes of data and unpack the leading
length indicator. The code then just discards the rest, which seems very
wrong - you’ll lose the start of the response string. It also doesn’t
accommodate getting fewer bytes than the length indicator needs - this
is extremely unlikely, so I’ll ignore that. But dropping the start of
the response is very bad.

I’d write this as:

mli_length = struct.calcsize("!I")
data = s.recv(BUFFER_SIZE)
mli_bs = data[:mli_length]
response_bs = data[mli_length:]
mli = struct.unpack(“!I”, mli_bs)[0]

which:

  • receives a chunk of data from the server
  • pulls the first “mli_length” bytes of it out as “mli_bs”, the bytes
    containing the message length indicator
  • preserves the remaining bytes for use as the main response
  • decodes the message length indicator into a number using struct.unpack

The [0] on the end is because struct.unpack normally unpacks several
things, so you get a tuple of values back. We only want the first (and
only) value, thus [0].
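A short sketch of what struct.unpack hands back, using made-up byte values. It also shows that struct.unpack insists on exactly the number of bytes the format describes, which is another reason to slice off the first mli_length bytes before unpacking.

```python
import struct

# One field in the format: a tuple containing one value.
values = struct.unpack("!I", b"\x00\x00\x00\x2a")
print(values)     # (42,)
print(values[0])  # 42

# Two fields in the format: a tuple containing two values.
pair = struct.unpack("!IH", b"\x00\x00\x00\x01\x00\x02")
print(pair)       # (1, 2)

# Passing more bytes than the format describes is an error.
try:
    struct.unpack("!I", b"\x00\x00\x00\x2a" + b"{}")
except struct.error as e:
    too_long_error = str(e)
    print("unpack failed:", too_long_error)
```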

# Receive actual message
total_msg = []
response = ‘’
while 1:
if len(response) >= respMLI:
break
try:
data = s.recv(BUFFER_SIZE)
if data:
total_msg.append(data)
except:
pass
response = ‘’.join(total_msg)
s.close()

Something has removed all the indentation from the code above. I hope it
is intact in the examples you’re working from. Indentation is critical
in Python to show what is inside a control structure (such as the “while”
above) and what is not.

I’d indent the above like this (note: untested):

# Receive actual message
total_msg = []
response = ''
while 1:
    if len(response) >= respMLI:
        break
    try:
        data = s.recv(BUFFER_SIZE)
        if data:
            total_msg.append(data)
    except:
        pass
response = ''.join(total_msg)
s.close()

and then change it, because it has several problems. The trivial stuff
first:

  • Python has a defined value for true, named “True”, so you would say
    “while True:”. “while 1:” will work, but it is not idiomatic and is
    also less clear

  • the first thing that happens inside the loop is a test to see if we
    have enough data, and then we break out of the loop; clumsy. Better to
    make the loop condition be: “do we need more data?” directly.

  • Python indicates failure by raising exceptions, hence the try/except
    clause. But it is best to make these as narrow as possible, and to
    handle only exceptions which are expected and sensible to handle.
    This is because an unexpected exception indicates behaviour you were
    not expecting, and therefore whatever your “except” clause does is
    likely to be inappropriate, concealing a problem and acting
    incorrectly. This makes for hard to find bugs.

So I would write this loop (again, UNTESTED) like this:

# record the data so far, and its length
received = [response_bs]
received_length = len(response_bs)
# collect more data until we have enough
while received_length < mli:
    data = s.recv(BUFFER_SIZE)
    if data:
        received.append(data)
        received_length += len(data)
received_bs = b''.join(received)
response_bs = received_bs[:mli]
unparsed_bs = received_bs[mli:]
response = response_bs.decode()

So:

  • keep the extra data received initially
  • collect more data until we have at least mli bytes
  • join the received data together to make a single bytes object
  • take the first “mli” bytes of that as “response_bs”
  • keep any remaining bytes as “unparsed_bs” - maybe more data followed
  • decode “response_bs” into a string, presumably containing a JSON
    string
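A different way to arrange the same steps, offered only as a sketch: ask for exactly the number of bytes you need each time, so nothing is left over to keep track of. s.recv() returns an empty bytes object when the other end has closed the connection, so the helper below raises an exception in that case rather than looping forever. The names recv_exact and recv_json_message are mine, and the demonstration again uses socket.socketpair() in place of the real server.

```python
import socket
import struct

def recv_exact(sock, count):
    ''' Read exactly count bytes from sock.
        Raise EOFError if the connection closes before that.
    '''
    chunks = []
    remaining = count
    while remaining > 0:
        data = sock.recv(remaining)
        if not data:
            # recv() returns b'' once the peer has closed the connection
            raise EOFError("connection closed, %d bytes missing" % remaining)
        chunks.append(data)
        remaining -= len(data)
    return b''.join(chunks)

def recv_json_message(sock):
    ''' Read one length-prefixed message and return it as a string. '''
    header = recv_exact(sock, struct.calcsize("!I"))
    mli = struct.unpack("!I", header)[0]
    return recv_exact(sock, mli).decode()

# Demonstration: one end plays the server and sends a made-up reply.
server, client = socket.socketpair()
with server, client:
    body = '{"status" : "ok"}'.encode()
    server.sendall(struct.pack("!I", len(body)) + body)
    reply = recv_json_message(client)
print(reply)
```

Since this never reads past the end of the current message, there is no “unparsed_bs” to carry over to the next one, at the cost of a few more recv() calls.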

Some notes:

  • I’ve dropped the try/except entirely - until you have a specific
    expected exception to handle, do not even bother
  • if you put the try/except back, only enclose the “data = s.recv(BUFFER_SIZE)” line
  • if you put the try/except back, make sure it specifies what exceptions
    it handles

For example:

try:
    data = s.recv(BUFFER_SIZE)
except OSError as e:
    print("error receiving:", e, file=sys.stderr)  # needs "import sys"
    continue

The example code just says “except:”, known as a “bare except”. It is
legal, but almost always a terrible idea: it catches any exception.
Worse, the example code then just “passes” - it doesn’t display any
information about the exception (to help you debug or report) and it
doesn’t do anything - you’ll just restart the loop.

Consider if the exception was some permanent thing, like a broken
network connection. The “bare except code” would just spin forever,
burning up your CPU and producing no complaints. Very bad.
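The comment in your example code mentions timeout handling, and that is a good case of a specific, expected exception. Here is a minimal sketch, assuming a timeout of 0.2 seconds (an arbitrary value for the demonstration) and again using socket.socketpair() instead of the real server. Nothing is ever sent, so the recv() times out and only that one exception is caught.

```python
import socket

a, b = socket.socketpair()
with a, b:
    b.settimeout(0.2)  # seconds to wait in recv() before giving up
    try:
        data = b.recv(1024)
    except socket.timeout:
        # The one failure we expect and know how to handle.
        outcome = "timed out"
    else:
        outcome = "received %d bytes" % len(data)
print(outcome)
```

Any other exception, such as a broken connection, still propagates and shows you a traceback, which is what you want while you are developing the test.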

Hopefully this gets you started.

Cheers,
Cameron Simpson cs@cskk.id.au (formerly cs@zip.com.au)

Your code worked, thanks for the detailed explanation.