Socketserver readln not working, receive until \n

Hello,

I am using the very nice socketserver module from Python 3.7 to receive text data over TCP that is always terminated with '\n'. Adapting the example from the official Python documentation, using socketserver.StreamRequestHandler and self.rfile.readline(), does not work.

The incoming data carries ascending sequence numbers so that I can verify whether lines are missing, and lines are often missing. The connection is fine: using nc as the server receives all lines without any problems.

Using socketserver.BaseRequestHandler works better, but I have to set the number of bytes to read, which changes with every line. A loop in the “handle” function with self.request.recv(1) and a check for the “\n” character does not work, because self.request.recv(1) only receives one byte and the rest of the line is lost.

Does anyone have a suggestion for how to receive all \n terminated lines with socketserver?

Best regards,

Mart

Please post a small example that shows the problem for us to review.

Hi Barry,

Thx! Below the code:

#!/usr/bin/env python
#-*- coding: utf-8 -*-
import socketserver
import datetime
import MySQLdb as mdb
import time
import threading

"""
The request handler class for our server.

It is instantiated once per connection to the server, and must
override the handle() method to implement communication to the
client.
"""
lastseq = 0
doVerbose = True
carinfo = {}
databuffer = []

class TCPHandlerFs(socketserver.StreamRequestHandler):

    def ToDict (self, syslog):
        #create data dict
        datadict = {}
        syslog = syslog.split()
        if len (syslog) >= 2:  # syslog[1] is read below, so two fields are needed
            #header
            datadict['CAR'] = syslog[1]
            #fields
            for dta in syslog:
                dta = dta.split ('=')
                if len(dta) == 2:
                    datadict[dta[0].upper().strip('"')] = dta[1].strip ('"')
        return datadict

    def CreateVerbose (self, datadict):
        global carinfo
        result = datadict['DATETIME']
        return result
 
    def handle(self):
        global doVerbose
        global databuffer
        # self.rfile is a file-like object created by the handler;
        # we can now use e.g. readline() instead of raw recv() calls
        self.data = self.rfile.readline().strip()
        #print("{} wrote:".format(self.client_address))
        #Decode syslog entry
        recData = self.ToDict (self.data.decode('utf-8'))
        if len(recData) == 0:
            return
        #Handle data
        #car 
        if recData['CAR'] == 'new car':
            if doVerbose:
                print ('info - new car')
                print (self.CreateVerbose (recData))
        # Likewise, self.wfile is a file-like object used to write back
        # to the client
        # self.wfile.write(self.data.upper())
        databuffer.append (recData)


class TCPHandlerRs(socketserver.BaseRequestHandler):

    def handle(self):
        # self.request is the TCP socket connected to the client
        self.data = self.request.recv(1024).strip()
        print("{} wrote:".format(self.client_address[0]))
        print(self.data)
        # just send back the same data, but upper-cased
        # self.request.sendall(self.data.upper())
        #
        #not used/not working because fixed recv

 
class  CarServer (object):

    def __init__(self):
        # settings
        self.doverbose = False
        self.userbreak = False
        self.datalooprunning = False
        self.main_thread = None
        self.dbsource_prefix = "cardb_"
        self.intdbprefix = "card"
        self.configdb = self.intdbprefix  + 'config'
        # db sources and connections
        self.con_settings = None
        self.sqlcur_settings = None
        self.dbuser = 'car'
        self.dbpasswd = 'car'

    def quit(self):
        #self.db_disconnect (self.con_settings)
        if self.doverbose:
            print ('Exiting CarSystem')

 
    def threadscan (self):
        global databuffer
        global carinfo
        while not self.userbreak:
            try:
                if len (databuffer) > 0:
                    recData = databuffer.pop (0)
                    #now add to database
                    if recData['KIND'] == 'CAR':
                        #todo
                        print ('LOOP: car')
            except Exception:
                if self.doverbose:
                    print ('ERR: in thread - recovering')


    def startDataLoop (self):
        #connect to database
        #self.db_settings_connect ('localhost', self.dbuser, self.dbpasswd)       
        #self.create_configdb ()
        #Start thread
        self.userbreak = False        
        self.main_thread = threading.Thread(target=self.threadscan)
        try:
            self.main_thread.start()
            if self.main_thread.is_alive ():
                self.datalooprunning = True
                if self.doverbose:
                    print ('OK: Dataloop running')
            else:
                if self.doverbose:
                    print ('ERR: Dataloop not running')
        except:
            print ('ERR: could not start dataloop')


    def run(self):
        #start databuffer thread
        self.startDataLoop ()
        if self.datalooprunning:
            #socketserver 
            HOST, PORT = "", 10500
            # Create the server, binding to all interfaces on port 10500
            server = socketserver.TCPServer((HOST, PORT), TCPHandlerFs)
            # Activate the server; this will keep running until you
            # interrupt the program with Ctrl-C
            try:
                server.serve_forever()
            finally:
                server.server_close()
                self.quit ()
        #exit
        if self.doverbose:
            print ('Leaving application')
        

if __name__ == "__main__":
    app = CarServer()
    app.run()

I did a quick read, so I may have missed things.

You cannot depend on recv() giving you a single line.
Reading 1024 bytes is a good idea, but you must expect two cases that need handling.

  1. You do not get a complete line. Keep adding further recv() data until the buffer contains your \n delimiter.
  2. You may get more than one line, or a line plus part of the next. In this case you need to extract the first line and leave the start of the next line in the buffer for later processing.

FYI, on the 1024 read size: I would read bigger sizes, at minimum the MTU size of about 1500.
If many outstanding lines can be sent to you and performance is an issue, use a much bigger size.

Oh, and the use of strip() means you will not know whether the \n was received.
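
The two cases above can be sketched roughly as follows. This is only an illustration: the helper name iter_lines and the 4096 read size are my own choices, and it assumes a blocking socket.

```python
import socket


def iter_lines(sock, bufsize=4096):
    """Yield complete lines (bytes, without the trailing newline) from sock."""
    buffer = b""
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:
            # Peer closed the connection; anything left is an unterminated line.
            if buffer:
                yield buffer
            return
        buffer += chunk
        # Case 2: the buffer may hold several lines plus the start of the next.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            yield line
        # Case 1: no newline yet, so go round again and recv() more data.
```

Inside a BaseRequestHandler you would call it as iter_lines(self.request) and process each line as it arrives. Because nothing is stripped before the newline is found, a partial line is never mistaken for a complete one.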

Hi Barry,

Thanks again! The code was a bit messy. I first tried this (socketserver.StreamRequestHandler):

def handle(self):
        global doVerbose
        global databuffer
        # self.rfile is a file-like object created by the handler;
        # we can now use e.g. readline() instead of raw recv() calls
        self.data = self.rfile.readline().strip()
        #print("{} wrote:".format(self.client_address))
        #Decode syslog entry

It has a built-in readline, which works, but a substantial number of lines go missing. All lines the client sends have the format:

datetime type car="hhh" size="" data="" line=""\n

Netcat receives everything fine, without missing a single line.

So I thought, maybe “rfile.readline()” has a bug, then tried
(socketserver.BaseRequestHandler):

def handle(self):
        # self.request is the TCP socket connected to the client
        self.data = self.request.recv(1024).strip()
        print("{} wrote:".format(self.client_address[0]))
        print(self.data)
        # just send back the same data, but upper-cased
        # self.request.sendall(self.data.upper())
        #
        #not used/not working because fixed recv

But because it has a fixed buffer I receive the lines like:

datetime type car="hhh" size="" data="" line=""\ndatetime type car="hh

So data is missing. My thought then was something like this:

def handle(self):
        ready = False
        buffer = ''
        while not ready:
            self.data = self.request.recv(1)
            if self.data:
                self.data = self.data.decode('utf-8')
                buffer = buffer + self.data
                if self.data == "\n":
                    ready = True

The result was lines like

d
d
d
d
(so each time only the first character, the “d”, of
datetime type car="hhh" size="" data="" line=""\n)

Now I am stuck. It seems that using

while not ready:
            self.data = self.request.recv(1)

in a loop is not possible inside the “handle” method of socketserver.BaseRequestHandler.

The strip() is only used in:

self.rfile.readline().strip()

Which already did the “readline”… but is “eating” lines.

Any other suggestions?

TCP is a stream of bytes. It is not a sequence of messages.
You have to read the bytes and break them up into messages.
In your case you consider a message to be a sequence of bytes that ends in a newline.

The recv() call will return anywhere from 0 to size bytes.
You may have to make multiple calls to recv() until you have collected a whole line.
Also, as you showed, you can get one line with part of the next line.
It is your job to join the blocks together and split the buffer into lines.

The recv() will not lose data on a tcp stream.

What happens if you call recv() in a loop?
I suspect you will keep getting blocks of data.
Print out the blocks with repr() to see exactly what you get.
When there is no more data you will get either a 0-byte return (the peer closed the connection) or, on a non-blocking socket, an EWOULDBLOCK error.
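
A rough sketch of that experiment, assuming a blocking socket; the function name dump_blocks is just something I made up for illustration:

```python
import socket


def dump_blocks(sock, bufsize=1024):
    """Call recv() until the peer closes; print and return every block."""
    blocks = []
    while True:
        block = sock.recv(bufsize)
        if not block:
            # b"" on a blocking socket means the peer closed the connection.
            break
        # repr() shows the exact bytes, including any \n inside the block.
        print(repr(block))
        blocks.append(block)
    return blocks
```

Called as dump_blocks(self.request) from handle(), it shows where the block boundaries fall. They will generally not line up with your newlines, but joining the blocks gives back exactly what was sent.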

Can you show us (or describe in detail) how it is not working? It
should work. The nice thing about the file-like wrapper is that it
takes care of glomming the results of recv() into a continuous stream,
so that you can read “lines” from it easily.

Note that most output streams have some buffering. So the sending
(server) side probably needs to flush the buffer to ensure that a line
goes out.

So at the server end:

 print("blah", file=outputf)

which writes a newline terminated string to outputf, may not result in
any data sent over the network at that point. You may need:

 outputf.flush()

to force it out (because the output buffer isn’t full). If you don’t do
that at salient points the receiving end will block waiting for the
newline which hasn’t been sent yet.

The reason there’s often buffering (both with networks and ordinary file
I/O) is that it makes more efficient use of the stream. So you might go:

 # send several things in one go
 for obj in results_of_query():
     print(str_form_of_obj_maybe_json, file=outputf)
 outputf.flush()

to fill as much buffer as possible before forcing the send. The “file”
outputf will send anyway if the buffer fills, but you need a final
flush to send whatever partial buffer may be lurking at the end of the
loop to ensure that the receiver gets timely data. If you’re having a
conversation with the receiver, failure to flush can lead to deadlock
where the receiver blocks waiting for unflushed data and the server blocks
because it never receives more requests from the (blocked) receiver. So
you flush when you need to ensure that all the data have been sent, not
necessarily at every line of data.

Note that all this is to do with the file-like object. A write() to
the file-like object will also buffer. A write() to the network socket
probably does a direct send().

If you’re print()ing to the file-like object (a natural thing to do if
you’re sending lines of text) you can go:

 print("blah", file=outputf, flush=True)

which does an immediate flush.
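
Here is a small self-contained sketch of the effect, using socket.socketpair() in place of a real network connection and socket.makefile() to get the file-like object. The function name demo_flush and the 0.2 second timeout are arbitrary choices for the demonstration.

```python
import socket


def demo_flush():
    """Return (arrived_before_flush, data_after_flush) for one buffered line."""
    sender, receiver = socket.socketpair()
    receiver.settimeout(0.2)
    # makefile("w") wraps the socket in a buffered, file-like text writer.
    outputf = sender.makefile("w", encoding="utf-8", newline="\n")
    try:
        print("blah", file=outputf)  # stays in outputf's buffer for now
        try:
            receiver.recv(1024)
            arrived_before_flush = True
        except socket.timeout:
            arrived_before_flush = False
        outputf.flush()  # forces the buffered bytes onto the socket
        data = receiver.recv(1024)
    finally:
        outputf.close()
        sender.close()
        receiver.close()
    return arrived_before_flush, data
```

The first recv() times out because the line is still sitting in the sender's buffer; only after flush() does the receiver see it.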

Cheers,
Cameron Simpson cs@cskk.id.au

Hi Cameron, Scott,

Thanks for your kind advice. … I made a huge thinking mistake…

def handle(self):
        self.data = self.rfile.readline().strip()

… my code did not loop over self.rfile.readline().strip() until self.data was empty, so it just did one readline.

After fixing this, readline / socketserver works great and is very stable, not missing a single line.

Best regards.

Mart.

Thanks for your kind advice. … I made a huge thinking mistake…

We all make mistakes like this, and talking things through helps
identify things. I think this is because in order to describe things to
others one must reread what’s actually happening in the code, as opposed
to what one believed was happening in the code.

def handle(self):
       self.data = self.rfile.readline().strip()

… my code did not loop over self.rfile.readline().strip() until self.data was empty, so it just did one readline.

Note that that will stop if you get an empty line also, eg output like:

  data
  more data

  fourth line after a blank line

readline().strip() would produce an empty (“false”) self.data for
the third line. And your loop would stop before the fourth line.

It may be that your server never does this, but be aware that your loop
reads until EOF or an empty line.

You could try this:

 for line in self.rfile:
     self.data = line.strip()
     ... work with self.data ...

so that the iteration runs until EOF. And I’d use .rstrip() or even
.rstrip("\n") myself.
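
Put together, a minimal sketch might look like this. The names LineHandler, received and demo are made up for the illustration (received stands in for your databuffer), and the demo serves a single connection on an ephemeral localhost port rather than your real port.

```python
import socket
import socketserver
import threading

received = []  # hypothetical stand-in for the thread's databuffer


class LineHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # Iterating over self.rfile yields one line per pass until EOF,
        # so a blank line does not end the loop.
        for raw in self.rfile:
            received.append(raw.rstrip(b"\n").decode("utf-8"))


def demo():
    """Send four lines (one of them blank) and return what the handler saw."""
    received.clear()
    with socketserver.TCPServer(("127.0.0.1", 0), LineHandler) as server:
        # handle_request() serves exactly one connection, then returns.
        thread = threading.Thread(target=server.handle_request)
        thread.start()
        with socket.create_connection(server.server_address) as client:
            client.sendall(b"data\nmore data\n\nfourth line after a blank line\n")
        thread.join()
    return list(received)
```

With rstrip(b"\n") only the newline is removed, so the blank third line comes through as an empty string instead of being mistaken for the end of the data.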

Cheers,
Cameron Simpson cs@cskk.id.au

Hi Cameron,

Thanks for your advice and explanation! I will implement your much better solution.

Best regards!