How to create a csv file from text

I have this fixed width report file:

I need to insert commas between the values, remove all unnecessary spaces and enclose the alpha field on the right in quotes. I can’t see how to do this with pandas to_csv. Please can someone tell me how to do this.

to_csv is for creating a CSV from already loaded data in a Pandas DataFrame.

The natural way to approach the problem is to load the file into a DataFrame first. Assuming that the file you have is indeed fixed width (and it doesn’t merely look fixed width due to clever use of tabs), the appropriate tool is pandas.read_fwf.

See here for a more detailed guide, which I found by putting pandas constant width field into a search engine:

After you have created the DataFrame, you can use its to_csv method normally. The format that you describe should already be the default.

1 Like

Hi Karl, Thanks for this.
I tried to follow the method you suggested, but couldn’t get the right (any) formating. In the end I used this code:

import re
file_lines =[]
filename = 'S285IBUD.CSV'
outfile = open(filename, 'w',   encoding='ISO-8859-1')
with open("S285IBUD.PRN",  encoding='ISO-8859-1') as the_file: 

     cnt = 0
     match = ""
     spc = ' '
     hash = '#'
     for line in the_file:
         #check if new page 
         match = re.findall(r'\f', line)
         if match: cnt = 0
         #check if new line
         match = re.findall(r'\n', line)
         if match: cnt+=1
         #start processing from 7th line, where line length is > 140
         if cnt > 7 and len(line) > 140:
            inp = line 
    #  Insert commas after each field - fixed known positiom
            inp = inp[:11] + ',' + inp[11:]
            inp = inp[:24] + ',' + inp[24:]
            inp = inp[:29] + ',' + inp[29:]
            inp = inp[:43] + ',' + inp[43:]
            inp = inp[:57] + ',' + inp[57:]
            inp = inp[:62] + ',' + inp[62:]
            inp = inp[:77] + ',' + inp[77:]
            inp = inp[:90] + ',' + inp[90:]
            inp = inp[:95] + ',' + inp[95:]
            inp = inp[:108] + ',' + inp[108:]
            inp = inp[:113] + ',' + inp[113:]
            inp = inp[:128] + ',' + inp[128:]           
    #Find start (1st none blank character) of quoted string
            i = 129
            while i < 148:
                if inp[i] != ' ':
    #insert opening quote  
            inp = inp[:i] + '"' + inp[i:]
    #Replace spaces with '#' in quoted string          
            while i < 146:
                if inp[i] == ' ':
                    inp = inp[:i] + '#' + inp[i + 1:]
    #insert closing quote                
            inp = inp[:148] + '"' + inp[148:]
    #insert last comma (before last field)           
            inp = inp[:149] + ',' + inp[149:]
    #remove all spaces           
            newl = inp.replace(" ", "")
    #replace '#' with spaces in quoted string         
            inp = newl.replace("#", " ")
            outfile.write(inp) #The variable line is already formatted correctly!
            print("line :",  inp)
outfile.close() #Close the file when we’re done!

Here’s the input and output:

Like to know what I’m doing wrong


The first thing wrong is showing us an image of the data. Instead, please copy and paste data and format it like the code.

The second thing wrong is that you say that you tried doing it a different way and couldn’t get it to work, and seem to want to know why it didn’t work, but you don’t show us the code for the different way that you tried. If you mean that you think there is something wrong with the code you’re showing now, then you need to explain the actual issue. As far as I can tell, from what I can read in the image, you are getting a result that matches what you asked for originally.

I don’t see how to upload the data, only image is authorized.
Yes the code I posted works, what I said was that I couldn’t get your code to work, and was wondering why 'cause it’s far more concise and shows the power of python.

It’s not uploading. It’s copying and pasting out of the screen that you took that screenshot from, and putting in some formatting markup - the same way that you add code to your posts. If that doesn’t work for some reason, then you’ll need to explain exactly how you are running the code and where/how you see the output.

	I N P U T:   open("S285IBUD.PRN",  encoding='ISO-8859-1') as the_file

    4240.13     4240.13   3                                    13034.71    13034.71   3     4240.13  99      19/09/21       DONALD DUCK   6181
    4000.00     4000.00  90                                   421574.91   421574.91  90     4000.00  99       5/07/17       MICKY MOUSE  82556

     659.24      659.24  71                                   137428.05   137428.05  71      659.24  99      27/06/17 MARKS AND SPENCER 125198

	 O U T P U T:   outfile = open(filename, 'w',   encoding='ISO-8859-1')
4240.13,4240.13,3,,,,13034.71,13034.71,3,4240.13,99,19/09/21,"DONALD DUCK",6181
4000.00,4000.00,90,,,,421574.91,421574.91,90,4000.00,99,5/07/17,"MICKY MOUSE",82556
659.24,659.24,71,,,,137428.05,137428.05,71,659.24,99,27/06/17,"MARKS AND SPENCER",125198

Thank you. But - is there something wrong with this result?

No the result is fine, but like to do it using DataFrame .read_fwf and .to_csv as you suggested.