import argparse

parser = argparse.ArgumentParser(description='Read a file of number arrays and output it tab-separated')
parser.add_argument('--infile', '-i', type=str, required=True, help='specify input file name')
parser.add_argument('--outfile', '-o', type=str, default='numbers.txt', required=False, help='specify output file name')
if __name__ == '__main__':
    args = parser.parse_args()
    # Check that the files are .txt files so we have at least some basic validation
    if '.txt' in args.infile and '.txt' in args.outfile:
        # open() returns an iterable file object we can loop over with a for loop.
        # The .strip() removes the trailing \n (newline) character from each line.
        # The with statement closes the file when the block finishes, so we don't
        # need to do that ourselves.
        with open(args.infile, encoding="utf-8") as f:
            for line in f:
                curr_line = line.strip()
    else:
        print('Files must be valid .txt files...')
I can only get a string array from the file, and it doesn't really work when I then try to split it with '\t'. I have to do this using list comprehension, by the way. I am fairly new to Python and hope you can help me. For printing or writing with a list comprehension I would use the following code:
print('\t'.join([str(x) for x in curr_line]))
Issue: My key issue is that I can't read the lines properly, as they somehow still come in as strings like "[1,2,3,4,5]". Hope you can explain it and help me!
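To make the symptom concrete, here is a minimal sketch (the sample line is hypothetical) of what happens when the raw line is treated as a sequence, and one list-comprehension way to split it into numbers first:

```python
curr_line = "[1,2,3,4,5]"  # what each stripped line actually is: one single string

# Iterating over a string yields individual characters, not numbers,
# which is why the join below tabs out "[", "1", ",", "2", ... one by one.
print('\t'.join([str(x) for x in curr_line]))

# One way to get at the numbers: strip the brackets, then split on commas.
numbers = [int(x) for x in curr_line.strip('[]').split(',')]
print('\t'.join(str(n) for n in numbers))  # 1	2	3	4	5
```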
Note that there is probably a way to strip out the '[' and ']' characters when reading the file by using an appropriate regex for the sep argument. The two lines in the middle (starting with df.iloc) could then be omitted. My regex-fu is too weak to figure it out at the moment.
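For what it's worth, a character-class regex along these lines might do it. Here is a sketch using the standard library's re.split; the same pattern should be usable as a sep= value with engine='python', though I haven't verified the pandas side:

```python
import re

line = "[1,2,3,4,5]"
# Treat '[', ']' and ',' all as delimiters; the leading '[' and
# trailing ']' produce empty fields, which we filter out.
fields = [f for f in re.split(r"[\[\],]", line) if f]
print(fields)  # ['1', '2', '3', '4', '5']
```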
Try this. It uses only the standard library and built-in functions.
import csv  # Python has a csv module, see the docs

outfile = 'x.csv'
infile = 'x.data'

with open(outfile, 'w', newline='') as fout:  # need newline='' as csv does its own line-ending handling, see the docs
    write = csv.writer(fout)  # get a csv writer to output formatted data
    with open(infile) as fin:  # both files are closed at the end of their with blocks
        for line in fin:  # looping over a text file gives you a new line each time
            line = line.strip()  # get rid of excess whitespace and the end-of-line character
            lst = eval(line)  # works because each line in the input file looks like a Python list (only safe on trusted input)
            write.writerow(lst)  # writes the list's items separated by commas
It's not the shortest way to do it, but I think it's clear.
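To illustrate what the loop produces, here is a minimal round trip of the same idea using in-memory files (io.StringIO stands in for the real x.data and x.csv):

```python
import csv
import io

fin = io.StringIO("[1, 2, 3]\n[4, 5, 6]\n")  # stands in for x.data
fout = io.StringIO()                          # stands in for x.csv

writer = csv.writer(fout)
for line in fin:
    # eval is acceptable here only because we control the input file
    writer.writerow(eval(line.strip()))

print(fout.getvalue())  # rows "1,2,3" and "4,5,6", one per line
```

Since the original task asked for tab-separated output, passing delimiter='\t' to csv.writer would switch the separator from commas to tabs.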
Thank you for that answer! It works nicely, but I have never used pandas before, and since it says read_csv this may not be OK for the task I was given. Still, it is a really nice solution. I am curious whether it is possible without any libraries.
Thank you for that answer! It works great, although I don't really know the ast module or understand it yet. What would happen if the file contains "1, 2, 3, 4, 5" without "[]" enclosing it? I guess I have to assume it always does.
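For the record, ast.literal_eval (which I assume is what that answer used, since it is not quoted here) copes with both forms, because a bare 1, 2, 3, 4, 5 is a valid tuple literal:

```python
import ast

with_brackets = ast.literal_eval("[1, 2, 3, 4, 5]")
without_brackets = ast.literal_eval("1, 2, 3, 4, 5")

print(with_brackets)     # [1, 2, 3, 4, 5]  -- a list
print(without_brackets)  # (1, 2, 3, 4, 5)  -- a tuple
```

Unlike eval, literal_eval only accepts Python literals (numbers, strings, tuples, lists, dicts, sets, booleans, None), so it is safe to run on untrusted input.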