Overwriting a file in Python

I am using the following code to write two files from the original data.dat file. Every time I run it, it appends the output to the files again, and the files become too large. I want to overwrite sss.txt and rrr.txt every time I run it so that I do not have extra information.

Can someone please help?

import os
import pandas as pd

d = '/home/Example1'
l = os.listdir(d)

for i in l:
    p = os.path.join(d, i)
    if os.path.isfile(p) and i == 'data.dat':

        with open(p, 'r') as f:
            for index, line in enumerate(f.readlines()):
                if index % 2 == 0:
                    with open(os.path.join(d, 'sss.txt'), 'a') as s1:
                        s1.write(line)
                elif index % 2 == 1:
                    with open(os.path.join(d, 'rrr.txt'), 'a') as s2:
                        s2.write(line)
                elif index % 2 == 2:
                    with open(os.path.join(d, 'Safety_file.txt'), 'a') as s3:
                        s3.write(line)

To overwrite the files sss.txt and rrr.txt every time you run the code, you need to open them with the mode 'w' instead of 'a'. The 'w' mode overwrites the existing file, whereas the 'a' mode appends to the existing file. So you need to change these lines:

if index % 2 == 0:
    with open(os.path.join(d, 'sss.txt'), 'a') as s1:
        s1.write(line)
elif index % 2 == 1:
    with open(os.path.join(d, 'rrr.txt'), 'a') as s2:
        s2.write(line)

to:

if index % 2 == 0:
    with open(os.path.join(d, 'sss.txt'), 'w') as s1:
        s1.write(line)
elif index % 2 == 1:
    with open(os.path.join(d, 'rrr.txt'), 'w') as s2:
        s2.write(line)

Here’s an example: Einblick

You are opening the files in “append” mode, which is why new data is being appended.

open(os.path.join(d, 'sss.txt'), 'a')

Use “write” mode instead: open(os.path.join(d, 'sss.txt'), 'w')

When I use the ‘w’ mode, I only see one row (the last one) in the file.

With ‘a’ mode I see the full results, which is what I want.

I don’t know how to fix this, which is why I was using ‘a’. Can you please help?

For each line, you:

  • open the output file;
  • write one line;
  • close the output file.

The close happens when you leave the with open(...) as ... block.

If you use “w” mode, then each time the file is opened, it overwrites what was already there.

If you use “a” mode, you keep appending from one run to the next, and the file ends up with multiple copies of the data.
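To make the difference concrete, here is a small sketch that compares the three behaviours. The file name demo.txt and the temporary directory are made up for illustration; they are not from the original code.

```python
import os
import tempfile

lines = ["first\n", "second\n", "third\n"]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")  # hypothetical file name

    # Reopen in 'w' mode for every line: each open truncates the file.
    for line in lines:
        with open(path, "w") as out:
            out.write(line)
    with open(path) as f:
        reopened_w = f.read()

    # Open once in 'w' mode and write every line through the same handle.
    with open(path, "w") as out:
        for line in lines:
            out.write(line)
    with open(path) as f:
        opened_once = f.read()

    # Two "runs" in 'a' mode: the second run adds a second copy of the data.
    os.remove(path)
    for run in range(2):
        for line in lines:
            with open(path, "a") as out:
                out.write(line)
    with open(path) as f:
        appended_twice = f.read()

print(repr(reopened_w))                  # only the last line survives
print(repr(opened_once))                 # all three lines, one copy
print(len(appended_twice.splitlines()))  # six lines: two copies
```

Reopening in “w” mode per line leaves only the last line, which matches the symptom described above.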

Solution: use “w” mode, but only open the file once instead of once per line.

Try something like this. (Untested, you will probably need to adjust it to make it work correctly.)

s1 = open(os.path.join(d, 'sss.txt'), 'w')
s2 = open(os.path.join(d, 'rrr.txt'), 'w')
for i in l:
    p = os.path.join(d, i)
    if os.path.isfile(p) and i == 'data.dat':
        with open(p, 'r') as f:
            for index, line in enumerate(f.readlines()):
                if index % 2 == 0:
                    s1.write(line)
                elif index % 2 == 1:
                    s2.write(line)
                elif index % 2 == 2:
                    print("This is impossible! The rules of mathematics are collapsing, it is the end of the world!")
                    # Seriously, you don't need this check. Honest.
                    # Your safety file has never had anything in it, has it?
# Don't forget to close the output files when done.
s1.close()
s2.close()

We can even be a bit cleverer:

s1 = open(os.path.join(d, 'sss.txt'), 'w')
s2 = open(os.path.join(d, 'rrr.txt'), 'w')
outputfiles = [s1, s2]
for i in l:
    p = os.path.join(d, i)
    if os.path.isfile(p) and i == 'data.dat':
        with open(p, 'r') as f:
            for index, line in enumerate(f.readlines()):
                # index % 2 is 0 or 1, so it picks s1 or s2 directly.
                outputfiles[index % 2].write(line)
for out in outputfiles:
    out.close()

As I said, I haven’t tested this code, so it may contain typos, bugs or completely do the wrong thing. Good luck!

Thank you so so much for showing the way! This helped a lot!

You can open two files with one with statement. And then you don’t need to close them by hand.

# I love pathlib
from pathlib import Path
d = Path(d)

# The two output files need different names.
# Parenthesized with-items like this need Python 3.10 or later.
with (open(d / 'sss.txt', 'w') as s1,
      open(d / 'rrr.txt', 'w') as s2):
    outputfiles = [s1, s2]
    for i in l:  # i is not a good variable name for a filename (nor l for ??)
        p = d / i
        if p.is_file() and i == 'data.dat':
            with open(p, 'r') as f:
                for index, line in enumerate(f.readlines()):
                    outputfiles[index % 2].write(line)

(untested)
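For what it is worth, here is a variant of the same idea written as a sketch that can be run on its own. The function name split_alternating, its parameters, and the sample data are made up for illustration. It uses the comma form of the with statement, which also works on Python versions older than 3.10.

```python
from pathlib import Path
import tempfile


def split_alternating(source, even_out, odd_out):
    """Write even-indexed lines of source to even_out, odd-indexed to odd_out."""
    # Each output is opened in 'w' mode once per call, so old content is
    # truncated exactly once and every line of this run is kept.
    with open(source) as src, open(even_out, "w") as s1, open(odd_out, "w") as s2:
        outputfiles = [s1, s2]
        for index, line in enumerate(src):
            outputfiles[index % 2].write(line)


with tempfile.TemporaryDirectory() as tmp:
    d = Path(tmp)
    (d / "data.dat").write_text("a\nb\nc\nd\ne\n")  # made-up sample data
    # Run twice to check that the output does not grow between runs.
    for _ in range(2):
        split_alternating(d / "data.dat", d / "sss.txt", d / "rrr.txt")
    even_lines = (d / "sss.txt").read_text()
    odd_lines = (d / "rrr.txt").read_text()

print(repr(even_lines))
print(repr(odd_lines))
```

Iterating over the file object directly, instead of calling readlines(), avoids loading the whole input into memory first.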

If you don’t want to use the context manager, you can be a bit more clever as well:

outputfiles = (open(os.path.join(d, 'sss.txt'), 'w'),
               open(os.path.join(d, 'rrr.txt'), 'w'))

TIL, thank you!

My understanding is that the benefit of using “with” is that it automatically handles closing the file when you leave the clause. Is that right?

In combination with open(), yes.

A context manager object has specific enter/exit functions (the __enter__ and __exit__ methods), which the with statement calls at the start of the clause and on leaving it.

The open() function returns a file object, which is also a context manager object. Its exit function closes the file.

In combination, the with open(....) as f: construction reliably closes the file when you leave the with clause, even if you leave it because of an exception.
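A small sketch of that protocol, in case it helps. The Announce class and the demo.txt file name are made up for illustration; only open() and the file object's closed attribute are real Python API here.

```python
import os
import tempfile


class Announce:
    """Hypothetical context manager that records when it is entered and left."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self  # this is the object bound by "as"

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append("exit")
        return False  # do not swallow exceptions


with Announce() as a:
    a.events.append("body")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")  # hypothetical file name

    with open(path, "w") as f:
        f.write("hello\n")
        closed_inside = f.closed
    closed_after = f.closed

    # The file is also closed when the block is left via an exception.
    try:
        with open(path) as g:
            raise ValueError("boom")
    except ValueError:
        closed_after_error = g.closed

print(a.events)
print(closed_inside, closed_after, closed_after_error)
```

The file reports closed as False inside the block and True once the block has been left, whether normally or through the exception.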

Cheers,
Cameron Simpson cs@cskk.id.au