Reading only lines that are needed and not the ones that are not needed

I have an xml file that is 1000 lines long, I need to replace all Mathew with Kasper in that xml file. I am able to do that successfully if I read all lines of the xml file(test1.xml) and write the contents to another file(test2.xml) but If I dont want to read all the lines rather read only those lines in which there is Mathew and and then only update that content in the file is that possible? This is my latest code that works but it reads everyline.

import re

def main():
    #Location to the xml file
    XMLFile = "test1.xml"
    COPYFILE = "test2.xml"
    
    content = open(XMLFile, "r")

    with open(COPYFILE, 'w') as write_file:
        # Read all lines in the file one by one
        for line in content:
            # For each line, check if line contains the word
            if 'and' in line:
                print(line)
                line = line.replace("Mathew", "Kasper")
            write_file.write(line)
            print(line)
            
if __name__ == "__main__":
    main()

I can add an if statement: If 'and ’ in line: but then also I would have to write every line of the file content to the new file, isnt it possible to use the same file test1.xml and only update that file by only updating the needed changes and not touching anything else and let the remaining content stay as is.

So far as I know, that’s not possible, because of the sequential nature of the file read/write. To do do what you’re suggesting would require a random access read/write; like I say, so far as I know.

Refer the following links to understand file handling:

So, you don’t want to read all of the lines, but only those lines that contain “Mathew”?

How do you know whether a line contains “Mathew”?

You’re going to have to read it!

You’re really better off just doing what you’re already doing.

But lets say you have to do this for a file with 1 million lines, then it will be very very time consuming right?

What you are doing will mostly work for the example you show.
But in the general case you should parse the input XML file into a DOM and process the text in the DOM. Then save the DOM as an XML file.

What if the string you replace is a TAG name? In that case you will corrupt the XML.

1 Like