Completely agree with @MRAB’s responses.
Firstly ask: what is the “key” which links the entries in the two files? (not path!)
Secondly (ref the earlier question about file-size), read the smaller file and (start to) populate the merged_content dictionary, line-by-line. Given that the data appears to have a key:value format, remember that it is possible to “nest” dictionaries. Thus, each line/row/record in merged_content could consist of a dict, eg
{ "item1": { "path": "etc/1log.text", "desc": "description", "value": 5.00, more... },
  "item2": { ... },
  etc,
}
Yes, in some cases there may not be a label, eg the description, but forming such a structure is likely to help in designing this collection-phase and whatever comes next!
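In Python, that nested structure might be built up along these lines. This is only a sketch: the field names `path`, `desc` and `value` are taken from the example above, whereas `add_field` is a made-up helper name, not something from the original code.

```python
# Outer dict: one entry per key (item).
# Inner dict: the fields collected so far for that item.
merged_content = {}


def add_field(merged, key, field, value):
    """Store one field under `key`, creating the inner dict on first sight.

    `add_field` is a hypothetical helper, named here for illustration only.
    """
    merged.setdefault(key, {})[field] = value


add_field(merged_content, "item1", "path", "etc/1log.text")
add_field(merged_content, "item1", "desc", "description")
add_field(merged_content, "item1", "value", 5.00)

print(merged_content["item1"]["path"])
```

Because every field has a label, later steps can ask "does this item have a value yet?" rather than counting positions in a list.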
Thirdly, having built a basic data-structure, read through the longer file. Follow the rules outlined above and ignore the record if the input doesn’t match. If it does match, add the data to the appropriate entry (nested-dict) in merged_content.
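A rough sketch of the second and third steps follows. It assumes, purely for illustration since the real layout may differ, that every line in either file looks like `key,data`, that the smaller file carries the paths and the longer file the values. The function names and the sample lines are invented.

```python
def load_smaller(lines):
    """Step two: create one inner dict per key found in the smaller file."""
    merged = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumed layout "key,data"; a line without a comma raises ValueError.
        key, path = line.split(",", 1)
        merged[key.strip()] = {"path": path.strip()}
    return merged


def add_longer(merged, lines):
    """Step three: add data from the longer file, ignoring unmatched keys."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        key, value = line.split(",", 1)
        key = key.strip()
        if key in merged:  # no matching entry -> ignore the record
            merged[key]["value"] = value.strip()
    return merged


# Lists of strings stand in for open file objects (made-up sample data).
file_a = ["item1,etc/1log.text\n", "item2,etc/2log.text\n"]
file_b = ["item1,5.00\n", "item3,7.50\n"]

merged_content = add_longer(load_smaller(file_a), file_b)
print(merged_content)
```

Here item3 is ignored because it never appeared in the smaller file, and item2 is left without a value, which the final step can detect.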
Lastly, create file c.txt by iterating through merged_content, but check that elements from both file-a and file-b appear (apparently defined as: the inner-dict has keys for both path and value) and discard/ignore irrelevant entries.
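That last step could look something like the sketch below. The output layout (`key,path,value`) and the name `write_merged` are assumptions made for the example, and an in-memory buffer stands in for the real c.txt so the snippet runs anywhere.

```python
import io


def write_merged(merged, out):
    """Write only entries holding data from both files; return how many."""
    written = 0
    for key, record in merged.items():
        # "Complete" is taken to mean: both path and value are present.
        if "path" in record and "value" in record:
            out.write(f"{key},{record['path']},{record['value']}\n")
            written += 1
    return written


# Made-up sample data in the nested-dict shape described earlier.
merged_content = {
    "item1": {"path": "etc/1log.text", "value": "5.00"},
    "item2": {"path": "etc/2log.text"},  # no value -> skipped
}

buffer = io.StringIO()  # stands in for open("c.txt", "w")
count = write_merged(merged_content, buffer)
print(buffer.getvalue(), end="")
```

With a real file, the same function would be called inside a `with open("c.txt", "w") as out:` block.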
By splitting-up the task into functional-elements (yes, they could be coded as individual functions), each step can be tested in-isolation. Thus, it becomes obvious where any fault lies. (not that we make mistakes!)
Once the system is working (correctly), yes, it will become apparent that some steps could be combined. However, KISS-principle applies, or in IT-philosophy “make it work, before you make it better” and “premature optimisation is the root of all evil”…
NB if the files were both sorted by key (itemNR) then the job could be done in a single “pass”. That was the way we performed a lot of data-processing back in the (?)good old mainframe days - hence that question. So, when you’ve finished tinkering with the code, consider the question of data-formatting (if the data were otherwise organised, ie “designed”, would it make the coding easier?): might it be quicker to sort both files first, and then merge them?
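For the curious, a single-pass merge of two sorted inputs might be sketched as follows. It assumes each input yields `(key, data)` pairs, that keys are unique within each input, and that both inputs were sorted with the same ordering Python uses for comparison (plain string order here, so "item10" sorts before "item2"). The name `merge_sorted` and the sample records are invented.

```python
def merge_sorted(a_records, b_records):
    """Yield (key, a_data, b_data) for keys present in both sorted inputs."""
    a_iter, b_iter = iter(a_records), iter(b_records)
    a = next(a_iter, None)
    b = next(b_iter, None)
    while a is not None and b is not None:
        if a[0] < b[0]:
            a = next(a_iter, None)  # key only in a: advance a
        elif a[0] > b[0]:
            b = next(b_iter, None)  # key only in b: advance b
        else:
            yield a[0], a[1], b[1]  # keys match: emit, then advance both
            a = next(a_iter, None)
            b = next(b_iter, None)


# Made-up sample records, already sorted by key.
file_a = [("item1", "etc/1log.text"), ("item2", "etc/2log.text"), ("item4", "etc/4log.text")]
file_b = [("item1", "5.00"), ("item3", "7.50"), ("item4", "2.25")]

result = list(merge_sorted(file_a, file_b))
print(result)
```

Neither file is held in memory as a whole, which is why this approach suited very large files; the price is the sort beforehand.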
Thus, some up-front thinking and design may save coding-time/complexity! OTOH once the coding-job has been done…