I am working on a project where I get JSON results from an HTTP request and send them to ElasticSiem. To do this, I need the results separated by newlines.
Instead of editing the JSON text, parse it into Python (json.loads),
modify the resulting structure, and serialize it back (json.dumps).
Text replacement has too many opportunities to go wrong. Obvious
example: the JSON contains a “}” character in a string.
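To make that concrete, here is a minimal sketch. The payload and the "wrap it in a list" change are made up for illustration; they are not taken from the original script.

```python
import json

# Hypothetical payload: a string value that happens to contain "}".
raw = '{"message": "closing brace } inside text", "count": 1}'

# Naive text replacement also rewrites the brace inside the string value.
mangled = raw.replace("}", "}]")

# Parsing first lets you change the structure without touching string contents.
record = json.loads(raw)
wrapped = [record]            # example change: wrap the object in a list
safe = json.dumps(wrapped)    # serialize the modified structure back to text
```

Here `mangled` is no longer valid JSON, while `safe` parses back to a list holding the original object with its message unchanged.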
results_data=results_response.read()
results_json=json.loads(results_data.decode('utf-8'))
# Store the results in a JSON file named ClientName-AQ search name-time.json
to_replace = "\}"
with_replace = "\}]"
with open('logs/' + section_name + "-" + key + "-" +
          filedate.strftime("%d%m%Y") + ".json", 'a+') as outfile:
    json.dump(results_json, outfile)
    outfile.write("\n")
    content = outfile.read()
    outfile.seek(0)
    content.replace(to_replace,with_replace)
This code seems to:
- decode the response data from JSON to a python value in results_json
- append results_json to the file
- try to read back the entire file
- do a simple minded text replacement on the entire file
- (missing) write the file back
This has several problems.
- you don’t read from the start of the file, so you’ll likely read
  nothing then overwrite the whole file with nothing
- it will scale very badly as the file grows
- if the file is newline delimited JSON, every record must stay on a
  single line; json.dump() does that only while indent is left unset,
  and custom separators just make each line more compact
- the content.replace(to_replace,with_replace) call is too simple
  minded - there is a lot of scope for mangling the wrong data, and its
  return value is discarded, so nothing is actually changed
- a file opened in append mode cannot be rewritten with that file
  handle; all writes will append, regardless of what you do with seek()
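The read position and append behaviour can be checked with a small experiment. This is only a sketch: the scratch file and its contents are hypothetical, and the append-regardless-of-seek behaviour is what I would expect on common platforms rather than something the language guarantees everywhere.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.json")

with open(path, "w") as f:
    f.write("first\n")

with open(path, "a+") as f:
    f.write("second\n")
    after_write = f.read()   # position is at end of file, so nothing is read
    f.seek(0)
    from_start = f.read()    # after seek(0) the whole file is read
    f.seek(0)
    f.write("third\n")       # append mode: this still lands at the end

with open(path) as f:
    final = f.read()
```

On a typical system `after_write` is empty and "third" ends up after "second" in `final`, not at the top of the file.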
If you just want to write newline delimited JSON, use json.dump() with
the right separators and write a newline after each record. You don’t
need anything else. See the “Compact encoding” example in the json
module docs: https://docs.python.org/3/library/json.html#module-json
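A minimal sketch of that approach follows. The records and the file path are placeholders standing in for the decoded HTTP results and the real log file name.

```python
import json
import os
import tempfile

# Hypothetical records standing in for the decoded HTTP results.
results = [
    {"id": 1, "msg": "line one\nline two"},
    {"id": 2, "msg": "ok }"},
]

# Hypothetical output path; the real script builds its own file name.
path = os.path.join(tempfile.mkdtemp(), "results.json")

with open(path, "a", encoding="utf-8") as outfile:
    for record in results:
        # No indent is passed, so each record stays on one line;
        # the compact separators just drop the optional spaces.
        json.dump(record, outfile, separators=(",", ":"))
        outfile.write("\n")

# Reading back: one JSON document per line.
with open(path, encoding="utf-8") as infile:
    lines = infile.read().splitlines()
loaded = [json.loads(line) for line in lines]
```

Newlines inside string values are escaped by the encoder, so they do not break the one-record-per-line layout.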