Appending JSON to same file

Hi,

I need each API request to append a new dict to a JSON file. At the moment the data is overwritten on every run, but I need it appended to the existing file. How can I achieve this?
Code: (link to Pastebin)

Appending to a file means seeking to the end of the file and writing more data to it from that location. This is not possible with JSON files, since the start and end of the data are marked by { and }. Appending additional data after the final } will result in a malformed JSON file.

What it sounds like you want to do is to create a delta of the original file and the new string representation, seek to the first location where they differ, and overwrite the original file from that location. Why do you think that you need to do that? It will be slower than just rewriting the whole file.

I need to do this in order to keep a history of records. I am using an API, and my Python code runs every day, so each run's requests generate new data. The main problem is that the number of requests is limited to 9802; if I make more than 9800, the code crashes. So I thought I would make the 9800 requests once, and then append each day's data to the existing file. I need to append data into the existing JSON, not just at the end of the file. I used this code:
(link to Pastebin), but all the data ended up in one row, which is wrong, because 4 columns were expected.

You can update a dictionary like this:

import json

with open("mydata.json", "r") as jsonfile:
    data = json.load(jsonfile)

new_data = ...  # Download new data from your API.
data.update(new_data)
# Alternatively, if the outermost data structure in the JSON file is a list:
# data.append(new_data)

with open("mydata.json", "w") as jsonfile:
    json.dump(data, jsonfile)

This will still rewrite the whole file every time it runs, but old data will be preserved. Does that meet your needs?
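For completeness, here is a sketch of the list variant that also handles the very first run, when the file does not yet exist. The filename "mydata.json" is from the snippet above; the record contents are made-up stand-ins for whatever the API returns:

```python
import json
import os

path = "mydata.json"

# Load the existing history, or start fresh on the first run.
if os.path.exists(path):
    with open(path, "r") as jsonfile:
        data = json.load(jsonfile)
else:
    data = []

# Stand-in for one record downloaded from the API.
new_record = {"headline": "example", "published": "2021-10-20 06:30:00"}
data.append(new_record)

# Rewrite the whole file with the old records plus the new one.
with open(path, "w") as jsonfile:
    json.dump(data, jsonfile, indent=4)
```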

To clarify: I save the JSON to a file once. Then each day the new data from the requests should be appended to that existing JSON.

Could you help me fix the code so that I get the normal structure of the file, having 4 columns as expected?

If you post the code you are having problems with someone may be able to help you. Only post enough code to demonstrate your problem. Do not just post a link to all of your code on some external site.

JSON doesn’t have columns, so I have no idea what data your picture is showing.

Appending to a file means seeking to the end of the file and writing
more data to it from that location.

You can just open(filename, "a") to do this (append mode).

This is not possible with JSON files since the start and end of data is
marked by {}.

Technically, a JSON representation of a dict is like that. But this is
also legal JSON: [1,2,3], and so is this: "foo".

Appending additional data after the final } will result in a
malformed JSON file.

A popular format for the OP’s problem is newline delimited JSON, aka
NDJSON. This is a text file with each line containing a single-line JSON
record. Eg:

 { "a": 1, "b": 2 }
 { "a": 3, "b": 4 }
 { "a": 6, "b": 7 }

which would suit the OP well, from the sound of it. In which case a
simple:

 with open("foo.ndjson", "a") as jf:
     print(json.dumps(object_goes_here, separators=(',',':')), file=jf)

will do exactly what they want.
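To read such a file back, parse each line separately. A minimal round trip, assuming only the stdlib json module (the filename "foo.ndjson" is from the snippet above):

```python
import json

# Append a couple of records, one JSON document per line.
with open("foo.ndjson", "a") as jf:
    for obj in ({"a": 1, "b": 2}, {"a": 3, "b": 4}):
        print(json.dumps(obj, separators=(",", ":")), file=jf)

# Read the history back: one json.loads() per non-blank line.
records = []
with open("foo.ndjson") as jf:
    for line in jf:
        if line.strip():
            records.append(json.loads(line))
```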

Cheers,
Cameron Simpson cs@cskk.id.au


Yes, and what that does under the hood is to seek to the end of the file and start writing from there, no?

Fair.

Huh. I did not know that was legal JSON. TIL, thanks.

It’s not. It’s a different format, which uses newlines specifically - and ONLY - to delimit otherwise-valid JSON documents. A JSON parser will not accept NDJSON. But as a different format, it has different use-cases.
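A quick demonstration of the difference, using nothing beyond the stdlib json module:

```python
import json

ndjson_text = '{"a": 1}\n{"a": 2}\n'

# A JSON parser rejects the file as a whole: after the first
# document it finds "extra data".
try:
    json.loads(ndjson_text)
    whole_file_ok = True
except json.JSONDecodeError:
    whole_file_ok = False

# But each line on its own is a perfectly valid JSON document.
records = [json.loads(line) for line in ndjson_text.splitlines()]
```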


So I still don’t understand how to fix my code so that I can make the appending work?

with open("C:/Users/apskaita3/Finansų analizės ir valdymo sprendimai, UAB/Rokas Toomsalu - Power BI analitika/Integracijos/1_Public comapnies analytics/Databasesets/Others/market_news_helsinki.json", "w") as outfile:    
    json_object = json.dumps({"item": list(results["item"].values())}, indent = 4)
    outfile.write(json_object)

So when I changed w to a, a new item was added but the structure was malformed. So how do I fix my code then?

I tried yours but it didn’t help: it added the data at the end of the file and the file became malformed.

No. On UNIX/POSIX it opens the file with O_APPEND, and the OS
guarantees that all writes go to the end of the file. No userland
seeks involved at all. In fact, I’m pretty sure userland seeks are
futile in this mode.

Cheers,
Cameron Simpson cs@cskk.id.au


Actually still I am stuck.

Without seeing your code or the contents of the file or the error
messages we can’t tell what may or may not be wrong.

Also, how are you testing that it is malformed?

Remember that an NDJSON file is not a file-of-text containing a single
JSON value. It is a file-of-text with many records, each record a single
line of text, each line containing a distinct JSON value.

The format you use for the file affects how you must read the file.

Your original post suggested that you may want to append a new record to
the end of a file containing JSON data. NDJSON is one approach to doing
that, but it inherently makes the file have many separate records, each
a piece of JSON on its own.

The approaches where the file is a single JSON record tend to involve
overwriting the file, because the JSON record spans the entire content
of the file. If you had strong control over the layout of the JSON you
could possibly overwrite just the end of the record with additional
information, but it would be quite fiddly and possibly error-prone.
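For illustration only, here is roughly what that fiddly overwrite looks like when the file holds a single JSON list with no trailing whitespace after the closing bracket. The filename and records are made up, and this is exactly the kind of fragile trick warned about above:

```python
import json
import os

path = "history.json"

# First run: create the file holding a JSON list.
if not os.path.exists(path):
    with open(path, "w") as f:
        json.dump([{"a": 1}], f)

# Later runs: overwrite the final "]" in place with
# ", <new record>]", extending the list without a full rewrite.
# Binary mode, because seeking to arbitrary offsets is only
# well-defined for binary files.
with open(path, "rb+") as f:
    f.seek(-1, os.SEEK_END)  # byte position of the final b"]"
    f.write(b", " + json.dumps({"a": 2}).encode("ascii") + b"]")

with open(path) as f:
    data = json.load(f)
```

This breaks the moment anything (an editor, a pretty-printer, a trailing newline) disturbs the last byte of the file, which is why rewriting the whole file, or using NDJSON, is usually the better choice.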

Cheers,
Cameron Simpson cs@cskk.id.au

Code is here:

import requests
import json
import time
import csv
import pandas
 
start = 250

with open('C:/Users/apskaita3/Desktop/number2.txt', "r") as f:
    start = f.readlines()

start = int(start[0])
start = start + 70
results = {"item": {}}

# Todo load json
for i in range(0,150): #<----- Just change range here to increase number of requests
    URL = f"https://api.news.eu.nasdaq.com/news/query.action?type=handleResponse&showAttachments=true&showCnsSpecific=true&showCompany=true&countResults=false&freeText=&company=&market=Main%20Market%2C+Helsinki&cnscategory=&fromDate=&toDate=&globalGroup=exchangeNotice&globalName=NordicMainMarkets&displayLanguage=en&language=en&timeZone=CET&dateMask=yyyy-MM-dd+HH%3Amm%3Ass&limit=50000000&start={i}&dir=ASC"
    r = requests.get(url = URL)
    #time.sleep(1)
    res = r.text.replace("handleResponse(", "")
    #print(res)
    #print(f'r is {r}')
    res_json = json.loads(res)
    #print(res_json)
    data = res_json
    a=i+1
    #print(data)
    print("Doing: " + str(i + 1) + "th")
    #data = r.json()
    
    downloaded_entries = data["results"]["item"]
    new_entries = [d for d in downloaded_entries if d["headline"] not in results["item"]]
    start=str(start)
    
    for entry in new_entries:
        if entry["market"] == 'Main Market, Helsinki' and entry["published"]>="2021-10-20 06:30:00":
            headline = entry["headline"].strip()
            published = entry["published"]
            market=entry["market"]
            market="Main Market, Helsinki"
            results["item"][headline] = {"company": entry["company"], "messageUrl": entry["messageUrl"], "published": entry["published"], "headline": headline}
            print(entry['market'])
            #time.sleep(5)
            print(f"Market: {market}\nDate: {published}\n")
            #print( results["item"][headline] )
            #print(results)
            #print(json.dumps({"item": list(results["item"].values())}, indent = 4))
            
with open("C:/Users/apskaita3/Finansų analizės ir valdymo sprendimai, UAB/Rokas Toomsalu - Power BI analitika/Integracijos/1_Public comapnies analytics/Databasesets/Others/market_news_helsinki0000.json", "a") as outfile:    
    json_object = json.dumps({"item": list(results["item"].values())}, indent = 4,separators=(',',':'))
    outfile.write(json_object)
    #print(json_object)
with open("C:/Users/apskaita3/Desktop/number2.txt", "w") as outfile1:    
    outfile1.write(start)  # type: ignore