How do I use Python to read multiple JSON files and export specific values to a CSV?

purbanics · September 7, 2022, 8:04am

I am not a programmer and have no experience with Pyton, so I would really appreciate any help to solve the few remaining issues I’ll explain bellow:

What I am trying to do is collect a few but same values from many .json files and get them into a spreadsheet to work with later.
So far I can get the values I need from all JSON files in a folder, but I would also need the JSON filename and the resulting CSV is not written, I get an error.
my python file:

import json
import glob
from datetime import datetime
import csv

src = "data/"

date = datetime.now()
data = []

files = glob.glob('data/*', recursive=True)

for single_file in files:
  with open(single_file, 'r') as f:

    try:
      json_file = json.load(f)
      data.append([
        json_file['score'],
        json_file['otherValue']
      ])
    except KeyError:
      print(f'Skipping {single_file}')

data.sort()

data.insert(0, ['fileName','score','otherValue'])

print(data)
#printing works

# Export to CSV.
csv_filename = f'{str(date)}.csv'
with open(csv_filename, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(data)

print("Updated CSV")

Thank you in advance for any help!
I think I am not far from this result just need some guidance by somebody who knows what they are doing.

rob42 · September 7, 2022, 8:17am

Possibly, it’s the way that you’ve named things. For example, I have a CSV write function that includes this:

with open(csv_file, 'a', encoding="utf8") as f:
        f.write(date)
        output = writer(f)
        output.writerow(data)

… which is appending data to the file, which already has the header names in place.

n.b: notice that the first line writes the ‘date’ and the third line writes the ‘data’; it’s easy to misread that.

edit to add:

Also I’ve used from csv import DictReader, writer, which means that I don’t need to have output = csv.writer(f), so there’s also that.

I’m convinced that you can’t use writer = as that will mean that you can’t use the writer() function from the csv import.

{Corrected for clarity}

vbrozik · September 7, 2022, 9:08am

You are iterating the file names using a loop so you need to add the file name to the every item of your data list inside the loop (inside data.append()).

purbanics · September 8, 2022, 8:26am

Thanks @vbrozik , Yes, I figured out that that is where the filename should go, but how do I get the name of the file?

vbrozik · September 8, 2022, 8:41am

The glob.glob() function returns list of file names. Your for loop iterates this list so in every iteration single_file contains the file name which is currently being processed.

purbanics · September 8, 2022, 9:59am

Thanks for your help @rob42

I tried those changes, so have now:

csv_file = f'{str(date)}.csv'
with open(csv_file, 'a', encoding="utf8") as f:
        f.write(date)
        output = writer(f)
        output.writerow(data)

But it still gives an error for that line:

Traceback (most recent call last):
File “C:\JsonLevels\Levels_json_To_csv_Exporter.py”, line 57, in
with open(csv_file, ‘a’, encoding=“utf8”) as f:
OSError: [Errno 22] Invalid argument: ‘2022-09-08 12:55:11.836601.csv’

Do you have any idea what could cause that?

purbanics · September 8, 2022, 10:00am

Thank you. Now I have the file name also. Great progress!

vbrozik · September 8, 2022, 10:04am

I am glad that you were able to make the progress!

Peter Urbanics:

Traceback (most recent call last):
File “C:\JsonLevels\Levels_json_To_csv_Exporter.py”, line 57, in
with open(csv_file, ‘a’, encoding=“utf8”) as f:
OSError: [Errno 22] Invalid argument: ‘2022-09-08 12:55:11.836601.csv’

It looks like you are running the program on Windows This system has serious limitations regarding allowed characters in file names. I think here the problem is the colon : character.

rob42 · September 8, 2022, 10:07am

No worries.

It looks to me that your date variable needs to be reformatted before you can use it as a file name, which is not impossible.

What would you like, as the file name?

edit: It looks like @vbrozik has this in hand, so I’ll side line.

purbanics · September 8, 2022, 10:08am

Well yes of course. Simple Windows peasant here.

How can I name the file anything simple enough for Windows to accept?
(It does not need to be unique like the date.)

purbanics · September 8, 2022, 10:09am

Can be anything. It does not need to be unique like the date would be.

purbanics · September 8, 2022, 10:12am

What my output is now printed by this command:

print(data)

into the console looks like:

[[‘fileName’, ‘scoreMax’], [‘data\Level2000-5360.json’, 835860], [‘data\Level2000-5361.json’, 1296420], [‘data\Level2000-5362.json’, 837220], [‘data\Level2000-5363.json’, 781170], [‘data\Level2000-5364.json’, 947670]]

I can easily, with a bit of text replacing in a text editor get that to be like
fileName, scoreMax, 5360, 835860, 5361, 1296420, 5362, 837220, 5363, 781170, 5364, 947670

This is the data I would need, so good so far.
The only thing missing is a line break after every file’s data, so when importing it into spreadsheet it becomes a table.

How could I add a line break after every file’s data added?

purbanics · September 8, 2022, 10:20am

Oh, this works.
It makes a file and puts the data into it:

csv_file = f'CSVFileName.csv'
with open(csv_file, 'a', encoding="utf8") as f:
        # f.write(date)
        output = writer(f)
        output.writerow(data)

Thank you guys for all the help!

So the only thing missing is a line break after every file’s data.
How do I add that?

rob42 · September 8, 2022, 10:29am

Off of the top of my head: output.writerow(data+'\n')?

purbanics · September 8, 2022, 10:46am

The new line needs to be in this iteration, right!?
So there is a new line after adding the data from each file.

for single_file in files:
  with open(single_file, 'r') as f:
    try:
      json_file = json.load(f)
      data.append([
        single_file,
        json_file['scoreMax']
        ])
    except KeyError:
      print(f'Skipping {single_file}')

rob42 · September 8, 2022, 10:48am

This is where the road forks, so to speak – I’m doing CSV to CSV, whereas you’re doing JSON to CSV.

I’m sure someone will know.

Glad I could help, in some small way.

Have a good day.

purbanics · September 8, 2022, 11:00am

Thanks for your help @rob42.
Have a nice day!

vbrozik · September 8, 2022, 12:01pm

If you want to use date+time anyway - I for example prefer to follow ISO 8601 when possible. You can do it by omitting the separators. Using your stored date:

date.strftime('%Y%m%dT%H%M%S')

Peter Urbanics:

csv_file = f'CSVFileName.csv'
with open(csv_file, 'a', encoding="utf8") as f:
        # f.write(date)
        output = writer(f)
        output.writerow(data)

So the only thing missing is a line break after every file’s data.

What is writer? Is not it csv.writer? Import just csv not from csv import writer. Your program will be easier to understand,

I have never used csv.writer but I suppose the writerow method should be used to write a single row while you feed it with all the data. I think you need to iterate the items (rows) of your data using for and call writerow for every row individually.

Look into the documentation to know how it should be used.

purbanics · September 16, 2022, 5:28am

Thank you all for helping!
I managed to do the thing with your help and sharing here for anybody who a similar solution:

import json
import glob
from datetime import datetime
import csv
from csv import DictReader, writer

date = datetime.now()
data = []

files = glob.glob('dataFolder/*', recursive=True)

# Loop through files
for single_file in files:
  with open(single_file, 'r') as f:

    try:
      json_file = json.load(f)
      data.append([
        single_file,
        json_file['scores'],
        json_file['otherData']
        ])
    except KeyError:
      print(f'Skipping {single_file}')

# Sort the data
# data.sort()

# Add headers
data.insert(0, ['fileName','scores','otherData'])

print(data)
#printing works

csv_file = f'CSVFileName.csv'
with open(csv_file, 'a', encoding="utf8") as f:
        # f.write(date)
        output = writer(f)
        output.writerow(data)

Topic		Replies	Views
How to transform a JSON file into a CSV one in Python? Python Help	14	194	April 9, 2024
Traverse through multiple folder/subfolders and extract Json data and store into CSV using python Python Help	14	5861	October 23, 2022
Convert json to csv using python Python Help	0	284	July 10, 2021
How to parse different part of json file (Newbie here) Python Help	13	2364	October 5, 2021
CSV to JSON - Not quite Python Help	8	1944	July 25, 2022

How do I use Python to read multiple JSON files and export specific values to a CSV?

Related Topics