Help me with looping

this is my current code:

```python
import pandas as pd
import json

# Read two CSV files
file1 = pd.read_csv('tblsample1.csv')
file2 = pd.read_csv('tblsample.csv')

# Merge based on a common column 'slotWinnerSettingDetailId'
merged_df = pd.merge(file1, file2, on='slotWinnerSettingDetailId', how='inner')

print("Merged DataFrame:")
print(merged_df)

def convert_to_json(df, json_file):
    # Prepare data list to store JSON entries
    data = []

    for _, row in df.iterrows():
        entry = {
            "operatorid": row['operatorid'],
            "productcode": row['productcode'],
            "gamecode": row['gamecode'],
            "type": row['type'],
            "membercode": row['membercode'],
            "platform": [platform.strip() for platform in row['platform'].split(',')],
            "winDate": row['winDate'],
            "amount": [
                {"cur": row['currencycode'].strip(), "value": int(row['amount'])}
            ]
        }
        data.append(entry)

    with open(json_file, mode='w', encoding='utf-8') as jsonf:
        jsonf.write(json.dumps(data, indent=4))

csvFilePath = 'merged_data.csv'
jsonFilePath = 'merged_data.json'

# Convert the merged DataFrame to JSON
convert_to_json(merged_df, jsonFilePath)

print(f"JSON data written to '{jsonFilePath}'.")
```

this is the output:

```json
{ "operatorid": 1, "productcode": "sample", "gamecode": "sample", "type": 1, "membercode": "samplemember0000", "platform": [ "web", "mobile" ], "winDate": "2024", "amount": [ { "cur": "UST", "value": 9724 } ] },
{ "operatorid": 1, "productcode": "sample", "gamecode": "sample", "type": 1, "membercode": "sample11111", "platform": [ "web", "mobile" ], "winDate": "2024", "amount": [ { "cur": "USD", "value": 9724 } ] },
```

What I want is for the JSON file to look like this format, but instead of that, the loop also repeats the membercode on every row.

Please read the pinned thread and edit the post so that the code can be read properly. Indentation is crucial in Python; without it, it's hard to be sure we understand what you're actually trying, and therefore where the problem might be.

This piece of code:

```python
"amount": [
    {"cur": row['currencycode'].strip(), "value": int(row['amount'])}
]
```

inherently puts exactly one amount in the amount list, and you're accruing a list of these single-amount entries in the data list.

Your example output has several amounts. That implies that there should
maybe be an entry per membercode, and that you should be appending
amounts to its "amount" list on every row.

That suggests that maybe you should prepare a dict keyed on the
membercode (if that is what you’re supposed to collate) with an
entry per member. It would start empty before the loop. Then inside the
loop you’d make an entry if there isn’t one already, and then append the
amount record to the "amount" list in the entry.

After the loop you'd write out your JSON file.
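The approach described above might be sketched like this (the DataFrame and column names here are made up for illustration; substitute your own merged data):

```python
import json
import pandas as pd

# Stand-in for merged_df, with two rows sharing one membercode
df = pd.DataFrame({
    "membercode": ["m1", "m1", "m2"],
    "currencycode": ["USD", "EUR", "USD"],
    "amount": [100, 200, 300],
})

entries = {}  # dict keyed on membercode; starts empty before the loop
for _, row in df.iterrows():
    code = row["membercode"]
    if code not in entries:
        # make the entry only if there isn't one already
        entries[code] = {"membercode": code, "amount": []}
    # append this row's amount record to the entry's "amount" list
    entries[code]["amount"].append(
        {"cur": row["currencycode"].strip(), "value": int(row["amount"])}
    )

# after the loop, write out the collated entries
data = list(entries.values())
print(json.dumps(data, indent=4))
```

Each membercode now appears once, with all of its amounts collected in a single `"amount"` list.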


Creating records from the rows by iterating over Pandas rows isn't necessary, and it's also much less efficient than, say, column-based or frame-based methods. You can dig into the internals a bit if you read this, just to give one example.
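For instance, here is a quick illustration (with a made-up two-row DataFrame) of getting per-row record dicts without writing the loop yourself:

```python
import pandas as pd

df = pd.DataFrame({"operatorid": [1, 1], "membercode": ["a", "b"]})

# to_dict with orient='records' builds the list of row dicts in one call
records = df.to_dict(orient="records")
print(records)  # [{'operatorid': 1, 'membercode': 'a'}, {'operatorid': 1, 'membercode': 'b'}]
```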

An efficient way of writing a dataframe to (properly indented) JSON is to use the dataframe's built-in to_json method with the orient='records' option. There's no anchor to the docs example for the orient='records' option, so you'll see the example if you scroll down a little.

```python
...
from pathlib import Path

with Path('mydata.json').open(mode='w') as file:
    merged_df.to_json(file, orient='records', indent=4)
    file.flush()
```

Your output should be a properly indented JSON array (of records), something like:

```json
[
    {
        "operatorid": 1,
        "productcode": "sample",
        "gamecode": "sample",
        "type": 1,
        "membercode": "samplemember0000",
        "platform": [
            "web",
            "mobile"
        ],
        "winDate": "2024",
        "amount": [
            {
                "cur": "UST",
                "value": 9724
            }
        ]
    },
    {
        "operatorid": 1,
        "productcode": "sample",
        "gamecode": "sample",
        "type": 1,
        "membercode": "sample11111",
        "platform": [
            "web",
            "mobile"
        ],
        "winDate": "2024",
        "amount": [
            {
                "cur": "USD",
                "value": 9724
            }
        ]
    }
]
```

If you're dealing with very large dataframes, to_json also provides a useful compression option.
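As a small sketch (the file path here is illustrative), the compression parameter accepts values like 'gzip', and read_json can decompress the same way:

```python
import pandas as pd

df = pd.DataFrame({"membercode": ["a", "b"], "amount": [1, 2]})

# Write a gzip-compressed JSON records file
df.to_json("mydata.json.gz", orient="records", compression="gzip")

# Read it back to confirm the round trip
restored = pd.read_json("mydata.json.gz", orient="records", compression="gzip")
print(restored)
```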

P.S. to_json(orient='records', ...) lets Pandas do the work of creating JSON records from the rows, so there's no need to iterate over rows.