Please have a look at the .py and .json files at the link below:
https://filetransfer.io/data-package/2FfG8v0N
How can I modify the for loop in the .py file so that it scrapes the JSON file completely?
Thanks to all.
Please put your Python program here between triple backticks like this:
```
# Your program will be here.
```
Otherwise it is hard to talk about it, and it also lessens the benefit for other readers of this discussion.
Put the complete traceback with the error message here too (also between triple backticks).
If you are able to make the JSON reasonably short while keeping it valid and still making your program fail the same way, put the JSON here too.
Sure, @vbrozik.
Thanks for the response.
Here is the faulty loop code I want to fix.
import json

with open("18july.json", encoding="UTF-8") as f:
    jsondata = json.load(f)

country = jsondata["Value"][0]["CN"]
league = jsondata["Value"][0]["L"]
date = jsondata["Value"][0]["S"]
hometeam = jsondata["Value"][0]["O1"]
awayteam = jsondata["Value"][0]["O2"]
odds1 = jsondata["Value"][0]["E"][0]["C"]
oddsX = jsondata["Value"][0]["E"][1]["C"]
odds2 = jsondata["Value"][0]["E"][2]["C"]
over = jsondata["Value"][0]["E"][8]["C"]
gline = jsondata["Value"][0]["E"][8]["P"]
under = jsondata["Value"][0]["E"][9]["C"]

for game in jsondata["Value"]:
    country = game["CN"]
    league = game["L"]
    date = game["S"]
    hometeam = game["O1"]
    awayteam = game["O2"]
    odds1 = game["E"][0]["C"]
    oddsX = game["E"][1]["C"]
    odds2 = game["E"][2]["C"]
    over = game["E"][8]["C"]
    gline = game["E"][8]["P"]
    under = game["E"][9]["C"]
    print(country, league, date, hometeam, awayteam, odds1, oddsX, odds2, over, gline, under)
Thank you for showing the code here. The subject of this topic suggests that the program ends with an error. Could you please show the complete error message? Include all the lines of the error message - it is called traceback.
Also, how many iterations does the for loop perform before it fails? I.e. how many lines does the print() write? I think it would be easiest if you copy the complete text from your console, including the program invocation, the output (cut out its middle lines if it is too long) and the error message.
Brazil Brazil. Campeonato Brasileiro Serie A 1658091600 Atletico Clube Goianiense Fortaleza EC 2.13 3.22 3.76 2.21 2.5 1.74
Brazil Brazil. Campeonato Brasileiro Serie A 1658091600 Botafogo de Futebol e Regatas Clube Atletico Mineiro 4.58 3.48 1.8 2.35 2.5 1.66
Brazil Brazil. Campeonato Brasileiro Serie A 1658095200 America Minas Gerais Clube Atletico Bragantino 2.74 3.14 2.73 2.23 2.5 1.72
United States USA. MLS 1658091600 New York Red Bulls New York City 2.28 3.6 3.025 1.82 2.5 2.09
United States USA. MLS 1658100600 Columbus Crew Cincinnati 1.81 3.92 4.18 1.75 2.5 2.19
United States USA. MLS 1658104200 Nashville Los Angeles 2.52 3.5 2.72 1.91 2.5 1.99
United States USA. MLS 1658107800 Real Salt Lake Sporting Kansas City 1.87 3.68 4.14 2.02 2.5 1.88
United States USA. MLS 1658107800 San Jose Earthquakes Houston Dynamo 1.97 3.98 3.48 2.375 3.5 1.65
United States USA. MLS 1658111400 Portland Timbers Vancouver Whitecaps 1.7 4.2 4.56 1.58 2.5 2.32
Argentina Argentina. Primera Division 1658091600 Club Atletico Tigre Estudiantes de La Plata 2.28 3.26 3.28 2.26 2.5 1.71
Argentina Argentina. Primera Division 1658100600 Velez Sarsfield River Plate Buenos Aires 3.188 3.11 2.38 2.44 2.5 1.62
Mexico Mexico. Liga MX 1658095200 Atletico San Luis Monterrey 3.26 3.46 2.19 2.02 2.5 1.88
Mexico Mexico. Liga MX 1658102400 Tigres de la UANL Tijuana 1.432 4.5 6.65 1.82 2.5 2.08
Traceback (most recent call last):
  File "c:\Users\monst\Downloads\python_files\tryjsonutf8.py", line 30, in <module>
    under = game["E"][9]["C"]
IndexError: list index out of range
It printed 13 lines before it failed.
In the 14th iteration, there is no game["E"][8] and game["E"][9], so it stopped working.
I removed three objects, "over, gline, under", from the print.
It worked, but then it stopped working for another object, awayteam = game["O2"], in another iteration.
So for all 11 objects I want the code to continue working even when an object does not exist.
It failed at under = game["E"][9]["C"], which comes after gline = game["E"][8]["P"], so the list game["E"] of the 14th record must contain the index 8. Note that Python indexes lists from 0, so game["E"][9] refers to the 10th item of the list.
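The indexing rule can be tried out in isolation. This is only a small sketch: the list `entries` is a made-up stand-in for game["E"] with exactly nine items, not data from the real file.

```python
# Hypothetical stand-in for game["E"]: a list with exactly nine items.
entries = [{"C": 1.0 + i / 10} for i in range(9)]

print(len(entries))        # number of items in the list
print(entries[8]["C"])     # index 8 is the 9th (last) item, so this works

try:
    entries[9]["C"]        # index 9 would be the 10th item, which does not exist
except IndexError as exc:
    error_message = str(exc)
    print("IndexError:", error_message)
```

With nine items the valid indexes are 0 to 8, which matches a loop that gets past [8] and then fails at [9].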
It looks like you understand what is going on, so now you need to:
18july.json file is OK or faulty).
For 3. you have many options - for example:
The file is OK, @vbrozik, so I need to fix my program, but I don't know how to do it.
I would prefer Option 4.
Missing fields can be replaced with "N/A",
because I want to process the whole output in Excel for statistical purposes.
Here is how it can be done (this is functional code). I will comment on it later.
from __future__ import annotations

import json
import contextlib
from typing import Any, Mapping

class Game:
    country: str
    league: str
    date: int
    ...
    gline: float|None
    under: float|None

    @classmethod
    def from_json(cls, json_record: Mapping[str, Any]) -> Game:
        game = cls()
        game.country = json_record["CN"]
        game.league = json_record["L"]
        game.date = json_record["S"]
        ...
        game.gline = None
        game.under = None
        if game_e := json_record.get("E"):
            with contextlib.suppress(IndexError):
                game.gline = game_e[8].get("P")
            with contextlib.suppress(IndexError):
                game.under = game_e[9].get("C")
        return game

    def __str__(self) -> str:
        return ', '.join(
            str(item) for item in (
                self.country, self.league, self.date, ...,
                self.gline, self.under))

with open("18july.json", encoding="UTF-8") as f:
    jsondata = json.load(f)

games = [Game.from_json(game) for game in jsondata["Value"]]
for game in games:
    print(game)
i tried to edit your code for 11 items like this:
from __future__ import annotations

import json
import contextlib
from typing import Any, Mapping

class Game:
    country: str
    league: str
    date: int
    hometeam: str|None
    awayteam: str|None
    odds1: float|None
    oddsX: float|None
    odds2: float|None
    over: float|None
    gline: float|None
    under: float|None

    @classmethod
    def from_json(cls, json_record: Mapping[str, Any]) -> Game:
        game = cls()
        game.country = json_record["CN"]
        game.league = json_record["L"]
        game.date = json_record["S"]
        game.hometeam = None
        game.awayteam = None
        game.odds1 = None
        game.oddsX = None
        game.odds2 = None
        game.over = None
        game.gline = None
        game.under = None
        if game_a := json_record.get:
            with contextlib.suppress(IndexError):
                game.hometeam = game_a.get("O1")
            with contextlib.suppress(IndexError):
                game.awayteam = game_a.get("O2")
        if game_e := json_record.get("E"):
            with contextlib.suppress(IndexError):
                game.odds1 = game_e[0].get("C")
            with contextlib.suppress(IndexError):
                game.oddsX = game_e[1].get("C")
            with contextlib.suppress(IndexError):
                game.odds2 = game_e[2].get("C")
            with contextlib.suppress(IndexError):
                game.over = game_e[8].get("C")
            with contextlib.suppress(IndexError):
                game.gline = game_e[8].get("P")
            with contextlib.suppress(IndexError):
                game.under = game_e[9].get("C")
        return game

    def __str__(self) -> str:
        return ', '.join(
            str(item) for item in (
                self.country, self.league, self.date, self.hometeam, self.awayteam,
                self.odds1, self.oddsX, self.odds2, self.over, self.gline, self.under))

with open("18july.json", encoding="UTF-8") as f:
    jsondata = json.load(f)

games = [Game.from_json(game) for game in jsondata["Value"]]
for game in games:
    print(game)
But it gave an error like this:
Traceback (most recent call last):
  File "c:\Users\monst\Downloads\python_files\vcolav.py", line 64, in <module>
    games = [Game.from_json(game) for game in jsondata["Value"]]
  File "c:\Users\monst\Downloads\python_files\vcolav.py", line 64, in <listcomp>
    games = [Game.from_json(game) for game in jsondata["Value"]]
  File "c:\Users\monst\Downloads\python_files\vcolav.py", line 36, in from_json
    game.hometeam = game_a.get("O1")
AttributeError: 'builtin_function_or_method' object has no attribute 'get'
As you know, there are 11 items.
Three items (country, league, date) can always be found in the JSON file,
but the remaining eight items can be missing, so they have to be replaced with None:
hometeam, awayteam, odds1, oddsX, odds2, over, gline, under.
Here is the mistake:
if game_a := json_record.get:
Instead of calling the method get, you assign the method itself to the variable. It should probably be json_record.get("A")?
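The difference between naming a method and calling it can be reproduced with a tiny example. The dictionary below is invented for illustration and is not taken from the real JSON file. Note also that a method object is always truthy, so an if that tests it never skips its body.

```python
# Hypothetical record, not taken from the real JSON file.
json_record = {"O1": "Home FC", "O2": "Away FC"}

method = json_record.get           # no parentheses: this is the method object itself
print(type(method).__name__)       # the type named in the traceback above

try:
    method.get("O1")               # a method object has no attribute called get
except AttributeError as exc:
    error_message = str(exc)
    print("AttributeError:", error_message)

value = json_record.get("O1")      # with parentheses: the method is actually called
print(value)
```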
Now the explanation I promised:
You wrote that you want to process the data. A suitable structure for putting data like this together is a class. You can later add methods to the class for some parts of the processing.
class Game:
These are type annotations. They are not used during program execution, but they are useful as documentation, they help while writing the program in an IDE (like VS Code), and they help to check the program, for example using mypy.
    country: str
    league: str
    date: int
    hometeam: str|None  # This says that the variable can be a string or None.
Class methods are often used as an alternative initializer of an object. Here we create the object from a JSON structure, so I wrote it as a class method. Again there are type annotations, which are not necessary:
    @classmethod
    def from_json(cls, json_record: Mapping[str, Any]) -> Game:
This will raise KeyError if the key "S" is not present. If you want to allow a missing key, use json_record.get("S") instead. get() returns None when the key is missing. If the key should be present, it is better if the program fails sooner rather than later. It is then easier to analyze the problem.
        game.date = json_record["S"]
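The two ways of reading a key can be compared side by side. This is a small sketch with a made-up dictionary named `record` in which the key "S" is left out on purpose.

```python
# Hypothetical record; the key "S" is deliberately missing.
record = {"CN": "Brazil", "L": "Brazil. Campeonato Brasileiro Serie A"}

print(record.get("S"))           # get() returns None for a missing key
print(record.get("S", "N/A"))    # an optional second argument replaces None

try:
    record["S"]                  # square brackets raise KeyError instead
except KeyError as exc:
    missing_key = exc.args[0]
    print("KeyError for key:", missing_key)
```

The second argument of get() is one way to obtain a placeholder such as "N/A" directly.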
Here we set the default values, because later, when setting a value fails, we want the default to be there.
        game.hometeam = None
The following code is an example of how to handle the situation when the key "E" is missing or when it contains an empty container. If the key should be present, get rid of the if and use for example game_e = json_record["E"] instead.
        if game_e := json_record.get("E"):
Here the context manager contextlib.suppress(IndexError) suppresses the exception IndexError, so the code continues when the enclosed statement fails with this exception.
            with contextlib.suppress(IndexError):
                game.odds1 = game_e[0].get("C")
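Here is the same pattern on its own, as a minimal sketch. The list `odds` is invented and holds a single entry, so the second lookup fails and the default stays in place.

```python
import contextlib

odds = [{"C": 2.13}]    # hypothetical list with a single entry

odds1 = None            # defaults, kept when the lookup below fails
oddsX = None

with contextlib.suppress(IndexError):
    odds1 = odds[0].get("C")    # index 0 exists, so the assignment happens

with contextlib.suppress(IndexError):
    oddsX = odds[1].get("C")    # index 1 is missing: the IndexError is swallowed

print(odds1, oddsX)
```

Only IndexError is suppressed here. Any other exception raised inside the with block would still stop the program.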
This method defines how the object converts to a string when you call str(your_object). This is done automatically when you do print(your_object).
    def __str__(self) -> str:
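A short sketch of the same idea with a made-up class named `Point`, which is not part of the program in this thread:

```python
class Point:
    """Hypothetical class used only to illustrate __str__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __str__(self):
        # Join the attributes into one comma-separated string.
        return ', '.join(str(item) for item in (self.x, self.y))


point = Point(1, None)
text = str(point)    # calls Point.__str__
print(point)         # print() calls it too
```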
This reads the records from the JSON data into a list of Game objects.
games = [Game.from_json(game) for game in jsondata["Value"]]
Later you will do your processing on this list.
Thanks a lot, Vaclav.
I have taken up your whole day; I'm really sorry.
The detailed explanation you provided is really impossible for me to understand.
I'm an idiot.
Never mind the further steps; I still can't even get a proper output.
I couldn't modify your code successfully.
It gives an error every time.
Your program contained just the one mistake. Here it is fixed. I have also removed the type annotations, which are an advanced concept.
import json
import contextlib

class Game:
    @classmethod
    def from_json(cls, json_record):
        game = cls()
        game.country = json_record["CN"]
        game.league = json_record["L"]
        game.date = json_record["S"]
        game.hometeam = None
        game.awayteam = None
        game.odds1 = None
        game.oddsX = None
        game.odds2 = None
        game.over = None
        game.gline = None
        game.under = None
        if game_a := json_record.get("A"):
            with contextlib.suppress(IndexError):
                game.hometeam = game_a.get("O1")
            with contextlib.suppress(IndexError):
                game.awayteam = game_a.get("O2")
        if game_e := json_record.get("E"):
            with contextlib.suppress(IndexError):
                game.odds1 = game_e[0].get("C")
            with contextlib.suppress(IndexError):
                game.oddsX = game_e[1].get("C")
            with contextlib.suppress(IndexError):
                game.odds2 = game_e[2].get("C")
            with contextlib.suppress(IndexError):
                game.over = game_e[8].get("C")
            with contextlib.suppress(IndexError):
                game.gline = game_e[8].get("P")
            with contextlib.suppress(IndexError):
                game.under = game_e[9].get("C")
        return game

    def __str__(self):
        return ', '.join(
            str(item) for item in (
                self.country, self.league, self.date, self.hometeam, self.awayteam,
                self.odds1, self.oddsX, self.odds2, self.over, self.gline, self.under))

with open("18july.json", encoding="UTF-8") as f:
    jsondata = json.load(f)

games = [Game.from_json(game) for game in jsondata["Value"]]
for game in games:
    print(game)
You should go through a beginner course. You will need it anyway to process the data. A few years ago I used the Sololearn beginner course.
You will learn Python; it just takes investing the time and playing with simple problems first.
No, @vbrozik, it's still not solved.
Here is the output:
Brazil, Brazil. Campeonato Brasileiro Serie A, 1658091600, None, None, 2.13, 3.22, 3.76, 2.21, 2.5, 1.74
Brazil, Brazil. Campeonato Brasileiro Serie A, 1658091600, None, None, 4.58, 3.48, 1.8, 2.35, 2.5, 1.66
Brazil, Brazil. Campeonato Brasileiro Serie A, 1658095200, None, None, 2.74, 3.14, 2.73, 2.23, 2.5, 1.72
United States, USA. MLS, 1658091600, None, None, 2.28, 3.6, 3.025, 1.82, 2.5, 2.09
United States, USA. MLS, 1658100600, None, None, 1.81, 3.92, 4.18, 1.75, 2.5, 2.19
United States, USA. MLS, 1658104200, None, None, 2.52, 3.5, 2.72, 1.91, 2.5, 1.99
United States, USA. MLS, 1658107800, None, None, 1.87, 3.68, 4.14, 2.02, 2.5, 1.88
United States, USA. MLS, 1658107800, None, None, 1.97, 3.98, 3.48, 2.375, 3.5, 1.65
United States, USA. MLS, 1658111400, None, None, 1.7, 4.2, 4.56, 1.58, 2.5, 2.32
Argentina, Argentina. Primera Division, 1658091600, None, None, 2.28, 3.26, 3.28, 2.26, 2.5, 1.71
Argentina, Argentina. Primera Division, 1658100600, None, None, 3.188, 3.11, 2.38, 2.44, 2.5, 1.62
Mexico, Mexico. Liga MX, 1658095200, None, None, 3.26, 3.46, 2.19, 2.02, 2.5, 1.88
Mexico, Mexico. Liga MX, 1658102400, None, None, 1.432, 4.5, 6.65, 1.82, 2.5, 2.08
As you can see, the hometeam and awayteam variables are not set properly;
the other 9 are OK.
Please check this once more:
country = jsondata["Value"][0]["CN"]
league = jsondata["Value"][0]["L"]
date = jsondata["Value"][0]["S"]
hometeam = jsondata["Value"][0]["O1"]
awayteam = jsondata["Value"][0]["O2"]
odds1 = jsondata["Value"][0]["E"][0]["C"]
oddsX = jsondata["Value"][0]["E"][1]["C"]
odds2 = jsondata["Value"][0]["E"][2]["C"]
over = jsondata["Value"][0]["E"][8]["C"]
gline = jsondata["Value"][0]["E"][8]["P"]
under = jsondata["Value"][0]["E"][9]["C"]
Compare these two accesses.
Your original code:
...
awayteam = jsondata["Value"][0]["O2"]
odds1 = jsondata["Value"][0]["E"][0]["C"]
...
The new code:
json_record = jsondata["Value"][item_index]
# This statement just illustrates what json_record contains.
# In the for loop item_index sequentially goes through all the indexes.
...
if game_a := json_record.get("A"):
    # The statement above retrieves jsondata["Value"][item_index]["A"],
    # which does not correspond to your original code.
    with contextlib.suppress(IndexError):
        game.awayteam = game_a.get("O2")
...
if game_e := json_record.get("E"):
    with contextlib.suppress(IndexError):
        game.odds1 = game_e[0].get("C")
To correspond to your original code you need to use code like this:
game.awayteam = json_record["O2"]
...
if game_e := json_record.get("E"):
    with contextlib.suppress(IndexError):
        game.odds1 = game_e[0].get("C")
Or this one, if you want to tolerate a missing key "O2":
game.awayteam = json_record.get("O2")
...
if game_e := json_record.get("E"):
    with contextlib.suppress(IndexError):
        game.odds1 = game_e[0].get("C")
An IndexError happens only when accessing lists using a_list[a_index] when the index is out of range. There is no need to suppress it for game.awayteam = json_record.get("O2").
The latest code you sent me is this:
import json
import contextlib

class Game:
    @classmethod
    def from_json(cls, json_record):
        game = cls()
        game.country = json_record["CN"]
        game.league = json_record["L"]
        game.date = json_record["S"]
        game.hometeam = None
        game.awayteam = None
        game.odds1 = None
        game.oddsX = None
        game.odds2 = None
        game.over = None
        game.gline = None
        game.under = None
        if game_a := json_record.get("A"):
            with contextlib.suppress(IndexError):
                game.hometeam = game_a.get("O1")
            with contextlib.suppress(IndexError):
                game.awayteam = game_a.get("O2")
        if game_e := json_record.get("E"):
            with contextlib.suppress(IndexError):
                game.odds1 = game_e[0].get("C")
            with contextlib.suppress(IndexError):
                game.oddsX = game_e[1].get("C")
            with contextlib.suppress(IndexError):
                game.odds2 = game_e[2].get("C")
            with contextlib.suppress(IndexError):
                game.over = game_e[8].get("C")
            with contextlib.suppress(IndexError):
                game.gline = game_e[8].get("P")
            with contextlib.suppress(IndexError):
                game.under = game_e[9].get("C")
        return game

    def __str__(self):
        return ', '.join(
            str(item) for item in (
                self.country, self.league, self.date, self.hometeam, self.awayteam,
                self.odds1, self.oddsX, self.odds2, self.over, self.gline, self.under))

with open("18july.json", encoding="UTF-8") as f:
    jsondata = json.load(f)

games = [Game.from_json(game) for game in jsondata["Value"]]
for game in games:
    print(game)
I don't see any line like this in it:
json_record = jsondata["Value"][item_index]
Only this part of the code is problematic:
if game_a := json_record.get("A"):
    with contextlib.suppress(IndexError):
        game.hometeam = game_a.get("O1")
    with contextlib.suppress(IndexError):
        game.awayteam = game_a.get("O2")
Please read my comments carefully. As I tried to explain in the comment, I added the assignment to give the necessary context to the code that follows it. That is what I meant by: "This statement just illustrates…"
For me it is difficult to express all nuances in English, and I know that parts of my text are hard (or impossible) to understand.
Edit: Now I see my horrible mistakes. I have edited my previous post with the aim of fixing them and improving the readability. Please let me know if my English is still unclear.
The previous post contains the solution. You really should go through a Python introductory course. You need to understand lists, dictionaries etc. to continue working on your program.
If there are specific parts of the code you want me to explain, show them.
game.awayteam = game["O2"]
...
if game_e := json_record.get("E"):
    with contextlib.suppress(IndexError):
        game.odds1 = game_e[0].get("C")
game.awayteam = game.get("O2")
...
if game_e := json_record.get("E"):
    with contextlib.suppress(IndexError):
        game.odds1 = game_e[0].get("C")
Why do you insist on writing these two parts only partially?
The whole code is only about 25-30 lines, @vbrozik.
Why don't you submit the fixed final version of the whole code?
It's not about going to a course or taking a course.
Sorry, but you're playing with me like you would play with a kitten.
You don't intend to help me.
def from_json(cls, json_record):
    game = cls()
    game.country = json_record["CN"]
    game.league = json_record["L"]
    game.date = json_record["S"]
    game.hometeam = json_record["O1"]
    game.awayteam = json_record["O2"]
    game.odds1 = None
    game.oddsX = None
    game.odds2 = None
    game.over = None
    game.gline = None
    game.under = None
    if game_e := json_record.get("E"):
        with contextlib.suppress(IndexError):
            game.odds1 = game_e[0].get("C")
        with contextlib.suppress(IndexError):
            game.oddsX = game_e[1].get("C")
        with contextlib.suppress(IndexError):
            game.odds2 = game_e[2].get("C")
        with contextlib.suppress(IndexError):
            game.over = game_e[8].get("C")
        with contextlib.suppress(IndexError):
            game.gline = game_e[8].get("P")
        with contextlib.suppress(IndexError):
            game.under = game_e[9].get("C")
    return game
This gives an error.
def from_json(cls, json_record):
    game = cls()
    game.country = json_record["CN"]
    game.league = json_record["L"]
    game.date = json_record["S"]
    game.hometeam = game["O1"]
    game.awayteam = game["O2"]
    game.odds1 = None
    game.oddsX = None
    game.odds2 = None
    game.over = None
    game.gline = None
    game.under = None
    if game_e := json_record.get("E"):
        with contextlib.suppress(IndexError):
            game.odds1 = game_e[0].get("C")
        with contextlib.suppress(IndexError):
            game.oddsX = game_e[1].get("C")
        with contextlib.suppress(IndexError):
            game.odds2 = game_e[2].get("C")
        with contextlib.suppress(IndexError):
            game.over = game_e[8].get("C")
        with contextlib.suppress(IndexError):
            game.gline = game_e[8].get("P")
        with contextlib.suppress(IndexError):
            game.under = game_e[9].get("C")
    return game
This gives an error too.
def from_json(cls, json_record):
    game = cls()
    game.country = json_record["CN"]
    game.league = json_record["L"]
    game.date = json_record["S"]
    game.hometeam = game.get["O1"]
    game.awayteam = game.get["O2"]
    game.odds1 = None
    game.oddsX = None
    game.odds2 = None
    game.over = None
    game.gline = None
    game.under = None
    if game_e := json_record.get("E"):
        with contextlib.suppress(IndexError):
            game.odds1 = game_e[0].get("C")
        with contextlib.suppress(IndexError):
            game.oddsX = game_e[1].get("C")
        with contextlib.suppress(IndexError):
            game.odds2 = game_e[2].get("C")
        with contextlib.suppress(IndexError):
            game.over = game_e[8].get("C")
        with contextlib.suppress(IndexError):
            game.gline = game_e[8].get("P")
        with contextlib.suppress(IndexError):
            game.under = game_e[9].get("C")
    return game
This edit gives an error too.
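For readers following along, here is one possible complete version, offered as a sketch under stated assumptions rather than as the code from the posts above. In the second and third attempts, game is the new Game object and not the dictionary, so the lookups need to go through json_record. The first attempt most likely fails with KeyError on a record that has no "O1" or "O2" key. The sketch assumes the team names sit directly in each record under "O1" and "O2" (as in the original loop) and that every item of "E" is a dictionary. It swaps the Game class for plain functions and fills gaps with "N/A" as requested. The names MISSING, HEADER, entry_value and parse_game are made up for illustration, and the second sample record is invented to show missing fields.

```python
import csv
import io

MISSING = "N/A"   # placeholder requested in the thread for absent fields
HEADER = ["country", "league", "date", "hometeam", "awayteam",
          "odds1", "oddsX", "odds2", "over", "gline", "under"]


def entry_value(entries, index, key):
    """Return entries[index][key], or MISSING when the index or key is absent."""
    try:
        return entries[index].get(key, MISSING)
    except IndexError:
        return MISSING


def parse_game(json_record):
    """Turn one item of jsondata["Value"] into a list of the 11 fields."""
    entries = json_record.get("E") or []    # missing or empty "E" becomes []
    return [
        json_record["CN"],                  # assumed to be always present
        json_record["L"],
        json_record["S"],
        json_record.get("O1", MISSING),     # hometeam
        json_record.get("O2", MISSING),     # awayteam
        entry_value(entries, 0, "C"),       # odds1
        entry_value(entries, 1, "C"),       # oddsX
        entry_value(entries, 2, "C"),       # odds2
        entry_value(entries, 8, "C"),       # over
        entry_value(entries, 8, "P"),       # gline
        entry_value(entries, 9, "C"),       # under
    ]


# Sample data: the first record reuses values printed earlier in the thread,
# the second one is invented and lacks "O2" and most of "E".
sample = {
    "Value": [
        {"CN": "Brazil", "L": "Brazil. Campeonato Brasileiro Serie A",
         "S": 1658091600,
         "O1": "Atletico Clube Goianiense", "O2": "Fortaleza EC",
         "E": [{"C": 2.13}, {"C": 3.22}, {"C": 3.76}] + [{"C": 0}] * 5
              + [{"C": 2.21, "P": 2.5}, {"C": 1.74}]},
        {"CN": "Exampleland", "L": "Example League", "S": 1658100000,
         "O1": "Home FC", "E": [{"C": 1.5}]},
    ]
}

rows = [parse_game(record) for record in sample["Value"]]

# Write the rows as CSV text, a format that Excel can open.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(HEADER)
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

To run it on the real data, `sample` would be replaced by the result of json.load() on 18july.json, and the StringIO buffer by a file opened with open("games.csv", "w", newline="", encoding="UTF-8"), where games.csv is only an example file name.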