Issue with scraped coordinates

Greetings. This is my first post here, as I am slowly losing my patience. I am writing code that creates a map with a marker. A value from an SQL table tells a scraper which page to get the coordinates from. Unfortunately I get this error:


Traceback (most recent call last):
  File "D:\#######\######\######\########\main.py", line 4, in <module>
    gui()
  File "D:\#######\######\######\########\utils\functions.py", line 41, in gui
    CREATE_MAP_SELECT(nickname)
  File "D:\#######\######\######\########\utils\functions.py", line 106, in CREATE_MAP_SELECT
    city_coordinates = get_coordinates_of(city)
                       ^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\#######\######\######\########\utils\functions.py", line 96, in get_coordinates_of
    response_html_latitude=(response_html.select('.latitude')[1].text)
                            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
IndexError: list index out of range

The pieces of code the traceback points to are these:

case " MAP SELECT":
                nickname = input("Pick the nickname you would like to see on the map: ")
                CREATE_MAP_SELECT(nickname)

and this

def CREATE_MAP_SELECT(nickname: str) -> None:
    def get_coordinates_of(city: str) -> list[float]:
        adres_URL = f'https://en.wikipedia.org/wiki/{city}'
        response = requests.get(url=adres_URL)
        response_html = BeautifulSoup(response.text, 'html.parser')

        response_html_latitude=(response_html.select('.latitude')[1].text)
        response_html_longitude = (response_html.select('.longitude')[1].text)
        response_html_latitude = float(response_html_latitude.replace(',', '.'))
        response_html_longitude = float(response_html_longitude.replace(',', '.'))
        return [response_html_latitude, response_html_longitude]

    cursor.execute("SELECT * FROM users WHERE nickname = %s", (nickname,))
    user = cursor.fetchone()
    if user:
        city = user[3]
        city_coordinates = get_coordinates_of(city)
        map = folium.Map(
            location=city_coordinates,
            tiles="OpenStreetMap",
            zoom_start=14,
        )
        folium.Marker(
            location=city_coordinates,
            popup=f'USER: {user[1]}, POSTS: {user[4]}'
        ).add_to(map)
        map.save(f'MAP_{user[2]}.html')
    else:
        print(f"User called {nickname} not found in the database.")

The city names are correct, and after fiddling around it seems the problem is in the larger piece of code.

This means that response_html.select('.latitude') is a list with fewer
than 2 elements (because there is no entry at index 1). Maybe the
city is unknown, so not found in the Wikipedia output?

I’d start by:

  • print out the city URL you’re using, and visit that page in a web
    browser
  • print out response_html.select('.latitude') before this line
  • print out the whole response_html if you need more context

Then have some appropriate test before that line, depending on what’s
going on.

For example, maybe something like Paris is taking you to a
disambiguation page, e.g. "Paris (disambiguation)" on Wikipedia.
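
A rough sketch of what such a test before that line could look like. The helper name select_text is made up for illustration and is not part of BeautifulSoup; it only assumes the object passed in has a .select() method that returns a list, as response_html does in the code above.

```python
def select_text(soup, selector: str, index: int) -> str:
    """Return the text of the index-th element matching selector.

    `soup` is assumed to be a BeautifulSoup object (hypothetical helper;
    anything with a .select() method returning a list will do).
    `index` is expected to be non-negative.
    """
    matches = soup.select(selector)
    # Show how many elements were actually found before indexing.
    print(f"{selector!r} matched {len(matches)} element(s)")
    if index >= len(matches):
        # Fail with a message that names the selector and the count,
        # instead of a bare "list index out of range".
        raise LookupError(
            f"wanted element {index} for {selector!r}, "
            f"but only {len(matches)} matched"
        )
    return matches[index].text
```

With this, select_text(response_html, '.latitude', 1) would either return the text or say how many elements the page really had, which also tells you whether index 0 could ever work.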

Cheers,
Cameron Simpson cs@cskk.id.au

The error tells you that when Python tries to use 1 as an index into response_html.select('.latitude'), it fails, because there are not enough elements in that result.

So, did you try to check what result you actually get from response_html.select('.latitude')? Is it as you expect?

Does the problem occur consistently, or only with specific inputs? Can you figure out a specific input that causes the problem? If it normally works but seems to break with a certain input, what happens if you try to trace the steps “as a user” - for example, by looking up the Wikipedia page yourself, looking at its HTML source etc.?

For example, I decided to pick Saldus in Latvia:

The error pops up as usual, but the message is slightly different:
ValueError: could not convert string to float: ‘56°40′N’
So it seems the value is extracted properly but cannot be converted to a float. When I change float to str it prints just fine, but it falls flat when I apply that to the actual code.
Then I picked a different town, Laredo, Texas, but without giving the state:

I got the exact same result as before. That doesn’t seem right, because I have tried the same thing with towns that I was sure have no duplicates. So I decided to redo Laredo inside the full code, instead of in the isolated function as I did previously, and sure enough:

IndexError: list index out of range

And yes, before I posted here I tried changing the index from 1 to 0, but it didn’t work.
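
On the ValueError: a string like ‘56°40′N’ is in degrees/minutes notation, so float() cannot parse it no matter what replace(',', '.') does. Below is a minimal sketch of one way to convert such text to decimal degrees. The function name dms_to_decimal and the exact set of accepted formats are assumptions for illustration; real pages may use other variants.

```python
import re

# Degrees, optional minutes, optional seconds, then a hemisphere letter.
# Accepts both the typographic marks (′ ″) and plain ASCII quotes.
_DMS = re.compile(
    r"""^\s*(?P<deg>\d+(?:\.\d+)?)°
        (?:(?P<min>\d+(?:\.\d+)?)[′'])?
        (?:(?P<sec>\d+(?:\.\d+)?)[″"])?
        \s*(?P<hemi>[NSEW])\s*$""",
    re.VERBOSE,
)


def dms_to_decimal(text: str) -> float:
    """Convert text such as '56°40′N' to signed decimal degrees."""
    match = _DMS.match(text)
    if match is None:
        raise ValueError(f"unrecognised coordinate format: {text!r}")
    degrees = float(match["deg"])
    minutes = float(match["min"] or 0)
    seconds = float(match["sec"] or 0)
    value = degrees + minutes / 60 + seconds / 3600
    # South and west are negative in decimal notation.
    return -value if match["hemi"] in "SW" else value
```

For instance, dms_to_decimal("99°30′W") gives -99.5, which is the kind of value folium expects in location.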

Why are you scraping Wikipedia’s HTML? Grab the wikitext instead, so much easier to parse.
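
A hedged sketch of what the wikitext route might look like. MediaWiki serves a page's source when action=raw is added to the index.php URL. The two helper names below are invented for illustration, and the regex assumes the page contains a literal {{coord|...}} template, which is not guaranteed for every article.

```python
import re
from typing import Optional
from urllib.parse import quote


def raw_wikitext_url(title: str) -> str:
    """Build the URL that returns a page's wikitext instead of HTML."""
    # Wikipedia titles use underscores where the display name has spaces.
    safe_title = quote(title.replace(" ", "_"))
    return f"https://en.wikipedia.org/w/index.php?title={safe_title}&action=raw"


# First {{coord|...}} template without nested braces (an assumption).
_COORD = re.compile(r"\{\{\s*coord\s*\|([^{}]*)\}\}", re.IGNORECASE)


def coord_fields(wikitext: str) -> Optional[list[str]]:
    """Return the pipe-separated fields of the first coord template, or None."""
    match = _COORD.search(wikitext)
    if match is None:
        return None
    return [field.strip() for field in match.group(1).split("|")]
```

The text fetched with requests.get(raw_wikitext_url(city)).text could then be passed to coord_fields, and a None result handled explicitly rather than hitting an IndexError.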

I want to make it work first, then I will attempt to improve it.

OKAY, I managed to pinpoint the issue. When fetching the coordinates, I was pointing at the wrong index in the users row, and hence it went bonkers.
