Excel image scraping error

My main objective is to search Google for the numbers in column C of an Excel sheet and paste the first image result from Google into column D.

With the help of ChatGPT, I got this script.
I have installed Python 3.11 and BeautifulSoup. I am using Windows 11.

The script runs and saves a new Excel file, but nothing is stored in column D.

Main script

import openpyxl
import requests
from bs4 import BeautifulSoup

# Load the Excel workbook
excel_file_path = r"C:\Users\Rune\Documents\testbilde.xlsx"
workbook = openpyxl.load_workbook(excel_file_path)
sheet = workbook.active

# Iterate through values in column C and process each search
for row in sheet.iter_rows(min_row=2, min_col=3, max_col=3, values_only=True):
    search_term = row[0]
    search_url = f"https://www.google.com/search?q={search_term}&tbm=isch"

    # Send a request to Google and parse the page with BeautifulSoup
    response = requests.get(search_url)
    soup = BeautifulSoup(response.text, "html.parser")

    # Find the first image result (if any)
    image_results = soup.select(".rg_i")
    if image_results:
        first_image_url = image_results[0]["data-src"]

        # Insert the image URL into the adjacent cell in column D
        sheet.cell(row=row[0].row, column=4).value = first_image_url

# Save the updated Excel workbook
updated_excel_file_path = r"C:\Users\Rune\Documents\updated_testbilde.xlsx"
workbook.save(updated_excel_file_path)

Output after running the script

>>> import openpyxl
>>> import requests
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'requests'
>>> from bs4 import BeautifulSoup
>>>
>>> # Load the Excel workbook
>>> excel_file_path = r"C:\Users\Rune\Documents\testbilde.xlsx"
>>> workbook = openpyxl.load_workbook(excel_file_path)
>>> sheet = workbook.active
>>>
>>> # Iterate through values in column C and process each search
>>> for row in sheet.iter_rows(min_row=2, min_col=3, max_col=3, values_only=True):
...     search_term = row[0]
...     search_url = f"https://www.google.com/search?q={search_term}&tbm=isch"
...
>>>     # Send a request to Google and parse the page with BeautifulSoup
>>>     response = requests.get(search_url)
  File "<stdin>", line 1
    response = requests.get(search_url)
IndentationError: unexpected indent
>>>     soup = BeautifulSoup(response.text, "html.parser")
  File "<stdin>", line 1
    soup = BeautifulSoup(response.text, "html.parser")
IndentationError: unexpected indent
>>>
>>>     # Find the first image result (if any)
>>>     image_results = soup.select(".rg_i")
  File "<stdin>", line 1
    image_results = soup.select(".rg_i")
IndentationError: unexpected indent
>>>     if image_results:
  File "<stdin>", line 1
    if image_results:
IndentationError: unexpected indent
>>>         first_image_url = image_results[0]["data-src"]
  File "<stdin>", line 1
    first_image_url = image_results[0]["data-src"]
IndentationError: unexpected indent
>>>
>>>         # Insert the image URL into the adjacent cell in column D
>>>         sheet.cell(row=row[0].row, column=4).value = first_image_url
  File "<stdin>", line 1
    sheet.cell(row=row[0].row, column=4).value = first_image_url
IndentationError: unexpected indent
>>>
>>> # Save the updated Excel workbook
>>> updated_excel_file_path = r"C:\Users\Rune\Documents\updated_testbilde.xlsx"
>>> workbook.save(updated_excel_file_path)
>>>

You are attempting to run your program by pasting it in at the interactive prompt. When run this way, Python assumes an indented block is finished as soon as it runs into a blank line. Your for loop contains blank lines, so the loop ends at the first one, and every indented line after it is no longer inside any block. That is why each of those lines is rejected with an IndentationError.
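The blank-line behaviour can be reproduced without the original script. The following is a minimal sketch that uses the standard library's code.InteractiveConsole to feed lines in the way the interactive prompt receives them. The RecordingConsole class and the pasted_lines list are made up for this illustration and are not part of your script.

```python
import code
import sys


class RecordingConsole(code.InteractiveConsole):
    """Interactive console that records syntax errors instead of printing them."""

    def __init__(self):
        super().__init__()
        self.errors = []

    def showsyntaxerror(self, filename=None, **kwargs):
        # The console calls this while the error is still being handled,
        # so sys.exc_info() refers to the syntax error that was raised.
        self.errors.append(sys.exc_info()[0].__name__)


console = RecordingConsole()

# Hypothetical stand-in for a pasted script: a loop body split by a blank line.
pasted_lines = [
    "for n in range(2):",
    "    doubled = n * 2",
    "",                      # the blank line ends the loop at the prompt
    "    tripled = n * 3",   # still indented, but no longer inside a block
]

# push() returns True while the console is waiting for more input.
waiting = [console.push(line) for line in pasted_lines]

print(waiting)
print(console.errors)
```

The console keeps waiting for input during the loop header and the first body line, runs the loop when it sees the blank line, and then rejects the next indented line. The same source saved in a .py file runs without complaint, because blank lines inside a block are ignored there.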

You should open a text editor, paste your code, and save it with a .py extension. You can then run it on the command line by typing python my_program_name.py.

Here’s a tutorial that explains it better.

In addition to Steven’s answer, you need to install the requests package. The ModuleNotFoundError near the top of your output shows that it is missing.
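A quick way to check which of the script's imports are available is sketched below. It relies only on the standard library, and the helper name missing_modules is an invention for this example. Anything it reports can usually be installed with pip; the corresponding PyPI package names are openpyxl, requests and beautifulsoup4 (which provides the bs4 module).

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The three third-party modules imported at the top of the script.
required = ["openpyxl", "requests", "bs4"]

missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```

Run it with the same Python installation you use for the script, since packages are installed per interpreter.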
