How can I make a script to search information in a spreadsheet?

Hi! I am semi-comfortable with Python. Currently I am trying to write a script that searches for information based on a spreadsheet. Columns A and B need to be combined into a query, which is then searched together with the remaining column headers. Think of Google dorking!

I already have ten API keys and ten CSE IDs, but I am unsure of the best way to approach this. I have tried using ChatGPT and Gemini to help with both Excel and Google scripts, but I keep having trouble getting the code to print the results correctly. Any help or direction would be appreciated. Thanks!

In my opinion, the simplest approach would be to read the file with pandas.read_excel and do the rest of the work on the pandas.DataFrame it returns.
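
As a rough sketch of that suggestion: the snippet below builds a small DataFrame by hand so it runs without a file; with a real workbook you would get `df` from `pd.read_excel(...)` instead. The column names, the sample rows, and the helper `find_rows` are all made up for illustration.

```python
import pandas as pd

# Stand-in for: df = pd.read_excel("your_spreadsheet.xlsx")
# The column names and rows are invented sample data.
df = pd.DataFrame(
    {
        "A": ["Acme Corp", "Globex", "Initech"],
        "B": ["annual report", "press release", "Annual Report"],
        "filetype": ["pdf", "docx", "pdf"],
    }
)


def find_rows(frame: pd.DataFrame, term: str) -> pd.DataFrame:
    """Return the rows where any cell contains `term`, ignoring case."""
    as_text = frame.astype(str)  # compare every cell as text
    mask = as_text.apply(
        lambda col: col.str.contains(term, case=False, regex=False)
    ).any(axis=1)
    return frame[mask]


matches = find_rows(df, "annual")
print(matches)
```

Passing `regex=False` treats the search term as plain text, so characters such as `.` or `+` in a query are matched literally.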

Hi!

It sounds like an interesting project! Here’s a simple way to approach it using Python:

  1. Read the Spreadsheet: Use pandas to read and process the data.
  2. Combine Columns: Combine Column A and B into your query.
  3. Use APIs/CSE IDs: Make your searches using the combined queries and handle the API responses.

Here’s a basic outline to get you started:

  1. Install Required Libraries:
pip install pandas openpyxl requests

  2. Python Script:
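import pandas as pd
import requests

# Read the spreadsheet (pandas uses openpyxl to open .xlsx files)
df = pd.read_excel('your_spreadsheet.xlsx')

# Combine columns A and B into a query.
# Empty cells become '' and everything is cast to text so the join cannot fail.
df['query'] = (df['A'].fillna('').astype(str) + ' ' + df['B'].fillna('').astype(str)).str.strip()

# Function to search using the API
def search_api(query, api_key, cse_id):
    url = 'https://www.googleapis.com/customsearch/v1'
    # Passing params lets requests URL-encode spaces, quotes and search operators
    params = {'q': query, 'key': api_key, 'cx': cse_id}
    response = requests.get(url, params=params, timeout=10)
    response.raise_for_status()  # Stop on HTTP errors instead of parsing an error body
    return response.json()

# Each API key is paired with the CSE ID at the same position
api_keys = ['your_api_key1', 'your_api_key2']  # ... add your 10 API keys
cse_ids = ['your_cse_id1', 'your_cse_id2']    # ... add your 10 CSE IDs

# Loop through your queries and print results.
# Note: this runs every query once per key/CSE pair.
for index, row in df.iterrows():
    query = row['query']
    for api_key, cse_id in zip(api_keys, cse_ids):
        results = search_api(query, api_key, cse_id)
        # Hits are listed under 'items'; the key is absent when nothing is found
        for item in results.get('items', []):
            print(query, item.get('title'), item.get('link'), sep=' | ')

# Save results back to Excel or process as needed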

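Make sure to replace 'your_spreadsheet.xlsx', 'your_api_key1', and 'your_cse_id1' with your actual file and keys. Also note that 'A' and 'B' must match the header text in the first row of your sheet, since pandas names columns after the headers, not after the spreadsheet letters.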
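
Since the trouble was getting results to print cleanly, here is one possible way to flatten a response into rows before printing or saving it. It assumes the response follows the Custom Search JSON API layout, where hits sit in an `items` list with `title`, `link` and `snippet` fields. The helper names `flatten_results` and `rows_to_csv` and the sample response are made up for this sketch.

```python
import csv
import io

FIELDS = ["query", "title", "link", "snippet"]


def flatten_results(query, response_json):
    """Turn one API response into a list of flat dicts, one per hit."""
    rows = []
    # 'items' is missing when a search returns nothing, hence the default
    for item in response_json.get("items", []):
        rows.append(
            {
                "query": query,
                "title": item.get("title", ""),
                "link": item.get("link", ""),
                "snippet": item.get("snippet", ""),
            }
        )
    return rows


def rows_to_csv(rows):
    """Serialize rows to CSV text; save it to a file to open it in Excel."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hand-written sample shaped like an API response (not real search output)
sample = {
    "items": [
        {"title": "Example page", "link": "https://example.com", "snippet": "Some text"},
    ]
}
rows = flatten_results("acme annual report", sample)
print(rows_to_csv(rows))
```

If you would rather end up with a workbook, `pd.DataFrame(rows).to_excel('results.xlsx', index=False)` should write the same rows, with `results.xlsx` being a placeholder name.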

Hope this helps!

Best,
RHJ

Why use Python when there are utilities that can already do this on a .csv file?

Use grep on Linux or findstr in Windows cmd.exe.

Example (/i ignores case, /n prints line numbers): findstr /i /n findthis file.csv

I guess you might still want a Python script run from the command line if you need to search an actual Excel file, or for some other purpose.
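
If a Python version of that command is what you are after, a rough equivalent of findstr /i /n might look like the sketch below. It uses only the standard library. The function name `search_lines` and the sample text are invented for illustration, and it matches plain substrings rather than the patterns that findstr or grep support.

```python
import io


def search_lines(lines, term):
    """Yield (line_number, line) for lines containing `term`, ignoring case."""
    needle = term.casefold()
    for number, line in enumerate(lines, start=1):
        if needle in line.casefold():
            yield number, line.rstrip("\n")


# In-memory stand-in for: open("file.csv", encoding="utf-8")
sample = io.StringIO("name,topic\nAcme,FindThis report\nGlobex,other\n")
for number, line in search_lines(sample, "findthis"):
    print(f"{number}:{line}")
```

Any iterable of lines works, so the same function can be pointed at an open file object instead of the sample text.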

Also, avoid using AI to get snippets of programming code. It often omits best practices, shows plain old code that no longer works, or relies on modules that don’t work with a recent Python version.

Instead, I use a search engine and restrict it to results from the past year. I get much better results that way.