I have a folder of 1000 images, all formatted the same way. I need to create a text file that contains the name of each file (e.g. sampleimage.jpg) followed by the text that Python “reads” from a specific region of the image. The region is the same for every image.
For example:
File001.jpg “Series 401 - Slice 655”
File002.jpg “Series 411 - Slice 428”
File003.jpg “Series 403 - Slice 286”
File004.jpg “Series 401 - Slice 689”
etc.
The attached jpg is one of the images in the folder. Personal information has been removed from the file for privacy.
The specific “region” that I want Python to “read” is in the same place for every image. In the sample image shown below, the text to be read is “Series 401 - Slice 655”. That text is in the lower-right quadrant of the image.
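As a rough starting point for that region, the lower-right quadrant can be computed from the image size alone. This is only a sketch under my own assumptions: the function name lower_right_quadrant and the 1024 x 768 size are made up for illustration, and the real text region is presumably a much smaller box somewhere inside that quadrant.

```python
def lower_right_quadrant(width, height):
    """Return (x, y, w, h) covering the lower-right quarter of an image.

    x and y are the top-left corner of the box, measured in pixels from
    the top-left corner of the image, which is how OpenCV counts them.
    """
    x = width // 2
    y = height // 2
    # Use the remaining pixels so odd sizes are still fully covered
    return (x, y, width - x, height - y)


# Hypothetical image size, for illustration only
roi = lower_right_quadrant(1024, 768)
print(roi)  # (512, 384, 512, 384)
```

Cropping to the whole quadrant first and looking at the result should make it easier to narrow the box down to just the "Series ... - Slice ..." text.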
I have downloaded Python and now I will start learning to use it for the first time.
According to Google, one option is this: “Using Python with OpenCV (for image processing) and Tesseract (for OCR) is the most efficient method for large batches, as it allows you to define a specific crop area. Tools: Python, cv2 (OpenCV), pytesseract. Method: Define the ROI coordinates (x, y, width, height) of the text area. Loop through the folder of images. Crop each image to that area. Apply pytesseract to extract the text. Save the output to a CSV or text file.”
It suggests this code:
import cv2
import pytesseract
import os

# Define the region of interest as (x, y, width, height)
# YOU MUST CHANGE THESE COORDINATES
ROI = (100, 200, 300, 50)

def process_images(folder_path):
    results = []
    for filename in os.listdir(folder_path):
        if filename.endswith(".jpg") or filename.endswith(".png"):
            img = cv2.imread(os.path.join(folder_path, filename))
            # Crop image; array indexing is [y:y+h, x:x+w]
            crop_img = img[ROI[1]:ROI[1]+ROI[3], ROI[0]:ROI[0]+ROI[2]]
            # OCR
            text = pytesseract.image_to_string(crop_img)
            results.append(f"{filename}: {text.strip()}")
    return results

# Run the function on your image folder
extract_results = process_images('path/to/your/images')
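The suggested code stops at returning a list and never does the "save the output to a text file" step. A minimal sketch of how that step might look, assuming the OCR text has already been collected as (filename, text) pairs; the names format_line and write_results are my own, not part of OpenCV or pytesseract, and the output uses plain straight quotes:

```python
def format_line(filename, text):
    """Build one output line such as: File001.jpg "Series 401 - Slice 655"."""
    # OCR output often carries stray newlines or repeated spaces; collapse them
    cleaned = " ".join(text.split())
    return f'{filename} "{cleaned}"'


def write_results(pairs, output_path):
    """Write one line per (filename, text) pair, sorted by filename."""
    with open(output_path, "w", encoding="utf-8") as f:
        for filename, text in sorted(pairs):
            f.write(format_line(filename, text) + "\n")


# Example with text copied from my sample image (no OCR involved here)
print(format_line("File001.jpg", "Series 401 - Slice 655\n"))
```

Calling write_results with the collected pairs and a path such as a hypothetical results.txt would then produce the file in the format shown in my example above.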
My questions are:
- As a beginner, are there any tips that would help me avoid common pitfalls?
- How do I determine the coordinates of the region to be read?
