Unable to extract numbers from an image using Tesseract OCR in Python


I'm currently working on a project where I need to extract numbers from images using Tesseract OCR in Python. However, I'm having trouble getting accurate results.

Here is the image: [image not included]

Here is my code:

from PIL import Image, ImageEnhance, ImageOps
import pytesseract

# Load the screenshot
screenshot = Image.open("screenshot2.png")

# Crop the region containing the numbers
bbox = (970, 640, 1045, 675)
cropped_image = screenshot.crop(bbox)

# Enhance contrast
enhancer = ImageEnhance.Contrast(cropped_image)
enhanced_image = enhancer.enhance(2.0)

# Convert to grayscale
gray_image = enhanced_image.convert("L")

# Apply thresholding
thresholded_image = gray_image.point(lambda p: p > 150 and 255)

# Invert colors
inverted_image = ImageOps.invert(thresholded_image)

# Convert to binary
binary_image = inverted_image.convert("1")

# Save the processed image
binary_image.save("processed_image.png")

# Perform OCR
text = pytesseract.image_to_string(binary_image, config="--psm 13")

# Extract numbers from the OCR result
numbers = [int(num) for num in text.split() if num.isdigit()]

print(numbers)

The output is simply: []

What I want is just the 2 numbers. But if I can retrieve all of the text, I can then process the string to keep only the 2 numbers.

Here is what I've tried so far:

I've captured screenshots containing numeric values.
I've cropped the screenshots to focus only on the region containing the numbers.
I've enhanced the contrast and converted the images to grayscale to improve OCR accuracy.
I've applied thresholding and inverted the colors to prepare the images for OCR.
I've tried converting the images to binary format for better recognition.

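Despite these preprocessing steps and adjusting the OCR configuration (for example, using --psm 13), I still can't extract the numbers from the image accurately. The OCR output either contains incorrect numbers or fails to detect any numbers at all.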
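For reference, the page segmentation mode and a character whitelist are both passed to Tesseract through the config string. The following is only a sketch under stated assumptions: `build_tesseract_config` and `ocr_digits` are hypothetical helper names, and `--psm 7` (treat the image as a single text line) together with `tessedit_char_whitelist` are standard Tesseract options whose effect on this particular image is untested. Whitelist support also varies with the Tesseract version and OCR engine in use.

```python
def build_tesseract_config(psm: int = 7, whitelist: str = "0123456789-") -> str:
    """Assemble a Tesseract config string (hypothetical helper, not part of pytesseract)."""
    parts = [f"--psm {psm}"]
    if whitelist:
        # Restrict recognition to the listed characters (digits and the minus sign here)
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


def ocr_digits(image, psm: int = 7) -> str:
    """Run OCR on an image with the restricted config (hypothetical helper)."""
    # Imported lazily so build_tesseract_config can be used without Tesseract installed
    import pytesseract

    return pytesseract.image_to_string(image, config=build_tesseract_config(psm))


print(build_tesseract_config())
```

Passing an empty whitelist, as in `build_tesseract_config(13, "")`, yields just `--psm 13`, which matches the configuration used in the code above.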

I'd appreciate any insights or suggestions on how to improve the accuracy of the OCR extraction. Thanks!

python python-tesseract
1 answer

I did some research and found this article: https://nanonets.com/blog/ocr-with-tesseract/

Note: the code below is based on that article, which was written by Filip Zelic and Anuj Sable.

import cv2
import pytesseract

# Read the image (OpenCV loads it as a BGR numpy array)
image = cv2.imread('screenshot2.png')
if image is None:
    raise FileNotFoundError("Could not read 'screenshot2.png'")

# Same crop box as the PIL code in the question: (left, top, right, bottom)
left, top, right, bottom = 970, 640, 1045, 675

# Crop with numpy slicing: rows are y, columns are x
cropped_image = image[top:bottom, left:right]

# Convert the BGR crop directly to grayscale
# (cv2.imread returns BGR, so COLOR_BGR2GRAY is the matching conversion)
gray = cv2.cvtColor(cropped_image, cv2.COLOR_BGR2GRAY)

# Perform OCR on the grayscale image with the default page segmentation mode
ocr_text = pytesseract.image_to_string(gray)

print(ocr_text)

With this approach, the numbers [-2, -1] are extracted consistently from the provided image, which gives more accurate results.
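The snippet above prints the raw OCR string rather than a list of integers. A minimal sketch of the parsing step follows, assuming the OCR output looks something like `-2 -1`; `extract_integers` is a hypothetical helper name, not something taken from the question or the linked article. Note that `str.isdigit()` returns `False` for `"-2"`, so the list comprehension in the question would drop negative values even when the OCR itself succeeds.

```python
import re
from typing import List


def extract_integers(ocr_text: str) -> List[int]:
    """Return all signed integers found in an OCR string (hypothetical helper)."""
    # Map dash look-alikes that OCR may emit in place of an ASCII minus sign
    normalised = (
        ocr_text.replace("\u2212", "-")  # Unicode minus sign
        .replace("\u2013", "-")          # en dash
        .replace("\u2014", "-")          # em dash
    )
    # An optional leading minus followed by one or more digits
    return [int(match) for match in re.findall(r"-?\d+", normalised)]


print(extract_integers("-2 -1\n"))  # prints [-2, -1]
```

One caveat of this pattern is that a hyphen directly before a digit is always read as a sign, so a string such as `10-20` would be parsed as `10` and `-20`.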

Thanks for your patience while we worked toward a solution. If you have any further questions or need more help, feel free to ask!
