Pytesseract misreads "Z" as "7" in license plate OCR

Question

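I am building a license plate recognition system with Python and OpenCV. The system extracts and processes the plate region from an image or video frame, then runs optical character recognition (OCR) on it with pytesseract. The problem is that pytesseract frequently misreads the letter "Z" as the digit "7".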

Here is the relevant part of my code:

import numpy as np
import cv2 as opencv
import pytesseract

class LicensePlateProcessor:
    @staticmethod
    def process_license_plate(frame, x1, y1, x2, y2):
        # ... [initial steps: cropping, grayscaling, blurring, thresholding, inverting] ...

        # Find contours, calculate areas, and extract the largest contour
        contours, _ = opencv.findContours(inverted_license_plate, opencv.RETR_TREE, opencv.CHAIN_APPROX_SIMPLE)
        areas = np.array([opencv.contourArea(contour) for contour in contours])
        max_index = np.argmax(areas)
        contour = contours[max_index]

        # Calculate moments for centroid, and crop around the centroid
        moments = opencv.moments(contour)
        m00 = np.array(moments['m00'])
        m10 = np.array(moments['m10'])
        m01 = np.array(moments['m01'])
        cx, cy = np.divide([m10, m01], m00, out=np.zeros_like([m10, m01]), where=m00 != 0)
        cx, cy = cx.astype(int), cy.astype(int)
        height, width = inverted_license_plate.shape[:2]
        offset = np.array([np.add(np.floor_divide(width, 2.961), 30), np.floor_divide(height, 4.972)]).astype(np.int32)
        rect_top_left = np.clip([cx - offset[0], cy - offset[1]], 0, None)
        rect_bottom_right = np.clip([cx + offset[0], cy + offset[1]], None, inverted_license_plate.shape[::-1])
        cropped_image = inverted_license_plate[rect_top_left[1]:rect_bottom_right[1], rect_top_left[0]:rect_bottom_right[0]]

        # OCR using pytesseract on the cropped image
        # The image must be the first positional argument; Tesseract's English language code is 'eng'
        recognized_text = pytesseract.image_to_string(cropped_image, lang='eng', config='--psm 13 --oem 3')

        return recognized_text

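# Example usage
frame = opencv.imread('license_plate_image.jpg')
# x1, y1, x2, y2 are the plate's bounding box; the whole frame is used here
# only as a placeholder, replace it with your detector's output
x1, y1, x2, y2 = 0, 0, frame.shape[1], frame.shape[0]
recognized_plate = LicensePlateProcessor.process_license_plate(frame, x1, y1, x2, y2)
print(recognized_plate)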

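Before passing the image to pytesseract I tried several preprocessing steps: grayscale conversion, blurring, thresholding, and inverting the image. Even so, "Z" is still often recognized as "7".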

Answer 1

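I have a simple solution to your problem. Use a method such as threshold-based or contrast-based segmentation to extract the individual symbols from the image. Instead of one image containing all the symbols, you will have n images, each containing a single symbol.

Then pass these images to the OCR one at a time. Pytesseract will no longer try to read the whole text at once; instead, it will classify each symbol separately.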
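
As a rough illustration of the per-symbol idea, here is a minimal sketch that splits a binarized plate into character crops with a vertical projection profile (columns without foreground pixels are treated as gaps). This is only one possible segmentation method, not necessarily the one the answer has in mind. The names split_characters, ocr_characters and min_width are made up for this example, and the sketch assumes the characters are the non-zero pixels and do not touch each other or the plate border.

```python
import numpy as np

def split_characters(binary, min_width=2):
    """Split a binarized plate (foreground > 0) into per-character crops.

    Columns that contain no foreground pixel are treated as gaps
    between characters (vertical projection profile).
    """
    arr = np.asarray(binary)
    has_ink = (arr > 0).any(axis=0)
    # Pad with False on both sides so every run has a start and an end.
    padded = np.concatenate(([False], has_ink, [False])).astype(np.int8)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)   # first column of each run
    ends = np.flatnonzero(edges == -1)    # one past the last column of each run
    # Drop runs narrower than min_width (likely noise specks).
    return [arr[:, s:e] for s, e in zip(starts, ends) if e - s >= min_width]

def ocr_characters(crops):
    """OCR each crop on its own; --psm 10 tells Tesseract to expect one character."""
    import pytesseract  # imported lazily so the splitter works without Tesseract
    config = "--psm 10 --oem 3"
    return [pytesseract.image_to_string(c, lang="eng", config=config).strip()
            for c in crops]
```

In practice the crops may need some padding, and possibly inverting so that the text is dark on a light background, before Tesseract reads them reliably; that tuning is left out here.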

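Next, create your own neural network: a simple classifier with only two classes, "7" and "Z". Whenever Pytesseract returns 7 or Z, re-evaluate that symbol with the network trained specifically for this problem. This may not be the fastest solution at run time, but it is very simple and should solve your problem effectively.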
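
The re-evaluation step could be wired up roughly as follows. This is a sketch under assumptions: recheck_ambiguous and classify_7_or_z are hypothetical names, and the classifier is passed in as a plain callable because training the two-class network is outside the scope of this snippet.

```python
AMBIGUOUS = {"7", "Z"}

def recheck_ambiguous(chars, crops, classify_7_or_z):
    """Re-evaluate every '7' or 'Z' from the OCR pass with a dedicated classifier.

    chars: characters returned by the general OCR, one per crop.
    crops: the matching single-character images.
    classify_7_or_z: hypothetical callable, image -> '7' or 'Z'
                     (for example a small trained network); it is not
                     part of pytesseract.
    """
    if len(chars) != len(crops):
        raise ValueError("need exactly one OCR character per crop")
    return "".join(
        classify_7_or_z(crop) if ch in AMBIGUOUS else ch
        for ch, crop in zip(chars, crops)
    )
```

Only the ambiguous symbols go through the second model, so the extra cost stays proportional to the number of 7s and Zs on the plate.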

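The best solution, of course, is to use a paid OCR service such as Google's (which works very well), or to write your own OCR from scratch. But if you need a simple way to read license plates, this is probably the best approach.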
