Tesseract OCR cannot read two lines of text on a red background

Question

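I'm having a problem with pytesseract in my code. I've tried to make the code as readable as possible, but let me know if anything doesn't make sense. The problem seems to come from one small piece of it, specifically the extract_gravnumbers_from_red function.

Surprisingly, the extract_gravnumbers_from_yellow function works perfectly and reads the text correctly. More information about what I've already tried is given in the next section.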

Here are the test images being used:

The full code is as follows:

import cv2
import pytesseract

# Enter the path to the Tesseract OCR executable
pytesseract.pytesseract.tesseract_cmd = r'path_to_teseract\Tesseract\tesseract.exe'

# Function to extract the grave numbers from a selected region
def extract_gravnumbers_from_yellow(region):
    # Use Tesseract OCR to get the text from the region
    grave_numbers_text = pytesseract.image_to_string(region, config='--psm 6')
    # Transforming the OCR result into a list of grave numbers
    grave_numbers_list = [int(num) for num in grave_numbers_text.split() if num.isdigit()]
    
    try:
        # If there is more than one grave number, return the smallest one
        if len(grave_numbers_list) > 1:
            min_grave_number = min(grave_numbers_list)
            print(min_grave_number)
        else:
            print("No Value")
            min_grave_number = None
    except ValueError:
        # If the list is empty, set None.
        min_grave_number = None
        print("List Empty")

    # Return min_grav_number
    return min_grave_number

def extract_gravnumbers_from_red(region):
    # Use Tesseract OCR to get the text from the region
    grave_numbers_text = pytesseract.image_to_string(region, config='--psm 6')

    # Transforming the OCR result into a list of grave numbers
    grave_numbers_list = [int(num) for num in grave_numbers_text.split() if num.isdigit()]
    
    try:
        # If there is more than one grave number, return the smallest one
        if len(grave_numbers_list) > 1:
            min_grave_number = min(grave_numbers_list)
            print(min_grave_number)
        else:
            print("No Value")
            min_grave_number = None
    except ValueError:
        # If the list is empty, set None.
        min_grave_number = None
        print("List Empty")
    # Return min_grave_number
    return min_grave_number

def get_red_grave_numbers(red_color, hsv_image, image, grav_numbers_list):
    # Define the color range of the red markings in the HSV space.
    lower_red, upper_red = red_color
    
    mask_red = cv2.inRange(hsv_image, lower_red, upper_red)
    
    # Find the contours of the red markings
    red_contours, _ = cv2.findContours(mask_red, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    # Iterate over the found contours of the red markings
    for red_contour in red_contours:
        # Draw a rectangle around the red mark (only to show the result).
        x, y, w, h = cv2.boundingRect(red_contour)
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 255), 2)

        # Extract the grave number from the red marker
        red_region = image[y:y + h, x:x + w]

        # Extract the grave number from the red marker and add it to the list.
        min_grav_number = extract_gravnumbers_from_red(red_region)
        if min_grav_number is not None:
            grav_numbers_list.append(min_grav_number)
    return grav_numbers_list

def get_yellow_grave_numbers(yellow_color, hsv_image, image, grav_numbers_list):
    # Define the color range of the yellow markings in the HSV space.
    lower_yellow, upper_yellow = yellow_color
    
    mask_yellow = cv2.inRange(hsv_image, lower_yellow, upper_yellow)
    
    # Find the contours of the yellow markings
    yellow_contours, _ = cv2.findContours(mask_yellow, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    # Iterate over the found contours of the yellow markings
    for yellow_contour in yellow_contours:
        # Draw a rectangle around the yellow mark (only to show the result).
        x, y, w, h = cv2.boundingRect(yellow_contour)
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 255), 2)

        # Extract the grave number from the yellow marker
        yellow_region = image[y:y + h, x:x + w]
        # Extract the grave number from the yellow marker and add it to the list.
        min_grav_number = extract_gravnumbers_from_yellow(yellow_region)
        if min_grav_number is not None:
            grav_numbers_list.append(min_grav_number)
    return grav_numbers_list

def check_double_number(filename, kvarter):
    # Load the image
    image = cv2.imread(filename)
    # Print image dimensions to check if the image is loaded correctly
    print("Image shape:", image.shape)
    print(f"Kvarter: {kvarter}")

    # Yellow color HSV Color Code
    yellow_color = ((35, 249, 245), (35, 249, 245))

    # Red color HSV Color Code
    red_color = ((7, 208, 190), (7, 208, 190))

    # Convert the image to the correct color space (BGR to HSV)
    hsv_image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)

    # Create a list to store the grave numbers
    grav_numbers_list = []

    # Quick Debug print
    print("\nYellow mark is being processed:\n")
    grav_numbers_list = get_yellow_grave_numbers(yellow_color, hsv_image, image, grav_numbers_list)
    
    # Quick Debug print
    print(f"\nRed mark is being processed:\n")
    grav_numbers_list = get_red_grave_numbers(red_color, hsv_image, image, grav_numbers_list)
    print("\n")
    # Display Result Window
    cv2.imshow('Result', image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

    return grav_numbers_list

def create_doublegrave_list():
    kvarter = 2
    doublegraveList = []

    for kv in range(1, kvarter + 1): 
        kv = check_double_number(f"kvarter_image{kv}.png", kv)
        doublegraveList.append(kv)

    return doublegraveList

dubbelgravLista = create_doublegrave_list()
print(dubbelgravLista)

I'm using block quotes below not as quotations but to improve the structure of the question.

It looks like the line in the extract_gravnumbers_from_red function that fails to recognize the text is probably this one:

grave_numbers_text = pytesseract.image_to_string(region, config='--psm 6')

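When I try:

min_grave_number = min(grave_numbers_list)
print(min_grave_number)

in the else statement, a ValueError is raised because the list is empty: no integers were found. The problem seems to be specific to reading two "rows" of text on red regions, since the same approach works on yellow regions.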

When I print grave_numbers_text in the else statement for the first image, I get the following output:

Red mark is being processed:

No Value
ize

No Value
ro

No Value
nz

So it is clearly reading something, but the result is wrong. On the second image, however, it does get one right:

Red mark is being processed:

No Value
146

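The region it reads correctly contains only one line of text. So my theory is that it can't read two "rows" of text because I'm missing some setting, combined with the different contrast, since it does read the yellow regions correctly. I know it isn't related to image quality, because I tried running it with a larger kvarter_image1.png image.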
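A side note on the parsing step, separate from the OCR problem: in the code as posted, min() runs only when the list holds more than one number, so the except ValueError branch is never reached, and a single correct reading such as 146 still prints "No Value". The following is a minimal sketch of the same rule without the try/except. The name smallest_of_group and the min_count parameter are hypothetical, and pulling out every run of digits with a regular expression (instead of requiring whole tokens to be digits) is an assumption about what is wanted.

```python
import re

def smallest_of_group(ocr_text, min_count=2):
    """Return the smallest number in the OCR text, or None if fewer
    than min_count numbers were found."""
    # \d+ extracts digit runs even when stray characters are attached.
    numbers = [int(match) for match in re.findall(r"\d+", ocr_text)]
    # Guarding on the length means min() never sees an empty list.
    if len(numbers) < min_count:
        return None
    return min(numbers)
```

With min_count=2 this keeps the "more than one grave number" rule from the original functions; passing min_count=1 would also accept single-number regions.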

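I've experimented with the config parameter and tried various options such as --psm 11, --psm 12, --psm 7, and --psm 4, but unfortunately none of them made any difference. Since the config parameter probably doesn't affect contrast, contrast may be the main challenge.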
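Since --psm only changes how the page is segmented and not the pixels themselves, one thing that might be worth trying is to binarize the crop before handing it to Tesseract. The sketch below assumes black text on a saturated red fill: in a BGR image the red channel stays bright on the background and dark on the text, which usually gives more contrast than a standard grayscale conversion. The function names, the midpoint threshold, and the scale factor are all hypothetical choices, and the digit whitelist may be ignored depending on the Tesseract version and engine.

```python
import numpy as np

def binarize_red_region(region_bgr, scale=3):
    """Turn a BGR crop with dark text on red into black text on white."""
    # Red channel: the red background stays bright, black text stays dark.
    red = region_bgr[:, :, 2]
    # Simple global threshold halfway between the darkest and brightest pixel.
    threshold = (int(red.min()) + int(red.max())) // 2
    binary = np.where(red > threshold, 255, 0).astype(np.uint8)
    # Nearest-neighbour upscaling so that small digits cover more pixels.
    return np.repeat(np.repeat(binary, scale, axis=0), scale, axis=1)

def ocr_digits(region_bgr):
    """Run Tesseract on the binarized crop (requires pytesseract)."""
    # Imported here so the preprocessing can be tried without Tesseract installed.
    import pytesseract
    config = "--psm 6 -c tessedit_char_whitelist=0123456789"
    return pytesseract.image_to_string(binarize_red_region(region_bgr), config=config)
```

In the original code, ocr_digits(red_region) would stand in for the direct pytesseract.image_to_string(region, config='--psm 6') call. Whether this actually fixes the two-line case has not been verified against the test images.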

I've displayed the red region, and this is the result:

It looks as it should.

I also tried an alternative approach using color filtering and histogram equalization, as shown below:

red_text_mask = cv2.inRange(region, (63, 161, 76), (7, 208, 190))
red_text = cv2.bitwise_and(region, region, mask=red_text_mask)

gray_text = cv2.cvtColor(red_text, cv2.COLOR_BGR2GRAY)

equalized_text = cv2.equalizeHist(gray_text)

grave_numbers_text = pytesseract.image_to_string(equalized_text, config='--psm 6')

The color codes used are HSV, the same as for the red_color variable. I got the code for black by picking it with this program:

import cv2
import numpy as np

def get_hsv_color(event, x, y, flags, param):
    if event == cv2.EVENT_LBUTTONDOWN:
        pixel_value = image[y, x]
        hsv_pixel_value = cv2.cvtColor(np.uint8([[pixel_value]]), cv2.COLOR_BGR2HSV)
        print("HSV Value:", hsv_pixel_value)

# Read the image
image = cv2.imread('kvarter_image1.png')

# Create a window to display the image
cv2.namedWindow('Image')
cv2.imshow('Image', image)

# Set the mouse callback function
cv2.setMouseCallback('Image', get_hsv_color)

# Wait for a key event to exit
cv2.waitKey(0)
cv2.destroyAllWindows()

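However, even with this modification the result stays the same, and when I try to print it in the else statement the ValueError persists, because the grave_numbers_text variable is still empty.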
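One detail that may matter in the color-filtering attempt: cv2.inRange keeps a pixel only when every channel lies between the lower and the upper bound, so bounds of (63, 161, 76) and (7, 208, 190) cannot match anything (the lower hue is above the upper hue), and region is still a BGR crop while the bounds are HSV values. Identical lower and upper bounds, as in red_color and yellow_color, likewise match only pixels with exactly that value. A minimal sketch of building a tolerance band around a picked HSV color follows; hsv_band and the tolerance values are hypothetical and would need tuning on the real images.

```python
def hsv_band(center, tol=(10, 60, 60)):
    """Return (lower, upper) HSV bounds around a picked color for cv2.inRange."""
    # OpenCV stores 8-bit hue as 0..179 and saturation/value as 0..255.
    limits = (179, 255, 255)
    lower = tuple(max(0, c - t) for c, t in zip(center, tol))
    upper = tuple(min(m, c + t) for c, t, m in zip(center, tol, limits))
    return lower, upper

lower_red, upper_red = hsv_band((7, 208, 190))
print(lower_red, upper_red)  # (0, 148, 130) (17, 255, 250)
```

The crop would need converting with cv2.cvtColor(region, cv2.COLOR_BGR2HSV) before being passed to cv2.inRange with these bounds. Red hues also wrap around the 0/179 boundary, so a second band near 179 may be needed; the clamping above does not handle that.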

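Reading black text on a red background seems to be a challenge for the OCR process. The lack of contrast may be preventing accurate recognition, or there may be something wrong with my configuration. Any insights or suggestions for improving OCR performance on red regions with black text would be greatly appreciated. Thanks for your help!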

Answer 1

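Tesseract is geared toward reading documents. I suggest you use EasyOCR, which helps with reading any kind of text in real-life scenes. Usage is very simple, and I also tried it on your sample input image. First, you need to install EasyOCR: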

pip install easyocr

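Here is the code and the result for the image. The result also includes the coordinates of each piece of detected text. You can check the output and see how accurate it is: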

import easyocr
reader = easyocr.Reader(['ch_sim','en'])
result = reader.readtext('/home/ubuntu/ab.png')

print(result)

Result:

[([[633, 157], [729, 157], [729, 201], [633, 201]], '123', 0.970589095852502), ([[635, 200], [735, 200], [735, 248], [635, 248]], '124', 0.9784687161445618), ([[514, 234], [612, 234] , [612, 282], [514, 282]], '125', 0.9959679843082824), ([[514, 276], [614, 276], [614, 324], [514, 324]], '120', 0.9427849510848981), ([[378, 302], [476, 302], [476, 350], [378, 350]], '127', 0.8409048914909363), ([[723, 351], [821, 351], [821, 395], [723, 395]], '166', 0.9964757966703823), ([[260, 368], [360, 368], [360, 416], [260, 416]], '128', 0.9026410027274797), ([[161, 443], [257, 443] , [257, 487], [161, 487]], '129', 0.9528453507814674), ([[595, 439], [689, 439], [689, 483], [595, 483]], '164', 0.9960849252898073), ([[6, 490], [104, 490], [104, 538], [6, 538]], '130', 0.948180079460144), ([[595, 483], [691, 483], [691, 527], [595, 527]], '465', 0.5173705816268921), ([[11, 533], [105, 533], [105, 577], [11, 577]], '131', 0.9920455440102758), ([[438, 526], [540, 526], [540, 574], [438, 574]], '-0', 0.002514158801917607), ([[333, 587], [431, 587 ], [431, 631], [333, 631]], '162', 0.9630756538371861), ([[186, 664], [295, 664], [295, 737], [186, 737]], '4', 0.01427749452419752)]
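Each entry in the result above is a (bounding box, text, confidence) tuple, and a few detections such as '-0' and '4' come back with very low confidence. One way to discard those could be a simple cutoff, sketched below on four entries copied from the output. The helper name confident_numbers and the 0.5 threshold are hypothetical and not part of EasyOCR.

```python
def confident_numbers(results, min_conf=0.5):
    """Keep digit-only detections at or above a confidence cutoff."""
    return [int(text) for _bbox, text, conf in results
            if text.isdigit() and conf >= min_conf]

# Four entries taken from the EasyOCR output shown above.
sample = [
    ([[633, 157], [729, 157], [729, 201], [633, 201]], '123', 0.970589095852502),
    ([[595, 483], [691, 483], [691, 527], [595, 527]], '465', 0.5173705816268921),
    ([[438, 526], [540, 526], [540, 574], [438, 574]], '-0', 0.002514158801917607),
    ([[186, 664], [295, 664], [295, 737], [186, 737]], '4', 0.01427749452419752),
]
print(confident_numbers(sample))  # [123, 465]
```

The cutoff trades missed detections against false ones, so a borderline reading like '465' at about 0.52 is worth checking by eye.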
