Preparing images with a reflective background for PyTesseract

Problem description

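I am trying to build an OCR system to extract serial numbers from hundreds of labels. I run the images through opencv and pytesseract to get the full text, but I cannot clean up the background well enough for PyTesseract to work properly.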

The region of interest I am trying to extract information from looks like this (I masked two characters for privacy reasons).

To improve performance, I split the 3 lines of the serial number into 3 separate ROIs.

For the first line, pytesseract spits out '7IP29 AGH2TR: '.

The code below is what I have so far to produce that first line.

import cv2 as cv
import numpy as np
import os
import matplotlib.pyplot as plt
import pandas as pd

#Tesseract Library
import pytesseract
import re

from PIL import Image

pytesseract.pytesseract.tesseract_cmd = r'/usr/local/Cellar/tesseract/5.3.0_1/bin/tesseract'

# In[Img Load]

image_path = '/Users/cfr/Desktop/20230308_111250.jpg'

img = cv.imread(image_path,0)


print('Original Dimensions : ',img.shape)

scale_percent = 25 # percent of original size
width = int(img.shape[1] * scale_percent / 100)
height = int(img.shape[0] * scale_percent / 100)
dim = (width, height)

# resize image
resized = cv.resize(img, dim, interpolation = cv.INTER_AREA)

print('Resized Dimensions : ',resized.shape)

# In[ROI]
roi1 = (263, 252, 226, 43)
roi2 = (265, 288, 224, 32)
roi3 = (274, 320, 106, 32)
# In[Cropped ROI]
def roi_cropper(_image, _roi):
    # _roi is (x, y, width, height); NumPy indexes rows (y) first, then columns (x)
    roi_cropped = _image[int(_roi[1]):int(_roi[1] + _roi[3]), int(_roi[0]):int(_roi[0] + _roi[2])]

    return roi_cropped

roi_img1 = roi_cropper(resized, roi1)
roi_img2 = roi_cropper(resized, roi2)
roi_img3 = roi_cropper(resized, roi3)

# In[BlackHat]
# initialize a rectangular structuring kernel
x = 5
y = 2
kernel = cv.getStructuringElement(cv.MORPH_RECT, (x, y))

gray = cv.GaussianBlur(roi_img1, (5, 5), 0)
blackhat = cv.morphologyEx(gray, cv.MORPH_BLACKHAT, kernel)

blackhat_dilated = cv.dilate(blackhat, None, iterations=1)
plt.imshow(blackhat_dilated)

# In[Tesseract]

text = pytesseract.image_to_string(blackhat_dilated, config='--psm 2')

print(text)
python image-processing ocr tesseract python-tesseract
1 Answer

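You will probably never get a perfect result. I played around with the image parameters a little. Here is my code; maybe it helps. Can you improve your image? You should use only black-and-white images: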

import subprocess
import cv2
import pytesseract

# Image manipulation
# Commands https://imagemagick.org/script/convert.php
mag_img = r'D:\Programme\ImageMagic\magick.exe'
con_bw = r"D:\Programme\ImageMagic\convert.exe" 

in_file = r'D:\Daten\Programmieren\stackoverflow\ID.jpg'
out_file = r'D:\Daten\Programmieren\stackoverflow\ID_bw.jpg'

# Play with black and white and contrast for better results
process = subprocess.run([con_bw, in_file, "-threshold","18%", "-brightness-contrast","-10x30", out_file])

# Text processing
pytesseract.pytesseract.tesseract_cmd=r'C:\Program Files\Tesseract-OCR\tesseract.exe'
img = cv2.imread(out_file)

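# Parameters: see the Tesseract documentation.
# Note: 'G' is not in this whitelist, so Tesseract cannot output it; add it if the serial numbers may contain a G.
custom_config = r'--psm 3 --oem 3 -c tessedit_char_whitelist=0123456789ABCDEFHIJKLMNOPQRSTUVWXYZ'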

tex = pytesseract.image_to_string(img, config=custom_config)
print(tex)

cv2.imshow('image',img)
cv2.waitKey(1000)
cv2.destroyAllWindows()

Output, not perfect:

75229CN2TR
TDETC1D72
HM7 COAR
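
Part of the remaining noise is stray spaces and punctuation, so a small post-processing step might help. The sketch below assumes that the serial numbers contain only digits and uppercase letters, with no spaces, which nothing in this thread confirms. `clean_serial` is a hypothetical helper name.

```python
import re


def clean_serial(raw, allowed="0-9A-Z"):
    """Uppercase the OCR text and drop every character outside `allowed`."""
    return re.sub(f"[^{allowed}]", "", raw.upper())


print(clean_serial("7IP29 AGH2TR: "))  # 7IP29AGH2TR
print(clean_serial("HM7 COAR"))        # HM7COAR
```

This only removes characters; it cannot repair misread ones, so the image still has to be clean enough for Tesseract to get each character right.
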