How to extract strings from a fully horizontally structured, non-grid-style image in OpenCV?


I am working with OpenCV and pytesseract. I want to extract the strings that are laid out in a horizontal structure (see the image).

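I am trying to use getStructuringElement to build a structuring element for dilation, but my code jumps to the next line at the center of the image: it extracts the strings on the left side of the image, works through everything on the left, and only then moves to the right side of the image.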

The code is:

import cv2
import pytesseract
from PIL import Image

image = cv2.imread("report_name-1.jpg")

# preprocessing
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # grayscale
# cv2.threshold returns (retval, image); keep only the image
_, thresh = cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY_INV)  # threshold
kernel = cv2.getStructuringElement(cv2.MORPH_CROSS, (3, 3))
# the original called cv2.erode here although the comment and the text say dilate
dilated = cv2.dilate(thresh, kernel, iterations=13)  # dilate
contours, hierarchy = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)  # get contours

for contour in contours:
    # get rectangle bounding contour
    x, y, w, h = cv2.boundingRect(contour)
    # discard areas that are too large
    if h > 300 and w > 300:
        continue
    # discard areas that are too small
    if h < 40 or w < 40:
        continue
    # draw rectangle around contour on original image
    cv2.rectangle(image, (x, y), (x + w, y + h), (255, 0, 255), 2)

This is the image: report-image
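
The order in which cv2.findContours returns contours is not a reading order, which may be why the left side comes out before the right side. One possible workaround, sketched below in plain Python, is to sort the bounding boxes row by row before passing them on to OCR. The function name `sort_reading_order`, the `row_tolerance` value, and the sample boxes are assumptions made up for illustration; they are not taken from the question.

```python
def sort_reading_order(boxes, row_tolerance=20):
    """Sort (x, y, w, h) boxes top-to-bottom, then left-to-right within a row.

    row_tolerance is a hypothetical tuning value in pixels: a box joins the
    current row if its vertical centre lies within that distance of the
    centre of the first box in the row.
    """
    rows = []  # each row is a list of boxes
    for box in sorted(boxes, key=lambda b: b[1] + b[3] / 2):
        centre = box[1] + box[3] / 2
        if rows:
            first = rows[-1][0]
            if abs(centre - (first[1] + first[3] / 2)) <= row_tolerance:
                rows[-1].append(box)
                continue
        rows.append([box])

    ordered = []
    for row in rows:
        # within one row, read from left to right
        ordered.extend(sorted(row, key=lambda b: b[0]))
    return ordered


# Made-up boxes, listed column by column (left side first, then right side)
boxes = [(50, 100, 200, 50), (50, 200, 200, 50),
         (600, 105, 200, 50), (600, 195, 200, 50)]
ordered = sort_reading_order(boxes)
```

In the loop from the question, the boxes that pass the size filters could be collected into a list and sorted this way before cropping. The tolerance would probably need adjusting to the line height of the actual report.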

python opencv python-tesseract opencv-contour image-preprocessing
1 Answer

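I am using OpenCV 4.1.1. Sorry, I have now uploaded the image with the boxes. Please take a look. You can see that the boxes are separated from one another along the horizontal axis. IMAGE OF BOX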
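
If the boxes on one text line stay separated along the horizontal axis, one option is to merge the boxes that overlap vertically into a single box per line and run OCR on each merged region. The following is only a sketch under assumptions: `merge_rows`, the `min_overlap` threshold, and the sample coordinates are hypothetical and would need tuning against the real image.

```python
def merge_rows(boxes, min_overlap=0.5):
    """Merge (x, y, w, h) boxes that share a text row into one box per row.

    Two boxes count as the same row when their vertical overlap is at least
    min_overlap (a hypothetical threshold) times the height of the shorter one.
    """
    merged = []  # one [x1, y1, x2, y2] entry per row
    for x, y, w, h in sorted(boxes, key=lambda b: b[1]):
        for row in merged:
            overlap = min(row[3], y + h) - max(row[1], y)
            if overlap >= min_overlap * min(h, row[3] - row[1]):
                # grow the row box so that it also covers this box
                row[0] = min(row[0], x)
                row[1] = min(row[1], y)
                row[2] = max(row[2], x + w)
                row[3] = max(row[3], y + h)
                break
        else:
            # no existing row overlaps enough, so start a new one
            merged.append([x, y, x + w, y + h])
    return [(x1, y1, x2 - x1, y2 - y1) for x1, y1, x2, y2 in merged]


# Made-up boxes: two rows, each split into a left and a right box
boxes = [(50, 100, 200, 50), (600, 105, 200, 50),
         (50, 200, 200, 50), (600, 195, 200, 50)]
line_boxes = merge_rows(boxes)
```

Each merged box could then be cropped from the image and passed to pytesseract.image_to_string. Another route worth trying is a kernel that is much wider than it is tall, for example cv2.getStructuringElement(cv2.MORPH_RECT, (25, 3)), so that dilation joins neighbours horizontally; the size there is a guess, not a tested value.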
