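I am trying to extract text from a table using OpenCV and Azure Read. The goal is to extract the text column-wise. The first step is therefore to detect the vertical lines in the image (the table). Using the coordinates of those vertical lines as boundaries, we can then identify the text that lies between each pair of lines.
In other words, the text is filtered according to the vertical lines.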
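To make the column-wise filtering concrete, here is a minimal sketch of how words could be grouped by the vertical lines that bound them. The `(text, x_center)` word format and the sample words are hypothetical simplifications for illustration; the actual Azure Read output is richer and would need to be reduced to something like this first.

```python
from bisect import bisect_right


def assign_columns(line_xs, words):
    """Group words into columns bounded by consecutive vertical lines.

    line_xs: x positions of the detected vertical lines.
    words:   iterable of (text, x_center) pairs (hypothetical, simplified format).
    Returns a dict mapping column index -> list of texts in that column.
    """
    xs = sorted(line_xs)
    columns = {i: [] for i in range(len(xs) - 1)}
    for text, x_center in words:
        # Index of the closest line at or to the left of the word centre.
        idx = bisect_right(xs, x_center) - 1
        if 0 <= idx < len(xs) - 1:
            columns[idx].append(text)
    return columns


# x positions taken from the "without header" coordinates in the question.
line_xs = [7, 303, 335, 395, 468, 531]
# Made-up words, only to exercise the function.
words = [("Item", 120), ("Qty", 318), ("Price", 430), ("Total", 500)]
columns = assign_columns(line_xs, words)
print(columns)
```

Words whose centre falls outside the outermost lines are simply dropped in this sketch; whether that is the right choice depends on the table.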
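The script works fine in general, but I noticed that for one particular type of table (Type A) the line coordinates come out wrong. After debugging, we found that the problem lies in the header section of the table (Type A only).
When we remove the header of a Type A table by cropping the image, the vertical line coordinates come out as expected.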
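The cropping step can be sketched as below, assuming the image is a NumPy array as returned by `cv.imread`. The `header_height` value and the dummy image size are hypothetical, not taken from the question. Note that boxes measured on the cropped image have `y` relative to the crop, so a small helper shifts them back.

```python
import numpy as np


def crop_header(image, header_height):
    """Drop the top `header_height` rows of an image array."""
    return image[header_height:, :]


def to_full_image(box, header_height):
    """Shift an (x, y, w, h) box measured on the cropped image back to full-image y."""
    x, y, w, h = box
    return (x, y + header_height, w, h)


image = np.zeros((439, 540), np.uint8)  # dummy image; the size is illustrative
header_height = 35                      # hypothetical value, not from the question
cropped = crop_header(image, header_height)
shifted = to_full_image((303, 0, 1, 391), header_height)
print(cropped.shape, shifted)
```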
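The coordinates are in the format (x, y, w, h): x and y give the topmost point of the vertical line, w is the width of the line (at most 2 pixels for a vertical line), and h is the height of the line.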
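I am attaching the two cases here: 1. Table with header (wrong coordinates): Actual Image, Binarized Vertical lines of Actual Image
Coordinates of the vertical lines with the header (left to right): [(9,0,14,439),(213,0,93,426),(337,28,1,398),(397,29,1,410),(470,29,1,397),(522,0,12,439)]
Coordinates of the vertical lines without the header (left to right): [(7,0,1,404),(303,0,1,391),(335,0,1,391),(395,0,1,404),(468,0,1,391),(531,0,1,404)]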
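We can see that the coordinates of the second line differ greatly between the two cases (x = 213 with w = 93 versus x = 303 with w = 1), while the other lines are close. So the problem is that the second vertical line's coordinates are wrong in the image with the header. What could be the reason?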
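As a quick sanity check on the numbers above, the following sketch flags boxes that are wider than a vertical line is expected to be and lists the shift in x between the two cases. It only inspects the reported values; it does not establish why the boxes are wide.

```python
# Boxes reported in the question, in (x, y, w, h) format.
with_header = [(9, 0, 14, 439), (213, 0, 93, 426), (337, 28, 1, 398),
               (397, 29, 1, 410), (470, 29, 1, 397), (522, 0, 12, 439)]
without_header = [(7, 0, 1, 404), (303, 0, 1, 391), (335, 0, 1, 391),
                  (395, 0, 1, 404), (468, 0, 1, 391), (531, 0, 1, 404)]

MAX_LINE_WIDTH = 2  # the question states a vertical line is at most 2 px wide


def too_wide(boxes, max_width=MAX_LINE_WIDTH):
    """Return the boxes whose width exceeds what a thin vertical line should have."""
    return [box for box in boxes if box[2] > max_width]


def x_shifts(boxes_a, boxes_b):
    """Pairwise difference in x between two equally long, left-to-right box lists."""
    return [a[0] - b[0] for a, b in zip(boxes_a, boxes_b)]


suspicious = too_wide(with_header)
shifts = x_shifts(with_header, without_header)
print(suspicious)  # [(9, 0, 14, 439), (213, 0, 93, 426), (522, 0, 12, 439)]
print(shifts)      # [2, -90, 2, 2, 2, -9]
```

The second box covers columns 213 to 305, which includes x = 303 where the line sits in the cropped image. That suggests, without proving it, that the box contains the line plus something adjacent to it rather than a misplaced line.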
import numpy as np
import sys
import cv2 as cv
def show_wait_destroy(winname, img):
    cv.imshow(winname, img)
    cv.moveWindow(winname, 500, 0)
    cv.waitKey(0)
    cv.destroyWindow(winname)
def main(argv):
    # [load_image]
    # Check number of arguments
    if len(argv) < 1:
        print('Not enough parameters')
        print('Usage:\nmorph_lines_detection.py < path_to_image >')
        return -1
    # Load the image
    src = cv.imread(argv[0], cv.IMREAD_COLOR)
    # Check if image is loaded fine
    if src is None:
        print('Error opening image: ' + argv[0])
        return -1
    # Show source image
    cv.imshow("src", src)
    # [load_image]

    # [gray]
    # Transform source image to gray if it is not already
    if len(src.shape) != 2:
        gray = cv.cvtColor(src, cv.COLOR_BGR2GRAY)
    else:
        gray = src
    # Show gray image
    show_wait_destroy("gray", gray)
    # [gray]

    # [bin]
    # Apply adaptiveThreshold to the inverted gray image (cv.bitwise_not)
    gray = cv.bitwise_not(gray)
    bw = cv.adaptiveThreshold(gray, 255, cv.ADAPTIVE_THRESH_MEAN_C,
                              cv.THRESH_BINARY, 15, -2)
    # Show binary image
    show_wait_destroy("binary", bw)
    # [bin]

    # [init]
    # Create the images that will be used to extract the horizontal and vertical lines
    horizontal = np.copy(bw)
    vertical = np.copy(bw)
    # [init]

    # [horiz]
    # Specify size on horizontal axis
    cols = horizontal.shape[1]
    horizontal_size = cols // 30
    # Create structure element for extracting horizontal lines through morphology operations
    horizontalStructure = cv.getStructuringElement(cv.MORPH_RECT, (horizontal_size, 1))
    # Apply morphology operations
    horizontal = cv.erode(horizontal, horizontalStructure)
    horizontal = cv.dilate(horizontal, horizontalStructure)
    # Show extracted horizontal lines
    show_wait_destroy("horizontal", horizontal)
    # [horiz]

    # [vert]
    # Specify size on vertical axis
    rows = vertical.shape[0]
    # This decides the threshold for a vertical line: only vertical runs of
    # at least `verticalsize` pixels survive the erode/dilate pair
    verticalsize = rows // 10
    # Create structure element for extracting vertical lines through morphology operations
    verticalStructure = cv.getStructuringElement(cv.MORPH_RECT, (1, verticalsize))
    # Apply morphology operations
    vertical = cv.erode(vertical, verticalStructure)
    vertical = cv.dilate(vertical, verticalStructure)
    # Show extracted vertical lines
    show_wait_destroy("vertical", vertical)
    # [vert]

    # [smooth]
    # Inverse vertical image
    vertical = cv.bitwise_not(vertical)
    show_wait_destroy("vertical_bit", vertical)
    '''
    Extract edges and smooth image according to the logic
    1. extract edges
    2. dilate(edges)
    3. src.copyTo(smooth)
    4. blur smooth img
    5. smooth.copyTo(src, edges)
    '''
    # Step 1
    edges = cv.adaptiveThreshold(vertical, 255, cv.ADAPTIVE_THRESH_MEAN_C,
                                 cv.THRESH_BINARY, 3, -2)
    show_wait_destroy("edges", edges)
    # Step 2
    kernel = np.ones((2, 2), np.uint8)
    edges = cv.dilate(edges, kernel)
    show_wait_destroy("dilate", edges)
    # Step 3
    smooth = np.copy(vertical)
    # Step 4
    smooth = cv.blur(smooth, (2, 2))
    # Step 5
    (rows, cols) = np.where(edges != 0)
    vertical[rows, cols] = smooth[rows, cols]
    # Show final result
    show_wait_destroy("smooth - final", vertical)
    # [smooth]
    return 0
if __name__ == "__main__":
    main(sys.argv[1:])
# To run the script: python image.py path/to/image