Remove selected elements from an image in OpenCV

Question

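I have this image containing a table, and I want to remove the table structure from the image so that Tesseract can process it more effectively. I am using the following code to draw boundaries around the table (and the individual cells) so that they can be removed.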

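import cv2
import numpy as np

img = cv2.imread('bfir.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 50, 150, apertureSize=3)
img1 = np.ones(img.shape, dtype=np.uint8) * 255  # white canvas
ret, thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY_INV)  # type 1
# [-2:] keeps (contours, hierarchy) on both OpenCV 3.x and 4.x
contours, h = cv2.findContours(thresh, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2:]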

for cnt in contours:
    approx = cv2.approxPolyDP(cnt,0.01*cv2.arcLength(cnt,True),True)
    if len(approx)==4:
        cv2.drawContours(img1,[cnt],0,(0,255,0),2)

This draws green lines around the table, as in this image.

Next, I tried to subtract the table from the image using cv2.subtract, roughly like this:

final_img = cv2.subtract(img1, img)

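But this did not work as I expected: it gave me a grayscale image with the table still in it. Link

What I want is the original image, in black and white, with only the table removed. This is my first time using OpenCV, so I don't know what I am doing wrong. Sorry for the long post, but if anyone can help me solve this, or just point me in the right direction on how to remove the table, I would really appreciate it.

Edit: As RobAu suggested, the contours could also simply be drawn in white, but I don't know how to do that without losing the rest of the data from the preprocessing stage.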

Answer 1

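You can try simply overwriting the pixels that make up the borders. To do this, create a mask image and use it as a reference for which pixels of the original image to overwrite.

This can be done as follows: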

mask_image = np.zeros(img.shape[0:2], np.uint8)    
cv2.drawContours(mask_image, contours, -1, color=255, thickness=2)
border_points = np.array(np.where(mask_image == 255)).transpose()
background = [0, 0, 0] # Change this to the colour you want
for point in border_points :
    img[point[0], point[1]] = background
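
The pixel-by-pixel loop above can also be written as a single NumPy boolean-mask assignment. The following is a minimal sketch and not part of the original answer; the tiny `img` and `mask_image` arrays are made-up stand-ins for the real image and for the mask produced by `cv2.drawContours`.

```python
import numpy as np

# Hypothetical stand-ins for the real data: a small grey BGR image and a
# single-channel mask in which 255 marks the border pixels.
img = np.full((4, 4, 3), 128, dtype=np.uint8)
mask_image = np.zeros((4, 4), dtype=np.uint8)
mask_image[0, :] = 255   # pretend the top row is a table border

background = [0, 0, 0]   # change this to the colour you want
# Boolean indexing overwrites every masked pixel in one step, no Python loop.
img[mask_image == 255] = background
```

On large scans this usually runs much faster than iterating over `border_points`, because the work happens inside NumPy.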

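Update:

You could use the 3-channel image you have already created as the mask, but that makes the algorithm slightly more complicated. The single-channel mask image proposed above is better suited to the task, but I will try to adapt the idea to your code: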

# Create your mask image as usual...
# img1 is white (255, 255, 255) with green (0, 255, 0) contours, so the green
# channel is 255 everywhere; the blue channel is 0 only on the contours.
border_points = np.array(np.where(img1[:, :, 0] == 0)).transpose()
background = [0, 0, 0]  # Change this to the colour you want
for point in border_points:
    img[point[0], point[1]] = background

Updated following @RobAu's suggestion (faster than my previous approaches):

line_thickness = 3  # Change this value until it looks the best.
cv2.drawContours(img, contours, -1, color=(0,0,0), thickness=line_thickness )

Note that I have not tested this code, so it may need some further tweaking.

Answer 2

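As a reference for the comments on this question, here is an example of code that locates rectangles and creates a new image for each one. It was written as an attempt to split a picture of shredded paper into individual images. Some of the values need to be changed so that rectangles of the appropriate size are found.

There is also some code for keeping track of image sizes. The code is roughly 50% my own and 50% put together with help from Stack Overflow.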

import cv2
import numpy as np

fileName = ['9','8','7','6','5','4','3','2','1','0']

img = cv2.imread('#YOUR IMAGE#')

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.bilateralFilter(gray, 11, 17, 17)

kernel = np.ones((5,5),np.uint8)
erosion = cv2.erode(gray,kernel,iterations = 2)
kernel = np.ones((4,4),np.uint8)
dilation = cv2.dilate(erosion,kernel,iterations = 2)

edged = cv2.Canny(dilation, 30, 200)

# [-2:] keeps (contours, hierarchy) on both OpenCV 3.x and 4.x
contours, hierarchy = cv2.findContours(edged, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2:]

rects = [cv2.boundingRect(cnt) for cnt in contours]
rects = sorted(rects,key=lambda  x:x[1],reverse=True)


i = -1
j = 1
y_old = 5000
x_old = 5000
for rect in rects:
    x,y,w,h = rect
    area = w * h
    print('width: %d and height: %d' %(w,h))
    if   w > 50 and h > 500:
        print('abs:')
        print(abs(x_old - x))
        if abs(x_old - x) > 0:
            print('writing')
            x_old = x
            x,y,w,h = rect

            out = img[y+10:y+h-10,x+10:x+w-10]
            cv2.imwrite('assets/newImage' + fileName[i] + '.jpg', out)

            j+=1
        if (y_old - y) > 1000:
            i += 1
            y_old = y
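
The size filter and the cropping step in the loop above can be isolated into a small helper, which makes the thresholds easier to experiment with. The following is a sketch only: the function name `crop_large_rects` and its parameters are made up for illustration, with defaults copied from the values used above.

```python
import numpy as np

def crop_large_rects(img, rects, min_w=50, min_h=500, margin=10):
    """Return crops of `img` for every (x, y, w, h) rect above the size limits."""
    crops = []
    for x, y, w, h in rects:
        if w > min_w and h > min_h:
            # Trim `margin` pixels from each side to drop the rectangle border.
            crops.append(img[y + margin:y + h - margin,
                             x + margin:x + w - margin])
    return crops

# Hypothetical usage with a blank image and hand-written rectangles.
page = np.zeros((1200, 400, 3), dtype=np.uint8)
rects = [(10, 10, 100, 600), (200, 10, 20, 20)]  # the second one is too small
crops = crop_large_rects(page, rects)
```

In real use, `rects` would come from `cv2.boundingRect` applied to each contour, as in the code above.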

Answer 3

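The link to the input image no longer works, so I cannot be sure that the following is what you were asking for. Still, I learned something from your question while working on removing table structure lines from an image, and I would like to share it for future readers.

I followed the steps in the OpenCV documentation to remove the lines, but that removed only the horizontal lines. When I tried to remove the vertical lines, the resulting image contained only the vertical lines; the text in the table was gone.

Then I came across your question and saw final_img = cv2.subtract(img1, img) in it. I tried this and it worked well.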

Here are the steps I followed:

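# Excerpt from inside a main(argv) function (hence the `return` below).
# Assumes `import cv2 as cv`, `import numpy as np`, and a
# show_wait_destroy(name, img) display helper.
# Load the image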
src = cv.imread(argv[0], cv.IMREAD_COLOR)
# Check if image is loaded fine
if src is None:
    print ('Error opening image: ' + argv[0])
    return -1
# Show source image
cv.imshow("src", src)
# [load_image]
# [gray]
# Transform source image to gray if it is not already
if len(src.shape) != 2:
    gray = cv.cvtColor(src, cv.COLOR_BGR2GRAY)
else:
    gray = src
# Show gray image
# show_wait_destroy("gray", gray)
# [gray]
# [bin]
# Apply adaptiveThreshold at the bitwise_not of gray, notice the ~ symbol
gray = cv.bitwise_not(gray)
bw = cv.adaptiveThreshold(gray, 255, cv.ADAPTIVE_THRESH_MEAN_C, \
                            cv.THRESH_BINARY, 15, -2)
# Show binary image
# show_wait_destroy("binary", bw)
# [bin]
# [init]
# Create the images that will use to extract the horizontal and vertical lines
horizontal = np.copy(bw)
vertical = np.copy(bw)

# [horiz]
# [vert]
# Specify size on vertical axis
rows = vertical.shape[0]
verticalsize = rows // 10  # integer division: the kernel size must be an int
# Create structure element for extracting vertical lines through morphology operations
verticalStructure = cv.getStructuringElement(cv.MORPH_RECT, (1, verticalsize))
# Apply morphology operations
vertical = cv.erode(vertical, verticalStructure)
vertical = cv.dilate(vertical, verticalStructure)

# [init]
# [horiz]
# Specify size on horizontal axis
cols = horizontal.shape[1]
horizontal_size = cols // 30  # integer division: the kernel size must be an int
# Create structure element for extracting horizontal lines through morphology operations
horizontalStructure = cv.getStructuringElement(cv.MORPH_RECT, (horizontal_size, 1))
# Apply morphology operations
horizontal = cv.erode(horizontal, horizontalStructure)
horizontal = cv.dilate(horizontal, horizontalStructure)
# cv.add saturates at 255; `vertical + horizontal` would wrap around on uint8
lines_removed = cv.subtract(gray, cv.add(vertical, horizontal))

show_wait_destroy("lines_removed", ~lines_removed)

Input:

Output:

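I changed a few things compared with the source:

verticalsize is rows divided by 10. I don't understand the significance of the number 10 here. The documentation uses 30; I got better results with 10. My guess is that the smaller the divisor, the larger the structuring element, and since we are targeting long straight lines, a smaller number works better.
In the documentation, the vertical lines are processed after the horizontal lines. I reversed the order.
I swapped the arguments to cv2.subtract(). I used cv2.subtract(img, img1).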
