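I am working with images in which the text is skewed or rotated. Before running OCR on them, I need to rotate these text blobs back to horizontal (0 degrees). I have managed to fix the rotation itself, but now I need a way to copy the contents of the original contour to the rotated matrix.
Here are some of the things I have done to extract the blobs and fix the rotation:
I tried using an affine transform to rotate the text blobs, but it ended up cropping some of the text, because it expects a rectangle while my text blobs are irregular.
The blue dots in the contours are the centroids, and the numbers are the contour angles. How can I copy the contents of an unrotated contour, rotate it, and copy it to a new image?
Code: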
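import cv2
import numpy as np

def getContourCenter(contour):
    # Centroid from the contour moments; (0, 0) for a degenerate (zero-area) contour.
    M = cv2.moments(contour)
    if M["m00"] != 0:
        cx = int(M['m10']/M['m00'])
        cy = int(M['m01']/M['m00'])
    else:
        return 0, 0
    return int(cx), int(cy)
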
def rotateContour(contour, center: tuple, angle: float):

    def cart2pol(x, y):
        theta = np.arctan2(y, x)
        rho = np.hypot(x, y)
        return theta, rho

    def pol2cart(theta, rho):
        x = rho * np.cos(theta)
        y = rho * np.sin(theta)
        return x, y

    # Translate the contour so that the center becomes the origin.
    # Work in float so the rotated coordinates are not truncated on assignment.
    norm = (contour - [center[0], center[1]]).astype(np.float64)

    # Convert the points to polar coordinates, add the rotation, and convert back to Cartesian coordinates.
    coordinates = norm[:, 0, :]
    xs, ys = coordinates[:, 0], coordinates[:, 1]
    thetas, rhos = cart2pol(xs, ys)

    thetas = np.rad2deg(thetas)
    thetas = (thetas + angle) % 360
    thetas = np.deg2rad(thetas)

    # Convert the new polar coordinates to Cartesian coordinates
    xs, ys = pol2cart(thetas, rhos)

    norm[:, 0, 0] = xs
    norm[:, 0, 1] = ys

    # Translate back and round to integer pixel coordinates.
    rotated = norm + [center[0], center[1]]
    rotated = np.rint(rotated).astype(np.int32)

    return rotated

def straightenText(image, vis):

    # create a new (black) mat
    mask = np.zeros([image.shape[0], image.shape[1], 3], dtype=np.uint8)

    # invert the pixel values and dilate aggressively
    # (ImageUtils.box is my own helper and is not shown here)
    dilate = cv2.dilate(~image, ImageUtils.box(33, 1))

    # find contours; [-2:] keeps this working on both OpenCV 3 (3 return values) and OpenCV 4 (2 return values)
    contours, hierarchy = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2:]

    for contour in contours:
        [x, y, w, h] = cv2.boundingRect(contour)
        if w > h:

            # find contour angle and centers
            (x, y), (w, h), angle = cv2.minAreaRect(contour)
            cx, cy = getContourCenter(contour)

            # fix angle returned
            if w < h:
                angle = 90 + angle

            # fix contour angle
            rotatedContour = rotateContour(contour, (cx, cy), 0-angle)

            cv2.drawContours(vis, contour, -1, (0, 255, 0), 2)
            cv2.drawContours(mask, rotatedContour, -1, (255, 0, 0), 2)
            cv2.circle(vis, (cx, cy), 2, (0, 0, 255), 2, 8) # centroid
            cv2.putText(vis, str(round(angle, 2)), (cx, cy), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (255,0,0), 2)
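
A side note on `rotateContour`: adding an angle to the polar `theta` of every point is equivalent to multiplying the centered points by a 2x2 rotation matrix. The following is a minimal sketch of that equivalent form, assuming contours in the usual OpenCV `(N, 1, 2)` layout. The function name `rotate_contour_matrix` is hypothetical and is not part of the code above.

```python
import numpy as np

def rotate_contour_matrix(contour, center, angle_deg):
    """Rotate an OpenCV-style contour of shape (N, 1, 2) about `center` by `angle_deg`."""
    a = np.deg2rad(angle_deg)
    # Same convention as adding `angle_deg` to the polar angle of each point.
    rot = np.array([[np.cos(a), -np.sin(a)],
                    [np.sin(a),  np.cos(a)]])
    # Center the points in float so nothing is truncated before rounding.
    pts = contour.reshape(-1, 2).astype(np.float64) - center
    rotated = pts @ rot.T + center
    # Round to the nearest pixel and restore the (N, 1, 2) contour layout.
    return np.rint(rotated).astype(np.int32).reshape(-1, 1, 2)
```

This only moves the contour points. The pixels inside the contour still have to be warped separately.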
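Here is one way to do it. It is not the most efficient, but I think it is the simplest way to do it in Python/OpenCV.
Create a blank white image for the desired output.
Get the rotated bounding rectangle of the contour in the input.
Get the normal (upright) bounding rectangle for the contour in the output.
Get the four bounding box corners for each.
Compute the affine transformation matrix between the two sets of four corner points.
Warp the input image with that matrix.
Use the output bounding box dimensions and upper left corner with NumPy slicing to transfer the region in the warped image to the same region in the white output image.
Repeat for each text contour, using the resulting image in place of the original white image as the new destination image.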
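So here is a simulation to show you how.
Source text image:
Source text image with the red rotated rectangle:
Desired bounding rectangle in the white destination image:
Text warped and transferred into the desired rectangle region of the white image:
Code: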
import cv2
import numpy as np
# Read source text image.
src = cv2.imread('text_on_white.png')
hs, ws, cs = src.shape
# Read same text image with red rotated bounding box drawn.
src2 = cv2.imread('text2_on_white.png')
# create white destination image
dst = np.full((hs,ws,cs), (255,255,255), dtype=np.uint8)
# define coordinates of bounding box in src
src_pts = np.float32([[51,123], [298,102], [300,135], [54,157]])
# size and placement of text in dest (i.e. its bounding box):
xd = 50
yd = 200
wd = 249
hd = 35  # rows 200..234 inclusive, consistent with dst_pts below
dst_pts = np.float32([[50,200], [298,200], [298,234], [50,234]])
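# get affine transform (rotation, uniform scale and translation only)
# cv2.estimateRigidTransform(..., 0) was removed in OpenCV 4; cv2.estimateAffinePartial2D is its replacement
matrix, _ = cv2.estimateAffinePartial2D(src_pts, dst_pts)
# warp the source image (the interpolation mode must be passed as flags=, since the 4th positional argument is dst)
src_warped = cv2.warpAffine(src, matrix, (ws,hs), flags=cv2.INTER_AREA, borderValue=(255,255,255))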
# do numpy slicing on warped source and place in white destination
dst[yd:yd+hd, xd:xd+wd] = src_warped[yd:yd+hd, xd:xd+wd]
# show results
cv2.imshow('SRC', src)
cv2.imshow('SRC2', src2)
cv2.imshow('SRC_WARPED', src_warped)
cv2.imshow('DST', dst)
cv2.waitKey(0)
cv2.destroyAllWindows()
# save results
cv2.imwrite('text_on_white_transferred.png', dst)