Computing bounding boxes from a mask image (TensorFlow or other)

Question  Votes: 0  Answers: 2

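I am looking for a way to convert a mask (a Height x Width boolean image) into a series of bounding boxes (see the hand-drawn example picture below), where the boxes enclose the "islands of truth".

Specifically, I am looking for a method that works with standard TensorFlow ops (though all input is welcome). I want this so that I can convert my model to TFLite without adding custom ops and recompiling from source. More generally, it would be good to know the different ways of doing this.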

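Notes:

  • I already have a solution involving non-standard TensorFlow, based on tfa.image.connected_components (see the solution here). However, that op is not included in TensorFlow Lite. It also feels like it does something slightly harder than necessary (finding connected components is harder than just outlining blobs on an image without worrying about whether they are connected).

  • I know I have not specified exactly how I want the boxes to be generated (e.g. whether separate "yin-yang-style" connected components should get separate boxes even if they overlap, etc.). Really, I am not worried about the details, only that the resulting boxes look "reasonable".

  • Some related questions (please read before marking as duplicate!):

  • Ideally I am looking for something that does not need training (e.g. YOLO-style regression) and just works out of the box (heh).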
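To make the "outline blobs without worrying about connectivity" idea from the notes concrete, here is a minimal NumPy sketch of one possible approach: split the mask on empty columns, then split each column strip on empty rows. This is an illustration only, not the tfa.image.connected_components solution mentioned above, and it has not been checked against the TensorFlow Lite op set. The helper names `runs` and `projection_boxes` are made up for this example.

```python
import numpy as np


def runs(active):
    """Return (start, stop) pairs for each run of True in a 1-D boolean array (stop is exclusive)."""
    # Pad with False on both sides so runs touching the border still produce both edges.
    padded = np.concatenate([[False], active, [False]]).astype(np.int8)
    edges = np.diff(padded)
    starts = np.nonzero(edges == 1)[0]   # rising edges: first index of each run
    stops = np.nonzero(edges == -1)[0]   # falling edges: first index after each run
    return list(zip(starts.tolist(), stops.tolist()))


def projection_boxes(mask):
    """Return (left, top, right, bottom) boxes, right/bottom exclusive, from a 2-D boolean mask."""
    boxes = []
    # Step 1: split on columns that contain no True pixel at all.
    for left, right in runs(mask.any(axis=0)):
        strip = mask[:, left:right]
        # Step 2: within each column strip, split on rows that contain no True pixel.
        for top, bottom in runs(strip.any(axis=1)):
            # Step 3: tighten the horizontal extent to the pixels actually in this piece.
            cols = np.nonzero(strip[top:bottom].any(axis=0))[0]
            boxes.append((left + int(cols[0]), top, left + int(cols[-1]) + 1, bottom))
    return boxes
```

Blobs that overlap in both their row and column projections (for example, an L-shape wrapped around a dot) end up in a single box with this sketch, which may or may not count as "reasonable" for a given use case.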

Edit: here is an example mask image: https://github.com/petered/data/blob/master/images/example_mask3.png. It can be loaded into a mask with:

import os
import cv2

mask = cv2.imread(os.path.expanduser('~/Downloads/example_mask3.png')).mean(axis=2) > 50
tensorflow computer-vision image-segmentation tensorflow-lite bounding-box
2 Answers
1
votes

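OK, I am not sure whether this can be done with TensorFlow ops alone, but here is a Python/NumPy implementation (it uses a very inefficient double for-loop). In principle, if vectorized (again, not sure whether that is possible) or written in C, it should be fast, since it only makes 2 passes over the pixels to compute the boxes.

I am not sure whether this algorithm has an existing name, but if not, I would call it "Downright Boxing", since it involves extending mask segments down and to the right to find the boxes. Here is the result for the mask in the question (with some extra shapes added as examples):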

import numpy as np


def mask_to_boxes(mask: np.ndarray) -> np.ndarray:
    """
    Convert a boolean (Height x Width) mask into a (N x 4) array of NON-OVERLAPPING bounding boxes
    surrounding "islands of truth" in the mask.  Boxes indicate the (Left, Top, Right, Bottom) bounds
    of each island, with Right and Bottom being NON-INCLUSIVE (ie they point to the indices AFTER the island).

    This algorithm (Downright Boxing) does not necessarily put separate connected components into
    separate boxes.

    You can "cut out" the island-masks with
        boxes = mask_to_boxes(mask)
        island_masks = [mask[t:b, l:r] for l, t, r, b in boxes]
    """
    max_ix = max(s+1 for s in mask.shape)   # Use this to represent background
    # These arrays will be used to carry the "box start" indices down and to the right.
    x_ixs = np.full(mask.shape, fill_value=max_ix)
    y_ixs = np.full(mask.shape, fill_value=max_ix)

    # Propagate the earliest x-index in each segment to the bottom-right corner of the segment
    for i in range(mask.shape[0]):
        x_fill_ix = max_ix
        for j in range(mask.shape[1]):
            above_cell_ix = x_ixs[i-1, j] if i>0 else max_ix
            still_active = mask[i, j] or ((x_fill_ix != max_ix) and (above_cell_ix != max_ix))
            x_fill_ix = min(x_fill_ix, j, above_cell_ix) if still_active else max_ix
            x_ixs[i, j] = x_fill_ix

    # Propagate the earliest y-index in each segment to the bottom-right corner of the segment
    for j in range(mask.shape[1]):
        y_fill_ix = max_ix
        for i in range(mask.shape[0]):
            left_cell_ix = y_ixs[i, j-1] if j>0 else max_ix
            still_active = mask[i, j] or ((y_fill_ix != max_ix) and (left_cell_ix != max_ix))
            y_fill_ix = min(y_fill_ix, i, left_cell_ix) if still_active else max_ix
            y_ixs[i, j] = y_fill_ix

    # Find the bottom-right corners of each segment
    new_xstops = np.diff((x_ixs != max_ix).astype(np.int32), axis=1, append=False)==-1
    new_ystops = np.diff((y_ixs != max_ix).astype(np.int32), axis=0, append=False)==-1
    corner_mask = new_xstops & new_ystops
    y_stops, x_stops = np.array(np.nonzero(corner_mask))

    # Extract the boxes, getting the top-left corners from the index arrays
    x_starts = x_ixs[y_stops, x_stops]
    y_starts = y_ixs[y_stops, x_stops]
    ltrb_boxes = np.hstack([x_starts[:, None], y_starts[:, None], x_stops[:, None]+1, y_stops[:, None]+1])
    return ltrb_boxes
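The docstring above promises two things: boxes are (Left, Top, Right, Bottom) with Right and Bottom non-inclusive, and boxes do not overlap. A small sanity checker for those two properties might look like the sketch below. The function names `boxes_cover_mask` and `boxes_overlap` are hypothetical helpers, not part of the answer's code, and they accept any iterable of LTRB boxes.

```python
import numpy as np


def boxes_cover_mask(mask, boxes):
    """True if every True pixel of `mask` lies inside at least one (l, t, r, b) box (r, b exclusive)."""
    covered = np.zeros(mask.shape, dtype=bool)
    for l, t, r, b in boxes:
        covered[t:b, l:r] = True  # same slicing convention as the "cut out" example in the docstring
    return bool(np.all(covered[mask]))


def boxes_overlap(boxes):
    """True if any two half-open boxes share at least one pixel."""
    boxes = list(boxes)
    for i, (l1, t1, r1, b1) in enumerate(boxes):
        for l2, t2, r2, b2 in boxes[i + 1:]:
            # Half-open intervals overlap only when each one starts before the other stops.
            if l1 < r2 and l2 < r1 and t1 < b2 and t2 < b1:
                return True
    return False
```

Because Right and Bottom are exclusive, two boxes that merely touch, such as (0, 0, 2, 2) and (2, 0, 4, 2), are not reported as overlapping.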



0
votes

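import cv2
import matplotlib.pyplot as plt
from skimage.measure import label, regionprops
# from skimage.morphology import label

# Load the mask image and binarize it (values above `thresh` become 255).
mask_0 = cv2.imread('delete.png')
thresh = 127
mask_0 = cv2.threshold(mask_0, thresh, 255, cv2.THRESH_BINARY)[1]
mask_1 = mask_0[:, :, 0]  # use a single channel as the 2-D mask

# Label the connected regions and compute their properties.
lbl_0 = label(mask_1)
props = regionprops(lbl_0)
for prop in props:
    print('Found bbox', prop.bbox)
    # prop.bbox is (min_row, min_col, max_row, max_col); cv2 points are (x, y).
    cv2.rectangle(mask_0, (prop.bbox[1], prop.bbox[0]), (prop.bbox[3], prop.bbox[2]), (255, 0, 0), 2)
plt.imshow(mask_0)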
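The index shuffling in the cv2.rectangle call above comes from the bbox ordering: scikit-image documents `bbox` as (min_row, min_col, max_row, max_col) with the max values exclusive. If the boxes need to match the (Left, Top, Right, Bottom) layout used elsewhere on this page, a tiny conversion helper could be used. The name `skimage_bbox_to_ltrb` is made up for this sketch.

```python
def skimage_bbox_to_ltrb(bbox):
    """Reorder a regionprops-style (min_row, min_col, max_row, max_col) bbox to (left, top, right, bottom)."""
    min_row, min_col, max_row, max_col = bbox
    # Columns are x (left/right) and rows are y (top/bottom); the exclusive upper bounds carry over unchanged.
    return (min_col, min_row, max_col, max_row)
```

With this, `[skimage_bbox_to_ltrb(p.bbox) for p in props]` would give the boxes in LTRB order.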
