Bounding box vertex order for a Char

Question

The Google Vision API documentation states that the vertices of a detected character will always be in the same order:

// The bounding box for the symbol.
// The vertices are in the order of top-left, top-right, bottom-right,
// bottom-left. When a rotation of the bounding box is detected the rotation
// is represented as around the top-left corner as defined when the text is
// read in the 'natural' orientation.
// For example:
//   * when the text is horizontal it might look like:
//      0----1
//      |    |
//      3----2
//   * when it's rotated 180 degrees around the top-left corner it becomes:
//      2----3
//      |    |
//      1----0
//   and the vertice order will still be (0, 1, 2, 3).

However, I sometimes see a different vertex order. Here is an example of two characters from the same image that have the same orientation:

[x:778 y:316  x:793 y:316  x:793 y:323  x:778 y:323 ]
0----1
|    |
3----2

and

[x:857 y:295  x:857 y:287  x:874 y:287  x:874 y:295 ]
1----2
|    |
0----3

Why is the vertex order different, and not as described in the documentation?

Answer 1

This appears to be a bug in the Vision API. The workaround is to detect the image orientation and then rearrange the vertices into the correct order.

Unfortunately, the Vision API does not provide the image orientation in its output, so I had to write code to detect it.

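Horizontal versus vertical orientation can be detected by comparing character height and width: the height is usually greater than the width.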
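As a rough illustration of the height-versus-width comparison, here is a minimal Python sketch. It assumes each symbol box is a list of four (x, y) tuples already extracted from the API response; the names `box_size` and `guess_text_axis` are hypothetical helpers, not part of the Vision API, and the majority vote is just one possible way to turn "usually" into a decision.

```python
from typing import Sequence, Tuple

Point = Tuple[int, int]


def box_size(box: Sequence[Point]) -> Tuple[int, int]:
    """Return (width, height) of the axis-aligned extent of a 4-vertex box."""
    xs = [x for x, _ in box]
    ys = [y for _, y in box]
    return max(xs) - min(xs), max(ys) - min(ys)


def guess_text_axis(boxes: Sequence[Sequence[Point]]) -> str:
    """Majority vote over symbol boxes (hypothetical helper).

    Returns 'horizontal' when most boxes are taller than wide, which is
    the usual shape of an upright character, and 'vertical' otherwise.
    """
    if not boxes:
        raise ValueError("need at least one symbol box")
    tall = 0
    for box in boxes:
        width, height = box_size(box)
        if height > width:
            tall += 1
    return "horizontal" if 2 * tall > len(boxes) else "vertical"
```

Because the function only looks at each box's extent, the result does not depend on the order in which the vertices were returned. Wide glyphs and multi-character boxes can skew the vote, so it is probably best applied to many single-symbol boxes at once.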

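The next step is to detect the direction of the text. For example, with a vertical image orientation, the text may run from top to bottom or from bottom to top.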

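Most characters in the output seem to appear in the natural order, so we can detect the text direction by looking at statistics. For example:

Line 1 has Y coordinate 1000
Line 2 has Y coordinate 900
Line 3 has Y coordinate 950
Line 4 has Y coordinate 800

From this we can see that the image is upside down.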
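The line-coordinate statistic described above could be sketched like this in Python. It assumes you have already collected one representative Y coordinate per text line, in the order the API returned the lines; `is_upside_down` is a hypothetical name, and counting up-steps against down-steps is only one simple way to read the trend.

```python
from typing import Sequence


def is_upside_down(line_ys: Sequence[float]) -> bool:
    """Guess whether horizontal text is upside down (hypothetical helper).

    In image coordinates Y grows downward, so lines read in their natural
    order should have increasing Y. If most consecutive steps decrease
    instead, the image is probably rotated by 180 degrees.
    """
    decreasing = 0
    increasing = 0
    for prev, curr in zip(line_ys, line_ys[1:]):
        if curr < prev:
            decreasing += 1
        elif curr > prev:
            increasing += 1
    return decreasing > increasing


# The four lines from the example: two of the three steps go upward.
print(is_upside_down([1000, 900, 950, 800]))
```

For text that runs vertically, the same idea would apply to X coordinates instead of Y.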

Answer 2

You need to reorder the four vertices as A-B-C-D (clockwise in image coordinates, where Y grows downward, from A to D) such that:
A: min X, min Y
B: max X, min Y
C: max X, max Y
D: min X, max Y

Then save them to your rectangle object.
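A minimal Python sketch of this min/max reordering, assuming the box is given as four (x, y) tuples; `normalize_box` is a hypothetical name. Note that it rebuilds the axis-aligned rectangle around the points, so it suits boxes like the ones in the question but discards any rotation information for tilted boxes.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]


def normalize_box(vertices: Sequence[Point]) -> List[Point]:
    """Return the vertices as A, B, C, D (hypothetical helper).

    A: min X, min Y (top-left)      B: max X, min Y (top-right)
    C: max X, max Y (bottom-right)  D: min X, max Y (bottom-left)
    """
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    min_x, max_x = min(xs), max(xs)
    min_y, max_y = min(ys), max(ys)
    return [(min_x, min_y), (max_x, min_y), (max_x, max_y), (min_x, max_y)]


# Second box from the question, which the API returned starting at the
# bottom-left corner.
box = [(857, 295), (857, 287), (874, 287), (874, 295)]
print(normalize_box(box))
# [(857, 287), (874, 287), (874, 295), (857, 295)]
```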

Update: to get the A-B-C-D order above, you can sort the vertices by their distance from O(0,0).

Answer 3

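I ran into this bounding box vertex order problem years after the original post. I have seen two instances of it when using the text detection call. In both cases the bounding box (bb) represented a multi-character word. Instead of starting at the upper-left (UL) corner of the word and listing the vertices clockwise (CW) around it, the list started at the lower-left (LL) corner.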

Are there any suggestions other than testing the symbol aspect ratio? That approach does not seem very reliable to me.

Thanks.
