Google Vision API returns empty bounding box vertices and returns normalized_vertices instead

Question

I am using vision.enums.Feature.Type.DOCUMENT_TEXT_DETECTION to extract some dense text from a PDF document. Here is my code:

from google.cloud import vision

# Client used to send the annotation request below.
vision_client = vision.ImageAnnotatorClient()

def extract_text(bucket, filename, mimetype):
    """OCR with PDF/TIFF as source files on GCS."""
    print('Looking for text in PDF {}'.format(filename))

    # Detect text
    feature = vision.types.Feature(
        type=vision.enums.Feature.Type.DOCUMENT_TEXT_DETECTION)
    # Extract text from source bucket
    gcs_source_uri = 'gs://{}/{}'.format(bucket, filename)
    gcs_source = vision.types.GcsSource(uri=gcs_source_uri)
    input_config = vision.types.InputConfig(
        gcs_source=gcs_source, mime_type=mimetype)

    request = vision.types.AnnotateFileRequest(
        features=[feature], input_config=input_config)

    print('Waiting for the OCR operation to finish.')
    ocr_response = vision_client.batch_annotate_files(requests=[request])

    print('OCR completed.')
    return ocr_response

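In the response, I expected to find a populated vertices list in ocr_response.responses[1...n].pages[1...n].blocks[1...n].bounding_box, but this list is empty. Instead there is a normalized_vertices list, whose entries are normalized vertices between 0 and 1. Why is that? Why is the vertices structure empty? I am following this article, where the author uses vertices, but I don't understand why I am not getting them. To convert them to non-normalized form, I multiplied the normalized vertices by the height and width, but the result is terrible: the boxes are not positioned correctly.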

Answer 1

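To convert a NormalizedVertex to a Vertex, multiply the x field of the NormalizedVertex by the width to get the x field of the Vertex, and multiply the y field of the NormalizedVertex by the height to get the y field of the Vertex.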
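The conversion described above can be sketched in a few lines of Python. This is only an illustration: the function name denormalize_vertices, the sample box, and the 1000 x 800 page size are made-up values, and the vertices are represented as plain (x, y) tuples rather than API objects.

```python
def denormalize_vertices(normalized, width, height):
    """Scale (x, y) pairs in the range [0, 1] to absolute coordinates.

    x is multiplied by the width and y by the height, then rounded
    to the nearest integer.
    """
    return [(round(x * width), round(y * height)) for x, y in normalized]


# Hypothetical normalized box and page size, for illustration only.
normalized_box = [(0.1, 0.2), (0.5, 0.2), (0.5, 0.4), (0.1, 0.4)]
page_width, page_height = 1000, 800

pixel_box = denormalize_vertices(normalized_box, page_width, page_height)
print(pixel_box)
```

With a real response, the (x, y) pairs would presumably come from each entry of bounding_box.normalized_vertices. The width and height have to be those of the surface the boxes are drawn on: if the PDF page is rendered to an image at a different resolution, using the rendered image's dimensions rather than the page dimensions reported in the response may be what fixes misplaced boxes.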

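The reason you get normalized vertices while the author of the Medium article gets vertices is that the TEXT_DETECTION and DOCUMENT_TEXT_DETECTION models were upgraded to newer versions as of May 15, 2020, whereas the Medium article was written on December 25, 2018.

To get results from the old model, you must specify "builtin/legacy_20190601" in the model field of the Feature object.

However, Google's documentation mentions that the legacy model will no longer be offered after November 15, 2020.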
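As a rough sketch of where the model field sits, the snippet below builds the JSON body of a file annotation request as a plain Python dictionary. The helper name build_annotate_file_request and the gs://my-bucket/sample.pdf URI are hypothetical, and the camelCase field names reflect my understanding of the REST representation of the request, so treat them as assumptions to check against the API reference. Given the deprecation date above, the service may no longer accept the legacy model value.

```python
import json


def build_annotate_file_request(gcs_uri, mime_type, model=None):
    """Build a request body dict for file annotation (illustrative only).

    When `model` is given, it is placed in the feature's "model" field.
    """
    feature = {"type": "DOCUMENT_TEXT_DETECTION"}
    if model is not None:
        feature["model"] = model
    return {
        "requests": [
            {
                "inputConfig": {
                    "gcsSource": {"uri": gcs_uri},
                    "mimeType": mime_type,
                },
                "features": [feature],
            }
        ]
    }


# Hypothetical bucket and file name, for illustration only.
body = build_annotate_file_request(
    "gs://my-bucket/sample.pdf",
    "application/pdf",
    model="builtin/legacy_20190601",
)
print(json.dumps(body, indent=2))
```

In the Python client code from the question, the equivalent would presumably be passing model='builtin/legacy_20190601' alongside type when constructing vision.types.Feature.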
