Azure Form Recognizer training: no data found


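I am trying to train Form Recognizer using the browser API console (https://eastus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api/operations/TrainCustomModel/console). I have uploaded the converted images to a container and created a SAS. The browser API console generates the following HTTP request: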

POST https://eastus.api.cognitive.microsoft.com/formrecognizer/v1.0-preview/custom/train?source=https://pythonimages.blob.core.windows.net/?sv=2019-02-02&ss=bfqt&srt=sco&sp=rl&se=2020-01-22T00:23:33Z&st=2020-01-21T16:23:33Z&spr=https&sig=••••••••••••••••••••••••••••••••&prefix=images HTTP/1.1
Host: eastus.api.cognitive.microsoft.com
Content-Type: application/json
Ocp-Apim-Subscription-Key: ••••••••••••••••••••••••••••••••

{
  "source": "string",
  "sourceFilter": {
    "prefix": "string",
    "includeSubFolders": true
  }
}

However, the response I get is:

Transfer-Encoding: chunked
x-envoy-upstream-service-time: 4
apim-request-id: 5ad37aa2-e251-4b61-98ae-023930b47d27
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
x-content-type-options: nosniff
Date: Tue, 21 Jan 2020 16:25:03 GMT
Content-Type: application/json; charset=utf-8

{
  "error": {
    "code": "1004",
    "message": "Dataset path must be relative to local input mount path '/input' if local data is referenced."
  }
}

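I don't understand why it seems to be looking for the data locally. I have tried variations of the SAS, for example including the container name (images) in the blob HTTP address rather than as a query parameter, but so far without success.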

I have also tried the Python/REST route (described here: https://docs.microsoft.com/en-gb/azure/cognitive-services/form-recognizer/quickstarts/python-train-extract-v1), which results in a different error:

Response status code: 408
Response body: {'error': {'code': '1011', 'innerError': {'requestId': 'e7f9ef9f-97bc-4b6a-86f3-0b29c9591c87'}, 'message': 'The operation exceeded allowed time limit and was canceled. The common reasons are that the data source is too large or contains unsupported content. Please check that your request conforms to service limits and retry with redacted data source.'}}

For completeness, the code I used is as follows (key and signature masked with *):

########### Python Form Recognizer Train #############
from requests import post as http_post

# Endpoint URL
base_url = r"https://markusformsrecognizer.cognitiveservices.azure.com/" + "/formrecognizer/v1.0-preview/custom"
source = r"https://pythonimages.blob.core.windows.net/images?sv=2019-02-02&ss=bfqt&srt=sco&sp=rl&se=2020-01-22T15:37:26Z&st=2020-01-22T07:37:26Z&spr=https&sig=*********************************"
headers = {
    # Request headers
    'Content-Type': 'application/json',
    'Ocp-Apim-Subscription-Key': '*********************************',
}
url = base_url + "/train" 
body = {"source": source}
try:
    resp = http_post(url = url, json = body, headers = headers)
    print("Response status code: %d" % resp.status_code)
    print("Response body: %s" % resp.json())
except Exception as e:
    print(str(e))
azure-storage-blobs microsoft-cognitive azure-blob-storage azure-cognitive-services
1 Answer

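For "source": "string", follow these steps to obtain the source path that contains your training documents. Replace the placeholder with the shared access signature (SAS) URL of your Azure Blob storage container. To retrieve the SAS URL, open Microsoft Azure Storage Explorer, right-click your container, and select "Get shared access signature". Make sure the "Read" and "List" permissions are checked, then click "Create". Then copy the value from the URL section. It should have the form: https://<storage account>.blob.core.windows.net/<container name>?<SAS value>.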
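
Note that in the console request from the question, the JSON body still carries the placeholder "source": "string", and the SAS URL in the query string has no container name in its path; either may be why the service falls back to treating the source as a local path. As a rough sketch of the checks described above (the helper names check_container_sas_url and build_train_body are made up for illustration and are not part of any Azure SDK), you could sanity-check the SAS URL and build the request body like this:

```python
from urllib.parse import urlsplit, parse_qs


def check_container_sas_url(sas_url):
    """Return a list of problems found in a container SAS URL (empty if none)."""
    problems = []
    parts = urlsplit(sas_url)
    query = parse_qs(parts.query)

    # The path should name a container:
    # https://<storage account>.blob.core.windows.net/<container name>
    container = parts.path.strip("/")
    if not container:
        problems.append("no container name in the URL path")

    # 'sp' holds the granted permissions; the answer asks for Read (r) and List (l).
    permissions = query.get("sp", [""])[0]
    for flag, name in (("r", "read"), ("l", "list")):
        if flag not in permissions:
            problems.append("missing %s permission" % name)

    # 'sig' is the signature; without it the URL is not a usable SAS URL.
    if "sig" not in query:
        problems.append("no signature (sig) in the query string")
    return problems


def build_train_body(sas_url, prefix=None):
    """Build the JSON body for the train call with the real SAS URL as "source"."""
    body = {"source": sas_url}
    if prefix:
        # Field names taken from the request body shown in the question.
        body["sourceFilter"] = {"prefix": prefix, "includeSubFolders": True}
    return body
```

Run against the URL shape used in the console request (https://pythonimages.blob.core.windows.net/?sv=...), the first helper would report the missing container name. The body returned by build_train_body can be passed as the json argument of the POST in the question's script. This only checks the shape of the URL; it does not verify that the signature is valid or unexpired.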
