Unable to create and run an Azure ML Text NER job from a Python script

Question

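I am trying to trigger a text NER job on the Azure ML service from a Python script, uploading the training and validation folders from local paths to the datastore. The code is as follows: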

import os

from azure.identity import DefaultAzureCredential
from azure.ai.ml import automl, Input, MLClient
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.entities import ResourceConfiguration

# Placeholders are quoted so the script parses; replace them with real values.
os.environ["AZURE_CLIENT_ID"] = "<my_client_id>"
os.environ["AZURE_TENANT_ID"] = "<my_tenant_id>"
os.environ["AZURE_CLIENT_SECRET"] = "<my_client_secret_id>"

subscription_id = "<my_subscription_id>"
resource_group = "<my_resource_group_id>"
workspace = "<my_workspace_id>"

ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group, workspace)

training_mltable_path = "./training-mltable-folder/"
validation_mltable_path = "./validation-mltable-folder/"

my_training_data_input = Input(type=AssetTypes.MLTABLE, path=training_mltable_path)
my_validation_data_input = Input(type=AssetTypes.MLTABLE, path=validation_mltable_path)

text_ner_job = automl.text_ner(
    name="dpv2-nlp-text-ner-job-01",
    experiment_name="dpv2-nlp-text-ner-experiment",
    training_data=my_training_data_input,
    validation_data=my_validation_data_input
)

text_ner_job.set_limits(timeout_minutes=60)
text_ner_job.resources = ResourceConfiguration(instance_type="Standard_NC6s_v3")

returned_job = ml_client.jobs.create_or_update(
    text_ner_job
)

print(f"Created job: {returned_job}")

ml_client.jobs.stream(returned_job.name)

However, when I run this code, it returns the following error:

Traceback (most recent call last):
  ...
    raise JobException(
azure.ai.ml.exceptions.JobException: Exception : 
 {
    "error": {
        "code": "UserError",
        "message": "Failed to validate user configuration and data.\n 1. The data file does not exists. Ensure data correctness and availability.",
        "message_parameters": {},
        "target": "ValidationService",
        "details": [
            {
                "code": "UserError",
                "severity": 2,
                "message": "The data file does not exists. Ensure data correctness and availability.",
                "message_format": "The data file does not exists. Ensure data correctness and availability.",
                "message_parameters": {
                    "0": "System.Collections.Generic.Dictionary`2[System.String,System.String]"
                },
                "target": "training_data",
                "details": [
                    {
                        "message": "null",
                        "message_parameters": {},
                        "details": []
                    }
                ],
                "inner_error": {
                    "code": "BadArgument",
                    "inner_error": {
                        "code": "ArgumentInvalid",
                        "inner_error": {
                            "code": "DatasetInvalidPath"
                        }
                    }
                }
            }
        ]
    },
    "time": "0001-01-01T00:00:00.000Z"
}

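I am fairly confident that the uploaded data is correctly formatted for this kind of task, so this is probably an availability issue.

Any ideas on how to solve this?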

Answer 1

Your input data must be of type AssetTypes.MLTABLE.

Each input folder should contain a file named MLTable that specifies the paths to the data files.

Check your input folders and change them to match this layout.
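A quick local check can catch a missing MLTable file or a missing data file before the job is submitted. The following is only a minimal sketch: `check_mltable_folder` is a hypothetical helper, not part of the Azure SDK, and it assumes the MLTable file lists local data files as `file:` entries under `paths:`. It scans lines naively instead of parsing YAML, so treat it as a sanity check rather than a validator.

```python
import re
from pathlib import Path
from typing import List


def check_mltable_folder(folder: str) -> List[str]:
    """Return a list of problems found in a local MLTable folder (empty if none).

    Hypothetical helper for illustration; assumes 'file: <path>' entries.
    """
    root = Path(folder)
    if not root.is_dir():
        return [f"folder not found: {root}"]

    mltable_file = root / "MLTable"
    if not mltable_file.is_file():
        return [f"no MLTable file in {root}"]

    # Naive line scan for "- file: <path>" entries; this is not a YAML parser.
    pattern = re.compile(r"^\s*-?\s*file:\s*(\S+)\s*$")
    referenced = []
    for line in mltable_file.read_text(encoding="utf-8").splitlines():
        match = pattern.match(line)
        if match:
            referenced.append(match.group(1).strip("'\""))

    problems = []
    if not referenced:
        problems.append("MLTable lists no 'file:' entries")
    for rel in referenced:
        if "://" in rel:
            continue  # remote URI; cannot be checked locally
        if not (root / rel).is_file():
            problems.append(f"referenced data file missing: {rel}")
    return problems
```

Calling it on "./training-mltable-folder/" and "./validation-mltable-folder/" before `ml_client.jobs.create_or_update` should return an empty list for each folder if the layout is as described.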

Alternatively, you can create an MLTable-type data asset and use its path, as shown below.

my_training_data_input = Input(type=AssetTypes.MLTABLE, path="azureml://datastores/workspaceblobstore/paths/my_training_mltable")

my_validation_data_input = Input(type=AssetTypes.MLTABLE, path="azureml://datastores/workspaceblobstore/paths/my_validation_mltable")
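Both paths above follow the same shape: a datastore name followed by a path relative to that datastore. As a small illustration, the string can be assembled with a helper like the one below. `datastore_uri` is a hypothetical name introduced here, not an SDK function, and it only formats the string; it does not check that the location exists.

```python
def datastore_uri(datastore: str, relative_path: str) -> str:
    """Format an azureml:// datastore URI in the short form used above.

    Hypothetical helper for illustration; performs no remote validation.
    """
    if not datastore:
        raise ValueError("datastore name must not be empty")
    cleaned = relative_path.strip("/")
    if not cleaned:
        raise ValueError("relative path must not be empty")
    return f"azureml://datastores/{datastore}/paths/{cleaned}"
```

For example, `datastore_uri("workspaceblobstore", "my_training_mltable")` yields the training path used in the answer.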

To create the data asset:

Go to Data > click Create > choose the source files option > select the workspace blobstore location > upload your files and click Create.

Once the asset is created, you get a path for it. Use that path in your code.
