SageMaker Monitor - monitoring dataset format as gz

Question (votes: 0, answers: 1)

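I created a monitoring schedule to monitor the predictions of a batch transform job. When the dataset_format of the BatchTransformInput is csv, the schedule runs fine. However, my batch job is part of a workflow that takes gz-format input.

The documentation indicates that MonitoringDatasetFormat supports only csv, json, and parquet. Can I define it as gz?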

from sagemaker.model_monitor import DefaultModelMonitor
from sagemaker.model_monitor import CronExpressionGenerator
from sagemaker.model_monitor import BatchTransformInput
from sagemaker.model_monitor import MonitoringDatasetFormat
from time import gmtime, strftime

my_monitor = DefaultModelMonitor(
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    volume_size_in_gb=20,
    max_runtime_in_seconds=3600,
)

my_monitor.create_monitoring_schedule(
    monitor_schedule_name=mon_schedule_name,
    # Inputs to run the monitoring schedule on the batch transform
    batch_transform_input=BatchTransformInput(
        data_captured_destination_s3_uri=s3_capture_upload_path,
        destination="/opt/ml/processing/input",
        dataset_format=MonitoringDatasetFormat.csv(header=False),
    ),
    output_s3_uri=s3_report_path,
    statistics=statistics_path,
    constraints=constraints_path,
    schedule_cron_expression=CronExpressionGenerator.hourly(),
    enable_cloudwatch_metrics=True,
)

python amazon-sagemaker mlops amazon-sagemaker-clarify
1 Answer
0 votes

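The default Model Monitor supports only these formats. I think you could add a post-processing step that converts the gz data to one of them. See the link below on pre- and post-processing - https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-pre-and-post-processing.html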
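
As a rough illustration of the conversion idea, here is a minimal sketch that decompresses gz files into plain CSV files using only the Python standard library. It is not part of the SageMaker API: the function name gunzip_to_csv and the directory arguments are hypothetical, and it assumes each .gz file contains CSV text.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_to_csv(src_dir: str, dst_dir: str) -> list[str]:
    """Decompress every *.gz file in src_dir into dst_dir and return the paths written.

    Hypothetical helper for illustration; assumes each archive holds CSV text.
    """
    out_dir = Path(dst_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for gz_path in sorted(Path(src_dir).glob("*.gz")):
        # "part-0.csv.gz" becomes "part-0.csv"; "part-1.gz" becomes "part-1.csv"
        stem = gz_path.name[: -len(".gz")]
        target = out_dir / (stem if stem.endswith(".csv") else stem + ".csv")
        with gzip.open(gz_path, "rb") as src, open(target, "wb") as dst:
            # Stream the bytes so large files are not loaded into memory at once
            shutil.copyfileobj(src, dst)
        written.append(str(target))
    return written
```

The decompressed files could then be placed wherever the workflow expects CSV data. How that step is wired into the batch transform or the monitoring schedule depends on the pipeline and is outside this sketch.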
