Dynamic Partitioning for Kinesis Firehose


I'm trying to create a Kinesis delivery stream with the AWS CLI that partitions its output by the string field created_at. I get the following error message:

An error occurred (InvalidArgumentException) when calling the CreateDeliveryStream operation: S3 Prefix should contain Dynamic Partitioning namespaces when Dynamic Partitioning is enabled

I'm new to AWS and I'm struggling to fix this error. Can anyone help?

My JSON looks like this:

{
    "DeliveryStreamName": "KDS-S3-silver",
    "DeliveryStreamType": "KinesisStreamAsSource",
    "KinesisStreamSourceConfiguration": {
        "KinesisStreamARN": "my-stream",
        "RoleARN": "my-role"
    },
    "ExtendedS3DestinationConfiguration": {
        "RoleARN": "my-role",
        "BucketARN": "arn:aws:s3:::my-bucket-transformed",
        "Prefix": "created_at=!{partitionKeyFromQuery:created_at}/",
        "ErrorOutputPrefix": "error/result=!{firehose:error-output-type}/!{timestamp:yyyy/MM/dd}",
        "BufferingHints": {
            "IntervalInSeconds": 60,
            "SizeInMBs": 64
        },
        "CompressionFormat": "UNCOMPRESSED",
        "ProcessingConfiguration": {
            "Enabled": true,
            "Processors": [
                {
                    "Parameters": [
                        {
                            "ParameterName": "LambdaArn",
                            "ParameterValue": "lambda-test"
                        }
                    ],
                    "Type": "Lambda"
                },
                {
                    "Parameters": [
                        {
                            "ParameterName": "MetadataExtractionQuery",
                            "ParameterValue": "{created_at:.data.created_at| strptime('%Y-%m-%dT%H:%M:%SZ')| strftime('%Y%m%d')}"
                        },
                        {
                            "ParameterName": "JsonParsingEngine",
                            "ParameterValue": "JQ-1.6"
                        }
                    ],
                    "Type": "MetadataExtraction"
                }
            ]
        },
        "S3BackupMode": "Enabled",
        "S3BackupConfiguration": {
            "RoleARN": "my-role",
            "BucketARN": "arn:aws:s3:::my-bucket-raw",
            "Prefix": "created_at=!{partitionKeyFromQuery:created_at}/",
            "ErrorOutputPrefix": "error/result=!{firehose:error-output-type}/!{timestamp:yyyy/MM/dd}"
        },
        "DynamicPartitioningConfiguration": {
            "Enabled": true
        }
    }
}

How should I specify the prefix to avoid this error?

amazon-web-services amazon-kinesis amazon-kinesis-firehose
1 Answer

You need to add the Dynamic Partitioning namespace to the Prefix field, like this:

# Yours:
"Prefix": "created_at=!{partitionKeyFromQuery:created_at}/",

# Corrected version:
"Prefix": "env=!{partitionKeyFromQuery:<any_column_that_can_be_used_as_namespace>}/created_at=!{partitionKeyFromQuery:created_at}/",

# Or you may be able to use the "created_at" column as the namespace directly:
"Prefix": "env=!{partitionKeyFromQuery:created_at}/"
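
Before calling the CLI, it may help to check locally which dynamic partitioning expressions a prefix string actually contains. The following is a minimal Python sketch, not a reproduction of Firehose's server-side validation. The helper names `partition_keys` and `has_dynamic_namespace` are hypothetical, and the sketch assumes the only dynamic partitioning namespaces are `partitionKeyFromQuery` and `partitionKeyFromLambda`.

```python
import re

# Namespaces assumed to count as dynamic partitioning namespaces.
DYNAMIC_NAMESPACES = ("partitionKeyFromQuery", "partitionKeyFromLambda")

# Matches prefix expressions such as !{partitionKeyFromQuery:created_at}
_EXPRESSION = re.compile(r"!\{(\w+):([^}]+)\}")


def partition_keys(prefix):
    """Return (namespace, key) pairs for the dynamic partitioning
    expressions found in an S3 prefix string (hypothetical helper)."""
    return [
        (namespace, key)
        for namespace, key in _EXPRESSION.findall(prefix)
        if namespace in DYNAMIC_NAMESPACES
    ]


def has_dynamic_namespace(prefix):
    """True if the prefix contains at least one dynamic partitioning expression."""
    return bool(partition_keys(prefix))


if __name__ == "__main__":
    for prefix in (
        "env=!{partitionKeyFromQuery:created_at}/",
        "error/result=!{firehose:error-output-type}/!{timestamp:yyyy/MM/dd}",
    ):
        print(prefix, "->", partition_keys(prefix))
```

Expressions in other namespaces, such as `!{firehose:...}` and `!{timestamp:...}`, are deliberately ignored, so a prefix that uses only those is reported as having no dynamic partitioning namespace.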