How do I use Firehose to convert streaming data to Parquet, and how do I create a Glue table schema for that streaming data?


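I need to ingest data from DynamoDB into S3 using a Kinesis Data Stream and a Firehose stream, converting it to Parquet along the way. I am having trouble setting up the Parquet conversion in the Firehose stream, because it requires me to select a Glue table that contains the schema of the data arriving in Firehose. I have tried many things, and the only results I have managed to get are output where every value is "None", or output that is not empty but is not the data I need. There is also the option of using Lambda to convert the records to Parquet, but I would still need to set up this schema.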

This is the data arriving in Firehose:

{"awsRegion":"us-west-2","eventID":"ebb59b0b-247d-49cc-83fc-ccc1988481ed","eventName":"INSERT","userIdentity":null,"recordFormat":"application/json","tableName":"table-oleg","dynamodb":{"ApproximateCreationDateTime":1710435429661487,"Keys":{"user_id":{"S":"nvbssdfbdfbsdvsdv"}},"NewImage":{"autonomy":{"S":"ssnvbnbdfbdfsdvsdv"},"user_id":{"S":"nvbssdfbdfbsdvsdv"}},"SizeBytes":74,"ApproximateCreationDateTimePrecision":"MICROSECOND"},"eventSource":"aws:dynamodb"}{"awsRegion":"us-west-2","eventID":"0f30f744-2d94-4494-bd55-e0278942cccb","eventName":"INSERT","userIdentity":null,"recordFormat":"application/json","tableName":"table-oleg","dynamodb":{"ApproximateCreationDateTime":1710435451018854,"Keys":{"user_id":{"S":"nvbssdfbdfbSVSV"}},"NewImage":{"autonomy":{"S":"ssnvbnbdfbdfsEV"},"user_id":{"S":"nvbssdfbdfbSVSV"}},"SizeBytes":67,"ApproximateCreationDateTimePrecision":"MICROSECOND"},"eventSource":"aws:dynamodb"}
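Note that the two records above arrive back to back, with no newline or comma between them. A minimal sketch of how such a blob could be split for inspection, using only the Python standard library; the `sample` string is a shortened stand-in for the real records, and the function name is hypothetical:

```python
import json


def split_concatenated_json(blob: str) -> list:
    """Split a string of back-to-back JSON objects into parsed objects."""
    decoder = json.JSONDecoder()
    records = []
    pos = 0
    while pos < len(blob):
        # Skip any whitespace or newlines between objects.
        while pos < len(blob) and blob[pos].isspace():
            pos += 1
        if pos >= len(blob):
            break
        # raw_decode parses one JSON value and returns the index where it ended.
        obj, pos = decoder.raw_decode(blob, pos)
        records.append(obj)
    return records


# Shortened stand-in for the two records above (only a few fields kept).
sample = (
    '{"eventName":"INSERT","dynamodb":{"NewImage":{"user_id":{"S":"a"}}}}'
    '{"eventName":"INSERT","dynamodb":{"NewImage":{"user_id":{"S":"b"}}}}'
)
events = split_concatenated_json(sample)
```

This is only a debugging aid for reading the raw objects; it does not change how Firehose itself delivers the data.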

I need to upload these two columns and their values to S3:

autonomy: ssnvbnbdfbdfsEV
user_id: nvbssdfbdfbSVSV
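One possible way to get from the nested DynamoDB event to these two flat columns is a Firehose data-transformation Lambda. The sketch below assumes the standard transformation contract (base64-encoded `data` in, `recordId` / `result` / `data` out) and assumes every attribute in `dynamodb.NewImage` is a single-typed value such as `{"S": "..."}`. The helper name `flatten_new_image` is hypothetical, and the code has not been run against a real stream:

```python
import base64
import json


def flatten_new_image(event_record: dict) -> dict:
    """Turn {"attr": {"S": "value"}} from dynamodb.NewImage into {"attr": "value"}."""
    new_image = event_record.get("dynamodb", {}).get("NewImage") or {}
    # Each attribute is a one-entry dict such as {"S": "..."}; keep only its value.
    # Numeric ("N") attributes stay strings here, as DynamoDB JSON encodes them.
    return {name: next(iter(typed.values())) for name, typed in new_image.items()}


def lambda_handler(event, context):
    output = []
    for record in event["records"]:
        payload = json.loads(base64.b64decode(record["data"]))
        flat = flatten_new_image(payload)
        if flat:
            data = base64.b64encode(json.dumps(flat).encode("utf-8")).decode("utf-8")
            output.append({"recordId": record["recordId"], "result": "Ok", "data": data})
        else:
            # Events without a NewImage (for example REMOVE) are dropped in this sketch.
            output.append(
                {"recordId": record["recordId"], "result": "Dropped", "data": record["data"]}
            )
    return {"records": output}
```

With a transformation like this in place, the records reaching the format-conversion step would have `autonomy` and `user_id` as top-level string fields, which is the shape a two-column Glue table can describe.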

These are my Firehose stream settings for conversion to Parquet

This is my JSON schema in the Glue table

These are my Glue table settings

Thanks in advance for your help!

I have tried many schema examples that I found on the internet. I also tried the example from the official AWS website:

{
    "$id": "https://example.com/person.schema.json",
    "$schema": "http://json-schema.org/draft-07/schema#",
    "title": "Person",
    "type": "object",
    "properties": {
        "firstName": {
            "type": "string",
            "description": "The person's first name."
        },
        "lastName": {
            "type": "string",
            "description": "The person's last name."
        },
        "age": {
            "description": "Age in years which must be equal to or greater than zero.",
            "type": "integer",
            "minimum": 0
        }
    }
}

But so far I have had no success...
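For comparison, here is a sketch of a table definition for just the two columns needed. It assumes that Firehose record format conversion reads column names and types from a Glue Data Catalog table, not from a JSON Schema document like the one above (worth confirming in the AWS documentation). It also assumes the incoming records already have `autonomy` and `user_id` as top-level string fields. The table name, database name and S3 location are hypothetical placeholders:

```python
def build_table_input(table_name: str, s3_location: str) -> dict:
    """Build a Glue TableInput describing two string columns stored as Parquet."""
    return {
        "Name": table_name,
        "TableType": "EXTERNAL_TABLE",
        "Parameters": {"classification": "parquet"},
        "StorageDescriptor": {
            # Column names must match the top-level JSON keys of the incoming records.
            "Columns": [
                {"Name": "autonomy", "Type": "string"},
                {"Name": "user_id", "Type": "string"},
            ],
            "Location": s3_location,
            "InputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
            "OutputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
            "SerdeInfo": {
                "SerializationLibrary": "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
            },
        },
    }


def create_table(database: str, table_input: dict) -> None:
    """Register the table in the Glue Data Catalog (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the dict builder works without AWS access

    boto3.client("glue").create_table(DatabaseName=database, TableInput=table_input)


# Hypothetical names; replace with your own table and bucket.
table_input = build_table_input("firehose_parquet_demo", "s3://example-bucket/parquet/")
```

Calling `create_table("my_database", table_input)` would then register the table, which could be selected in the Firehose format-conversion settings.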

amazon-web-services amazon-dynamodb aws-glue parquet amazon-kinesis-firehose
1 Answer
0 votes

I have the same problem! This is really much harder than it should be.
