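I need to export data from DynamoDB to S3 using a Kinesis Data Stream and a Firehose stream, converting it to Parquet along the way. I'm having trouble setting up the Parquet conversion in the Firehose stream, because I have to select a Glue table that contains the schema of the data arriving in Firehose. I've tried many things, and the only results I got were either data that was just "None", or data that was non-empty but not what I need. There is also the option of using a Lambda to convert the data to Parquet, but I would still need to set up this schema.
This is the data arriving in Firehose: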
{"awsRegion":"us-west-2","eventID":"ebb59b0b-247d-49cc-83fc-ccc1988481ed","eventName":"INSERT","userIdentity":null,"recordFormat":"application/json","tableName":"table-oleg","dynamodb":{"ApproximateCreationDateTime":1710435429661487,"Keys":{"user_id":{"S":"nvbssdfbdfbsdvsdv"}},"NewImage":{"autonomy":{"S":"ssnvbnbdfbdfsdvsdv"},"user_id":{"S":"nvbssdfbdfbsdvsdv"}},"SizeBytes":74,"ApproximateCreationDateTimePrecision":"MICROSECOND"},"eventSource":"aws:dynamodb"}{"awsRegion":"us-west-2","eventID":"0f30f744-2d94-4494-bd55-e0278942cccb","eventName":"INSERT","userIdentity":null,"recordFormat":"application/json","tableName":"table-oleg","dynamodb":{"ApproximateCreationDateTime":1710435451018854,"Keys":{"user_id":{"S":"nvbssdfbdfbSVSV"}},"NewImage":{"autonomy":{"S":"ssnvbnbdfbdfsEV"},"user_id":{"S":"nvbssdfbdfbSVSV"}},"SizeBytes":67,"ApproximateCreationDateTimePrecision":"MICROSECOND"},"eventSource":"aws:dynamodb"}
I need to upload these two columns and their values to S3:
autonomy: ssnvbnbdfbdfsEV
user_id: nvbssdfbdfbSVSV
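One possible way to get from the nested change records above to just these two columns is a Firehose data-transformation Lambda that flattens `dynamodb.NewImage` before format conversion runs. The following is a minimal sketch, not a definitive implementation: the helper name `flatten_new_image` is hypothetical, and it assumes every attribute is a scalar (such as `{"S": "..."}`), so nested map or list attributes would need extra handling.

```python
import base64
import json


def flatten_new_image(change_event):
    """Return {column: value} from dynamodb.NewImage, dropping type tags such as "S"."""
    new_image = change_event.get("dynamodb", {}).get("NewImage") or {}
    flat = {}
    for name, typed_value in new_image.items():
        # Each attribute is a one-entry dict like {"S": "abc"}; keep only the value.
        # Assumption: scalar attributes only. Numbers ("N") arrive as strings.
        flat[name] = next(iter(typed_value.values()))
    return flat


def lambda_handler(event, context):
    """Firehose transformation handler: one output record per input recordId."""
    output = []
    for record in event["records"]:
        payload = json.loads(base64.b64decode(record["data"]))
        flat = flatten_new_image(payload)
        if not flat:
            # Events without a NewImage (for example REMOVE) are dropped here.
            output.append({"recordId": record["recordId"], "result": "Dropped"})
            continue
        # One JSON object per line, so records stay separable downstream.
        line = json.dumps(flat) + "\n"
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(line.encode("utf-8")).decode("utf-8"),
        })
    return {"records": output}
```

With a transformation like this in place, each record Firehose hands to the Parquet converter would look like `{"autonomy": "...", "user_id": "..."}`, which is much easier to describe with a flat table schema.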
Thanks in advance for your help!
I tried many schema examples that I found on the internet. I also tried the example from the official AWS website:
{
"$id": "https://example.com/person.schema.json",
"$schema": "http://json-schema.org/draft-07/schema#",
"title": "Person",
"type": "object",
"properties": {
"firstName": {
"type": "string",
"description": "The person's first name."
},
"lastName": {
"type": "string",
"description": "The person's last name."
},
"age": {
"description": "Age in years which must be equal to or greater than zero.",
"type": "integer",
"minimum": 0
}
}
}
But so far without success...
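As far as I understand, Firehose record format conversion reads its schema from the column list of a Glue Data Catalog table rather than from a JSON Schema document like the one above, so a table with two string columns may be closer to what is needed. Below is a hedged sketch that builds such a table definition for boto3's Glue `create_table` call. The function names, the database name and the table name are placeholders, and it assumes the incoming records have already been flattened to `autonomy` and `user_id`.

```python
def build_table_input(table_name, column_names):
    """Build a Glue TableInput dict with one string column per name (illustrative helper)."""
    return {
        "Name": table_name,
        "TableType": "EXTERNAL_TABLE",
        "Parameters": {"classification": "parquet"},
        "StorageDescriptor": {
            # Column names should match the JSON keys Firehose receives.
            "Columns": [{"Name": name, "Type": "string"} for name in column_names],
            "InputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
            "OutputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
            "SerdeInfo": {
                "SerializationLibrary": "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
            },
        },
    }


def create_glue_table(database_name, table_input):
    """Create the table in an existing Glue database (requires AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3 installed

    glue = boto3.client("glue")
    return glue.create_table(DatabaseName=database_name, TableInput=table_input)


# Placeholder names; adjust to your own database and table.
table_input = build_table_input("table_oleg_flat", ["autonomy", "user_id"])
# create_glue_table("my_database", table_input)
```

The resulting table can then be selected in the Firehose "Convert record format" settings. Whether it matches depends on the records actually having top-level `autonomy` and `user_id` keys when they reach the converter.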
Same problem here! This is really much harder than it should be.