Regex to separate JSON objects and key/values from a field in a Snowflake table

Question description Votes: 0 Answers: 1

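I have JSON objects inside a VARCHAR field in Snowflake and would like to extract each resource, together with the actions that can be applied to it, into another column, ideally represented as keys and values in a delimited list, e.g.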

["arn:aws:s3::testfiles"="s3:PutObject","s3:GetObject"],["arn:aws:s3:::testfiles/"="s3:ListAllMyBuckets"]

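Alternatively, multiple columns for each resource and action group would also work, but this would need to be dynamic, because each JSON object can contain several resources and actions, and the number varies.

My question is: how can I do this in Snowflake?

I thought I could turn the field text into a list of JSON objects, then parse the JSON and extract only the fields I am interested in, but I cannot get a regular expression to remove just the JSON object names so that the text becomes valid JSON. I tried a wildcard, as shown below, but it removes too much. I think it has something to do with the curly brace; I tried escaping it, but it still does not work as I expected.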

REGEXP_REPLACE(merged_json,'S3_.*{', '{')

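Thanks for your help.

Example field value. I highlighted in bold the values I am trying to replace in order to turn this into a list of JSON objects. Each one is prefixed with S3_.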

S3_TestFile{"Version": "2024-01-17","Statement": [{"Sid": "1234","Effect": "Allow","Action": ["s3:"],"Resource": ["arn:aws:s3::testfiles"]},{"Sid": "AllowUserAction","Effect": "Allow","Action": ["s3:PutObject","s3:GetObject","s3:GetObjectVersion","s3:DeleteObject","s3:DeleteObjectVersion","s3:GetObjectAcl"],"Resource": "arn:aws:s3:::testfiles/"}]} S3_ProdAccess{"Version": "2024-01-17","Statement": [{"Sid": "AllowGroupToList","Action": ["s3:ListAllMyBuckets","s3:GetBucketLocation"],"Effect": "Allow","Resource": ["arn:aws:s3:::"]},{"Sid": "AllowListingOfTheBucket","Action": ["s3:ListBucket"],"Effect": "Allow","Resource": ["arn:aws:s3:::2020files"],"Condition": {"StringEquals": {"s3:prefix": [""],"s3:delimiter": ["/"]}}},{"Sid": "AllowListBucket","Action": ["s3:ListBucket"],"Effect": "Allow","Resource": ["arn:aws:s3:::2020files"],"Condition": {"StringLike": {"s3:prefix": [""]}}},{"Sid": "AllowUserWithinBucket","Effect": "Allow","Action": ["s3:*"],"Resource": "arn:aws:s3:::2020files/**"}]}S3_Policy_End

regex snowflake-cloud-data-platform regexp-replace
1 Answer
0
votes

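I gave you a Power Query solution, but you deleted that question.

Anyway, this Power Query (M) code converts your text into a table with one row per action: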

let
    // Helper: turn a list of records into a table with one column per field name found in any record.
    Expand = (xx as list) as table =>
        let
            #"Converted to Table" = Table.FromList(xx, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
            ExpandList = List.Distinct(List.Combine(List.Transform(Table.Column(#"Converted to Table", "Column1"), each if _ is record then Record.FieldNames(_) else {})))
        in
            Table.ExpandRecordColumn(#"Converted to Table", "Column1", ExpandList, ExpandList),

    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    // Mark the start of every policy object with "+++", then split the text into rows on that marker.
    #"Replaced Value" = Table.ReplaceValue(Source, "{""Version"":", "+++{""Version"":", Replacer.ReplaceText, {"Column1"}),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Replaced Value", {{"Column1", Splitter.SplitTextByDelimiter("+++", QuoteStyle.None), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Column1"),
    // Drop the S3_ name that trails each piece by cutting at the last "}]} ".
    // This relies on a space between "}]}" and the next name; pieces without it are left unchanged.
    #"Added Custom" = Table.AddColumn(#"Split Column by Delimiter", "Custom", each
        let
            a = [Column1],
            b = Text.Reverse(Text.AfterDelimiter(Text.Reverse(a), " }]}"))
        in
            if b = "" then a else b & "}]}"),
    // Skip the first piece (the leading name), join the rest into one JSON array and parse it.
    Convert = Expand(Json.Document("[" & Text.Combine(Table.Skip(#"Added Custom", 1)[Custom], ",") & "]")),
    // Expand every Statement list into a nested table, then into columns.
    #"Added Custom1" = Table.AddColumn(Convert, "Custom", each Expand([Statement])),
    ColumnsToExpand = List.Difference(List.Distinct(List.Combine(List.Transform(Table.Column(#"Added Custom1", "Custom"), each if _ is table then Table.ColumnNames(_) else {}))), {"Name"}),
    #"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom1", "Custom", ColumnsToExpand, ColumnsToExpand),
    // Resource may be a list or a single string: join lists with commas, keep strings as they are.
    #"Combined Resources" = Table.TransformColumns(#"Expanded Custom", {{"Resource", each try Text.Combine(_, ",") otherwise _, type text}}),
    // One row per action.
    #"Expanded Action" = Table.ExpandListColumn(#"Combined Resources", "Action")
in
    #"Expanded Action"
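For comparison, here is a minimal Python sketch of the approach the question describes: strip the S3_ names, parse what is left as JSON, and keep only Resource and Action. It is an illustration of the logic, not a Snowflake SQL answer; the same idea could, for instance, be wrapped in a Snowflake Python UDF. The sample string is a shortened, hypothetical version of the field, and the helper names (as_list, extract_pairs, format_pairs) are made up for this example. It assumes each name consists of "S3_" followed only by letters, digits, or underscores, and that no such token occurs inside the JSON itself. The first statement also reproduces why 'S3_.*{' removes too much: the ".*" is greedy and runs on to the last "{" in the text.

```python
import json
import re

# Hypothetical, shortened sample in the same shape as the field in the question.
merged_json = (
    'S3_TestFile{"Version": "2024-01-17","Statement": ['
    '{"Sid": "1234","Effect": "Allow","Action": ["s3:PutObject","s3:GetObject"],'
    '"Resource": ["arn:aws:s3::testfiles"]}]} '
    'S3_ProdAccess{"Version": "2024-01-17","Statement": ['
    '{"Sid": "AllowGroupToList","Effect": "Allow","Action": ["s3:ListAllMyBuckets"],'
    '"Resource": "arn:aws:s3:::testfiles/"}]}S3_Policy_End'
)

# The pattern from the question: ".*" is greedy, so the match extends to the
# LAST "{" in the text and almost everything is replaced.
greedy = re.sub(r"S3_.*\{", "{", merged_json)

# Assumption: a name is "S3_" plus word characters, so the match stops at "{".
PREFIX = re.compile(r"S3_\w+")


def as_list(value):
    """Action and Resource may be a single string or a list; return a list."""
    return value if isinstance(value, list) else [value]


def extract_pairs(text):
    """Return (resource, [actions]) tuples for every statement in the field."""
    pairs = []
    for chunk in PREFIX.split(text):
        chunk = chunk.strip()
        if not chunk:
            continue  # empty piece before the first or after the last name
        policy = json.loads(chunk)
        for statement in policy["Statement"]:
            actions = as_list(statement["Action"])
            for resource in as_list(statement["Resource"]):
                pairs.append((resource, actions))
    return pairs


def format_pairs(pairs):
    """Render the pairs in the delimited format shown in the question."""
    parts = []
    for resource, actions in pairs:
        quoted = ",".join('"{}"'.format(action) for action in actions)
        parts.append('["{}"={}]'.format(resource, quoted))
    return ",".join(parts)


print(format_pairs(extract_pairs(merged_json)))
```

For this sample the printed line has the same shape as the desired output in the question, one bracketed group per resource. Splitting on the names instead of replacing them avoids having to reassemble the objects into a JSON array afterwards.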