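I am trying to tokenize string values (passed in table format) using a custom regex info type, but I run into a problem when I add multiple rows to the table. If I pass a single row, it successfully tokenizes the string_value and returns the encoded string. I am using the Python library.
For demonstration purposes, the custom info type is currently set to match any value in the string. The wrapped key exists in Cloud KMS and has been removed here for security reasons.
Below is the configuration I am using: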
# Construct FPE configuration dictionary
crypto_replace_ffx_fpe_config = {
    "crypto_key": {
        "kms_wrapped": {
            "wrapped_key": wrapped_key,
            "crypto_key_name": key_name,
        }
    }
}
# Add surrogate type
if surrogate_type:
    crypto_replace_ffx_fpe_config["surrogate_info_type"] = {
        "name": surrogate_type
    }
# Construct inspect configuration dictionary
inspect_config = {
    # "info_types": [{"name": info_type} for info_type in info_types],
    # "min_likelihood": "VERY_UNLIKELY",
    "custom_info_types": [
        {
            "info_type": {
                "name": "custom"
            },
            "exclusion_type": "EXCLUSION_TYPE_UNSPECIFIED",
            "likelihood": "POSSIBLE",
            "regex": {
                "pattern": "(?:.*)"
                # "pattern": ".*"
            }
        }
    ]
}
# Construct deidentify configuration dictionary
deidentify_config = {
    "info_type_transformations": {
        "transformations": [
            {
                "primitive_transformation": {
                    "crypto_deterministic_config": crypto_replace_ffx_fpe_config
                }
            }
        ]
    }
}
item = {
    "table": {
        "headers": [{"name": header} for header in data_headers],
        "rows": [
            {
                "values": [
                    {
                        "string_value": "asa s.com"
                    }
                ]
            },
            # Issue starts when the row below is added, with any value in string_value
            {
                "values": [
                    {
                        "string_value": "[email protected]"
                    }
                ]
            }
        ]
    }
}
# Call the API
response = dlp.deidentify_content(
    parent,
    inspect_config=inspect_config,
    deidentify_config=deidentify_config,
    item=item,
)
# Print results
return response.item.table
If I send a single row of data, the response I get is:
headers {
  name: "email_id"
}
headers {
  name: "token"
}
rows {
  values {
    string_value: "EMAIL_ADDRESS(XX):XXXXXXXXXXXXXXXXXXX="
  }
}
But when I send an item with more than one row, I get back exactly what I originally sent to the API. For example:
headers {
  name: "token"
}
rows {
  values {
    string_value: "asa s.com"
  }
}
rows {
  values {
    string_value: "[email protected]"
  }
}
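Since a single-row table is tokenized correctly, one way to narrow the problem down is to send each row as its own one-row table and compare the results with the multi-row request. This is only a debugging sketch, not a fix: it assumes `item` has the dictionary shape shown above, and `split_table_rows` is a hypothetical helper, not part of the DLP client library. The sample values are placeholders.

```python
import copy


def split_table_rows(item):
    """Yield one single-row table item per row of the original item.

    Hypothetical debugging helper; not part of google-cloud-dlp.
    """
    table = item["table"]
    for row in table["rows"]:
        yield {
            "table": {
                # Deep-copy so each yielded item can be modified independently.
                "headers": copy.deepcopy(table["headers"]),
                "rows": [copy.deepcopy(row)],
            }
        }


# Example input with the same shape as the request item (placeholder values).
example_item = {
    "table": {
        "headers": [{"name": "token"}],
        "rows": [
            {"values": [{"string_value": "first value"}]},
            {"values": [{"string_value": "second value"}]},
        ],
    }
}

single_row_items = list(split_table_rows(example_item))
```

Each element of `single_row_items` could then be passed as `item=` to the same `dlp.deidentify_content(...)` call used above, to check whether every row is tokenized when sent on its own.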
Can you describe the response?