How to convert date values in JSON using JOLT v0.1.1


I have this JSON input:

{
  "entityType": "person",
  "id": 24285,
  "properties": {
    "firstName": "Zia",
    "lastName": "Rus",
    "email": "[email protected]",
    "phoneNumber": null,
    "requestingUserGuid": "Cugxxasod Daspohs, Uyowiye",
    "generalReportReference": 244,
    "employment": [
      {
        "employerName": "Avature",
        "startDate": "01-Jan-2003",
        "endDate": "04-Nov-2008",
        "jobTitle": ""
      },
      {
        "employerName": "Avature",
        "startDate": "06-Jul-2012",
        "endDate": "",
        "jobTitle": ""
      }
    ],
    "supervisor": "Supervisor",
    "education": [
      {
        "institutionName": "Universitate",
        "major": "gead",
        "degree": "gased",
        "startDate": "01-Feb-2022",
        "endDate": "01-Oct-2024"
      },
      {
        "institutionName": "College",
        "major": "brup",
        "degree": "brup",
        "startDate": "05-Apr-2016",
        "endDate": ""
      }
    ],
    "externalIdentifier": 24285,
    "clientGuid": "6253d1c2-b02f-4513-8348-89db9b8ba449",
    "productGuid": "e2f507b4-025c-439b-9ddb-833f9e537e60",
    "applicantGuid": ""
  }
}

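So far I have managed to do the whole data transformation as required, but I am missing a way to format the date fields as "YYYY-MM-DD". I have seen approaches for formatting date fields, but most of them work by defaulting or overwriting the data. The problem with those solutions is that, as you can see in the input JSON, the "employment" and "education" arrays can each contain one or more objects, and each object may or may not have an "endDate". So the "default data" solutions do not work for me.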
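To pin down the target, the intended conversion of a single value can be sketched in Python. This is only an illustration of the desired behavior, not part of the NiFi flow: the helper name `to_iso_date` is hypothetical, and `%b` assumes English month abbreviations (Python's default C locale).

```python
from datetime import datetime


def to_iso_date(value):
    """Convert 'DD-Mon-YYYY' to 'YYYY-MM-DD'; pass empty or missing values through."""
    if not value:  # covers "" and None, e.g. an open-ended endDate
        return value
    # %b parses abbreviated month names; assumes an English (C) locale
    return datetime.strptime(value, "%d-%b-%Y").strftime("%Y-%m-%d")


print(to_iso_date("04-Nov-2008"))  # 2008-11-04
print(repr(to_iso_date("")))       # ''
```

An empty "endDate" is returned unchanged here, which is the behavior wanted for entries that have no end date yet.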

This is the JOLT spec I am currently using to transform the data:

[
  {
    "operation": "shift",
    "spec": {
      "properties": {
        "firstName": "firstName",
        "lastName": "lastName",
        "email": "email",
        "phoneNumber": "phoneNumber",
        "requestingUserGuid": "requestingUserGuid",
        "generalReportReference": "generalReportReference",
        "clientGuid": "clientGuid",
        "productId": "productId",
        "applicantGuid": "applicantGuid",
        "externalIdentifier": "externalIdentifier",
        "employment": {
          "*": {
            "employerName|endDate": {
              "": null,
              "*": {
                "@1": "&4[&3].&2"
              }
            },
            "*": {
              "*": {
                "@1": "&4[&3].&2"
              }
            }
          }
        },
        "education": {
          "*": {
            "institutionName|registrarPhone|endDate": {
              "": null,
              "*": {
                "@1": "&4[&3].&2"
              }
            },
            "*": {
              "*": {
                "@1": "&4[&3].&2"
              }
            }
          }
        }
      },
      "education": {
        "*": {
          "~institutionName": "N/A",
          "~registrarPhone": "(111) 111-1111"
        }
      },
      "employment": {
        "*": {
          "~employerName": "N/A"
        }
      }
    }
  },
  {
    "operation": "modify-default-beta",
    "spec": {
      "employment": {
        "*": {
          "employerName": "N/A",
          "firstNameUsed": "@(4,firstName)",
          "lastNameUsed": "@(4,lastName)"
        }
      },
      "education": {
        "*": {
          "institutionName": "N/A",
          "registrarPhone": "(111) 111-1111"
        }
      }
    }
  },
  {
    "operation": "modify-overwrite-beta",
    "spec": {
      "employment": {
        "*": {
          "supervisor": "Supervisor"
        }
      }
    }
  }
]

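I cannot see how to convert the date fields to the required format. Can anyone help me with this? Is it achievable with JOLT, or should I take a different approach and try another processor in NiFi?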

Thanks in advance!

json apache-nifi jolt

1 Answer

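You can handle all the date values in the JoltTransformJSON processor with the following transformation spec. The split function of a modify spec first turns every date value into an array. A shift spec then matches each month abbreviation to its two-digit number and reorders the parts to year, month, day. Finally, the join function of a second modify spec recombines the components of each date array, e.g.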

[
  {
    "operation": "modify-overwrite-beta",
    "spec": {
      "properties": {
        "*": {
          "*": {
            "*Date": "=split('-',@(1,&))"
          }
        }
      }
    }
  },
  {
    "operation": "shift",
    "spec": {
      "*": "&", //the elements other than "properties"
      "properties": {
        "*": "&1.&", //the elements other than "employment" and "education"
        "emp*|edu*": {
          "*": {
            "*": "&3.&2[&1].&", //the elements other than "...Dates", reduce the levels by 1 w.r.t. the line 2 levels down
            "*Date": {
              "2": "&4.&3[&2].&1", //[&2] represents the indexes of the "employment" or "education" in order to provide array manner back 
              "1": { //Months part
                "Jan": { "#01": "&6.&5[&4].&3" }, //leaf nodes are 2 levels deeper than the other Date arrays' nodes  
                "Feb": { "#02": "&6.&5[&4].&3" },
                "Mar": { "#03": "&6.&5[&4].&3" },
                "Apr": { "#04": "&6.&5[&4].&3" },
                "May": { "#05": "&6.&5[&4].&3" },
                "Jun": { "#06": "&6.&5[&4].&3" },
                "Jul": { "#07": "&6.&5[&4].&3" },
                "Aug": { "#08": "&6.&5[&4].&3" },
                "Sep": { "#09": "&6.&5[&4].&3" },
                "Oct": { "#10": "&6.&5[&4].&3" },
                "Nov": { "#11": "&6.&5[&4].&3" },
                "Dec": { "#12": "&6.&5[&4].&3" }
              },
              "0": "&4.&3[&2].&1"
            }
          }
        }
      }
    }
  },
  {
    "operation": "modify-overwrite-beta",
    "spec": {
      "properties": {
        "*": {
          "*": {
            "*Date": "=join('-',@(1,&))"
          }
        }
      }
    }
  }
]
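
For readers who want to check the expected result outside NiFi, the same three steps (split, month lookup with reordering, join) can be mirrored in a short Python sketch. It is only an illustration under stated assumptions, not what the processor executes: `MONTHS` and `convert_dates` are hypothetical names, and the sketch leaves empty date strings untouched, whereas how the JOLT spec treats an empty "endDate" is worth verifying against real data.

```python
# Month abbreviation -> two-digit number, like the "Jan" ... "Dec" branches of the shift spec
MONTHS = {
    "Jan": "01", "Feb": "02", "Mar": "03", "Apr": "04",
    "May": "05", "Jun": "06", "Jul": "07", "Aug": "08",
    "Sep": "09", "Oct": "10", "Nov": "11", "Dec": "12",
}


def convert_dates(doc):
    """Rewrite every properties.<array>[i].<...Date> value from DD-Mon-YYYY to YYYY-MM-DD."""
    for section in doc.get("properties", {}).values():
        if not isinstance(section, list):  # only arrays such as employment / education
            continue
        for entry in section:
            if not isinstance(entry, dict):
                continue
            for key, value in entry.items():
                if key.endswith("Date") and value:  # skip empty strings (assumption)
                    day, month, year = value.split("-")                  # step 1: split
                    parts = [year, MONTHS[month], day]                   # step 2: map and reorder
                    entry[key] = "-".join(parts)                         # step 3: join
    return doc


sample = {
    "properties": {
        "firstName": "Zia",
        "employment": [
            {"employerName": "Avature", "startDate": "01-Jan-2003", "endDate": "04-Nov-2008"},
            {"employerName": "Avature", "startDate": "06-Jul-2012", "endDate": ""},
        ],
    }
}

result = convert_dates(sample)
print(result["properties"]["employment"][0])
# {'employerName': 'Avature', 'startDate': '2003-01-01', 'endDate': '2008-11-04'}
```

The dictionary lookup plays the role of the twelve month branches in the shift spec, and building the list as year, month, day corresponds to the "2", "1", "0" ordering of the keys there.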