Parsing a nested JSON array with Talend Open Studio and turning it into a table


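I need to parse a nested JSON file and extract a CSV from it using Talend Open Studio. The goal is to convert the nested JSON into a flat table.

The JSON has the following structure:

It is an array of elements (financial instruments, in my case), and each element contains a second array level that holds the transactions for that instrument.

In the example below there are three elements (identified by the fields isin and Description), and each element may have a set of transactionDetails.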

[
    {
        "transactionDetails": [
            {
                "tradeDate": "2023-02-13T00:00:00",
                "price": "90",
                "nominalAmount": 26000000.0
            },
            {
                "tradeDate": "2023-02-13T00:00:00",
                "price": "95",
                "nominalAmount": 1000000.0
            },
            {
                "tradeDate": "2023-02-13T00:00:00",
                "price": "97",
                "nominalAmount": 30000000.0
            }
        ],
        "Description": "Apple",
        "isin": "ISIN1"
    },
    {
        "transactionDetails": [
            {
                "tradeDate": "2023-02-13T00:00:00",
                "price": "88",
                "nominalAmount": 27000000.0
            },
            {
                "tradeDate": "2023-02-13T00:00:00",
                "price": "99",
                "nominalAmount": 1000000.0
            },
            {
                "tradeDate": "2023-02-13T00:00:00",
                "price": "96",
                "nominalAmount": 24000000.0
            }
        ],
        "Description": "Microsoft",
        "isin": "ISIN2"
    },
    {
        "Description": "Tesla",
        "isin": "ISIN3"
    }
]

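The ideal output lists every isin with its Description, followed by all of its transaction details (three per isin in this example). The table below shows what I mean: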

isin   Description  tradeDate            price  nominalAmount
ISIN1  Apple        2023-02-13T00:00:00  90     26000000.0
ISIN1  Apple        2023-02-13T00:00:00  95     1000000.0
ISIN1  Apple        2023-02-13T00:00:00  97     30000000.0
ISIN2  Microsoft    2023-02-13T00:00:00  88     27000000.0
ISIN2  Microsoft    2023-02-13T00:00:00  99     1000000.0
ISIN2  Microsoft    2023-02-13T00:00:00  96     24000000.0
ISIN3  Tesla        -                    -      -

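Important: as the example shows, not every instrument has transactionDetails associated with it, but I still need those instruments in the table (with empty values in the transactionDetails columns, of course).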

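I have tried several approaches.

1 - The first was to create a JSON metadata definition. I tried the following settings but never got the desired result:

With $[*] as the absolute path expression

With $[*].transactionDetails[*] as the absolute path expression

In both cases, either the root values (isin, Description) or transactionDetails end up as an array inside the extracted field instead of being expanded into table rows as in my example above. Am I making a mistake in how I set the absolute or relative path expressions?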

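2 - Then I tried tExtractJSONFields, and I actually managed to get what I wanted, but in a very convoluted way:

In this example I use a different file, but with the same JSON structure.
Here I have two identical tFileInputJSON components with "$[*]" as the absolute path expression and the following relative path expressions:
- "isin"
- "Description"
- "transactionDetails[*]"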

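Then, in the first tExtractJSONFields, I extract isin and Description and keep transactionDetails as a field that still contains the array. In the second tExtractJSONFields, I loop over transactionDetails to extract tradeDate, price and nominalAmount (but this only yields the records that have transactionDetails, not the others).

So, in the end, to combine all instruments (those with transactionDetails and those without), I had to create two tMap components (so that both output datasets have exactly the same columns) and then merge them with a tUnite component.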
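
Stripped of the Talend components, what this job computes is a left-outer unwind: one row per transaction, plus one row with empty transaction columns for every instrument that has none. A minimal plain-Java sketch of that logic follows, assuming the JSON has already been parsed into memory. The names FlattenSketch, Instrument, Trade, flatten and sample are hypothetical and exist only for illustration; they are not part of Talend or of the original job.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory model of the JSON shown earlier (illustration only).
final class FlattenSketch {

    static final class Trade {
        final String tradeDate;
        final String price;
        final BigDecimal nominalAmount;

        Trade(String tradeDate, String price, String nominalAmount) {
            this.tradeDate = tradeDate;
            this.price = price;
            this.nominalAmount = new BigDecimal(nominalAmount);
        }
    }

    static final class Instrument {
        final String isin;
        final String description;
        final List<Trade> transactionDetails; // null when the JSON key is absent

        Instrument(String isin, String description, List<Trade> transactionDetails) {
            this.isin = isin;
            this.description = description;
            this.transactionDetails = transactionDetails;
        }
    }

    // Emits one CSV row per trade. An instrument without trades still
    // produces a single row whose three trade columns are empty.
    static List<String> flatten(List<Instrument> instruments) {
        List<String> rows = new ArrayList<>();
        for (Instrument ins : instruments) {
            List<Trade> trades = ins.transactionDetails;
            if (trades == null || trades.isEmpty()) {
                rows.add(String.join(",", ins.isin, ins.description, "", "", ""));
                continue;
            }
            for (Trade t : trades) {
                rows.add(String.join(",", ins.isin, ins.description,
                        t.tradeDate, t.price, t.nominalAmount.toPlainString()));
            }
        }
        return rows;
    }

    // Abbreviated copy of the sample data: one instrument with two trades, one with none.
    static List<Instrument> sample() {
        return Arrays.asList(
                new Instrument("ISIN1", "Apple", Arrays.asList(
                        new Trade("2023-02-13T00:00:00", "90", "26000000.0"),
                        new Trade("2023-02-13T00:00:00", "95", "1000000.0"))),
                new Instrument("ISIN3", "Tesla", null));
    }

    public static void main(String[] args) {
        flatten(sample()).forEach(System.out::println);
    }
}
```

The single loop with an "empty" branch plays the role that the two tMap components and the tUnite play in the job: both branches produce rows with the same five columns, so no separate merge step is needed.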

Is there a simpler, more intuitive way to achieve the expected result?

java arrays json csv talend
1 Answer

You could try another JSON library, Josson, to convert the JSON to CSV.

https://github.com/octomix/josson

Deserialization

Josson josson = Josson.fromJsonString(
    "[" +
    "    {" +
    "        \"transactionDetails\": [" +
    "            {" +
    "                \"tradeDate\": \"2023-02-13T00:00:00\"," +
    "                \"price\": \"90\"," +
    "                \"nominalAmount\": 26000000.0" +
    "            }," +
    "            {" +
    "                \"tradeDate\": \"2023-02-13T00:00:00\"," +
    "                \"price\": \"95\"," +
    "                \"nominalAmount\": 1000000.0" +
    "            }," +
    "            {" +
    "                \"tradeDate\": \"2023-02-13T00:00:00\"," +
    "                \"price\": \"97\"," +
    "                \"nominalAmount\": 30000000.0" +
    "            }" +
    "        ]," +
    "        \"Description\": \"Apple\"," +
    "        \"isin\": \"ISIN1\"" +
    "    }," +
    "    {" +
    "        \"transactionDetails\": [" +
    "            {" +
    "                \"tradeDate\": \"2023-02-13T00:00:00\"," +
    "                \"price\": \"88\"," +
    "                \"nominalAmount\": 27000000.0" +
    "            }," +
    "            {" +
    "                \"tradeDate\": \"2023-02-13T00:00:00\"," +
    "                \"price\": \"99\"," +
    "                \"nominalAmount\": 1000000.0" +
    "            }," +
    "            {" +
    "                \"tradeDate\": \"2023-02-13T00:00:00\"," +
    "                \"price\": \"96\"," +
    "                \"nominalAmount\": 24000000.0" +
    "            }" +
    "        ]," +
    "        \"Description\": \"Microsoft\"," +
    "        \"isin\": \"ISIN2\"" +
    "    }," +
    "    {" +
    "        \"Description\": \"Tesla\"," +
    "        \"isin\": \"ISIN3\"" +
    "    }" +
    "]");

Transformation

String csv = josson.getString(
    "unwind(+transactionDetails)" +
    ".map(+Description, +isin, +tradeDate, +price, +nominalAmount)@" +
    ".csv()" +
    ".@join('\n')");
System.out.print(csv);

Output

Apple,ISIN1,2023-02-13T00:00:00,90,2.6E7
Apple,ISIN1,2023-02-13T00:00:00,95,1000000.0
Apple,ISIN1,2023-02-13T00:00:00,97,3.0E7
Microsoft,ISIN2,2023-02-13T00:00:00,88,2.7E7
Microsoft,ISIN2,2023-02-13T00:00:00,99,1000000.0
Microsoft,ISIN2,2023-02-13T00:00:00,96,2.4E7
Tesla,ISIN3,,,
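
Note that some nominalAmount values above appear as 2.6E7 rather than 26000000.0. That is the same notation Java's Double.toString produces for values of 10^7 and above, so it is presumably only a rendering detail. If the CSV needs plain decimals, one option is to reformat that column afterwards. A minimal sketch, assuming a hypothetical helper named plain (it is not part of Josson):

```java
import java.math.BigDecimal;

// Hypothetical helper for illustration; not part of Josson or Talend.
final class PlainNumberSketch {

    // Renders a double without an exponent and with at least one decimal place.
    static String plain(double value) {
        BigDecimal d = BigDecimal.valueOf(value);
        if (d.scale() < 1) {
            d = d.setScale(1); // widening the scale never requires rounding
        }
        return d.toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(Double.toString(26000000.0)); // 2.6E7
        System.out.println(plain(26000000.0));           // 26000000.0
        System.out.println(plain(1000000.0));            // 1000000.0
    }
}
```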