Unescaping JSON output from a Databricks query

Below is the SQL query I am running in the Databricks SQL editor:

SELECT 
    orders.GroceryStore,
    TO_JSON(COLLECT_LIST(MAP('CustomerID', orders.CustomerID,'DiscountValue', orders.DiscountValue,'SalesAmount', orders.SalesAmount,'ChargesInfo', nested_json.ChargeDetails))) AS JsonLine
FROM (
    SELECT h.GroceryStore, d.CustomerID, SUM(d.DiscountValue) AS DiscountValue, SUM(d.SalesAmount) AS SalesAmount
    FROM SalesHeader h
    LEFT JOIN SalesDetail d ON h.CustomerID = d.CustomerID
    WHERE h.Date >= '2024-01-01'
    GROUP BY h.GroceryStore, d.CustomerID
) AS orders
LEFT JOIN (
    SELECT 
        d.CustomerID,
        TO_JSON(COLLECT_LIST(MAP(
            'ChargeType', d.ChargeType, 
            'ChargeAmount', d.ChargeAmount,
            'PaidAmount', d.PaidAmount
            ))) AS ChargeDetails
    FROM ChargesDetail d
    GROUP BY d.CustomerID
) AS nested_json ON orders.CustomerID = nested_json.CustomerID
GROUP BY orders.GroceryStore;

When I run the query, this is the output I get: [image]

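According to the output, the ChargesInfo field contains the escape character \. Is there any way to modify the SQL query so that the output does not contain escape characters? The desired output is: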

[{"CustomerID":"0001ABC","DiscountValue":"126.33","SalesAmount":"2320.26","ChargesInfo":[{"ChargeType":"01","ChargeAmount":"20.26","PaidAmount":"11.22"}]}]

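Note that in the desired output the ChargesInfo array is also not wrapped in double quotes (""). Any help or suggestions would be greatly appreciated!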

sql azure-databricks to-json
1 Answer

I have corrected the syntax of the content:

Sample data:

[{"CustomerID":"0001ABC","DiscountValue":"26.25","SalesAmount":"300.0","ChargesInfo":"[{\"ChargeType\":\"01\",\"ChargeAmount\":\"10.26\",\"PaidAmount\":\"5.62\"}]"}]
[{"CustomerID":"0002XYZ","DiscountValue":"5.25","SalesAmount":"150.0","ChargesInfo":"[{\"ChargeType\":\"02\",\"ChargeAmount\":\"20.5\",\"PaidAmount\":\"15.75\"}]"}]

I tried the following approach:

SELECT 
    orders.GroceryStore,
    CONCAT(
        '[',
        CONCAT_WS(
            ',',
            COLLECT_LIST(
                CONCAT(
                    '{"CustomerID":"', orders.CustomerID, '"',
                    ',"DiscountValue":"', orders.DiscountValue, '"',
                    ',"SalesAmount":"', orders.SalesAmount, '"',
                    ',"ChargesInfo":', REPLACE(nested_json.ChargeDetails, '\\\\"', '"'),
                    '}'
                )
            )
        ),
        ']'
    ) AS JsonLine
FROM (
    SELECT
        h.GroceryStore,
        d.CustomerID,
        SUM(d.DiscountValue) AS DiscountValue,
        SUM(d.SalesAmount) AS SalesAmount
    FROM SalesHeader h
    LEFT JOIN SalesDetail d ON h.CustomerID = d.CustomerID
    WHERE h.Date >= '2024-01-01'
    GROUP BY h.GroceryStore, d.CustomerID
) AS orders
LEFT JOIN (
    SELECT
        d.CustomerID,
        CONCAT(
            '[',
            CONCAT_WS(
                ',',
                COLLECT_LIST(
                    CONCAT(
                        '{"ChargeType":"', d.ChargeType, '"',
                        ',"ChargeAmount":"', d.ChargeAmount, '"',
                        ',"PaidAmount":"', d.PaidAmount, '"',
                        '}'
                    )
                )
            ),
            ']'
        ) AS ChargeDetails
    FROM ChargesDetail d
    GROUP BY d.CustomerID
) AS nested_json ON orders.CustomerID = nested_json.CustomerID
GROUP BY orders.GroceryStore;

The REPLACE function replaces the \" sequence (an escaped double quote) with ", effectively removing the escape character \.
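
For context on where the backslashes come from: in the query from the question, the inner TO_JSON already returns a string, so the outer TO_JSON treats ChargeDetails as a plain string value and escapes the quotes inside it. A minimal sketch with literal values (no tables needed; the keys a and b are made up for illustration):

```sql
-- The inner TO_JSON returns a STRING, so the outer TO_JSON encodes it
-- as a JSON string value and escapes the embedded double quotes.
SELECT TO_JSON(MAP('a', TO_JSON(MAP('b', '1')))) AS double_encoded;
-- {"a":"{\"b\":\"1\"}"}

-- Passing the nested structure itself keeps it as a nested JSON object.
SELECT TO_JSON(MAP('a', MAP('b', '1'))) AS nested;
-- {"a":{"b":"1"}}
```

The escaping is therefore a side effect of serializing twice, which is why building the inner array without TO_JSON (as the CONCAT version above does) removes it.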

Result:

[{"CustomerID":"0001ABC","DiscountValue":"26.25","SalesAmount":"300.0","ChargesInfo":[{"ChargeType":"01","ChargeAmount":"10.26","PaidAmount":"5.62"}]}]
[{"CustomerID":"0002XYZ","DiscountValue":"5.25","SalesAmount":"150.0","ChargesInfo":[{"ChargeType":"02","ChargeAmount":"20.5","PaidAmount":"15.75"}]}]
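
An alternative that avoids manual string concatenation, sketched under the assumption that the tables and columns are exactly as in the question: keep the inner aggregate as an array of structs (no TO_JSON in the subquery) and call TO_JSON only once, on the outer result. NAMED_STRUCT is used instead of MAP because a map needs all of its values to share one type, while a struct can hold strings next to a nested array. The CAST(... AS STRING) calls are an assumption, added only to match the quoted numbers in the desired output; CustomerID and ChargeType are assumed to be strings already.

```sql
SELECT
    orders.GroceryStore,
    -- Serialize once, at the outermost level only.
    TO_JSON(COLLECT_LIST(NAMED_STRUCT(
        'CustomerID',    orders.CustomerID,
        'DiscountValue', CAST(orders.DiscountValue AS STRING),
        'SalesAmount',   CAST(orders.SalesAmount AS STRING),
        'ChargesInfo',   nested.ChargeDetails  -- still an ARRAY<STRUCT>, not a string
    ))) AS JsonLine
FROM (
    SELECT
        h.GroceryStore,
        d.CustomerID,
        SUM(d.DiscountValue) AS DiscountValue,
        SUM(d.SalesAmount) AS SalesAmount
    FROM SalesHeader h
    LEFT JOIN SalesDetail d ON h.CustomerID = d.CustomerID
    WHERE h.Date >= '2024-01-01'
    GROUP BY h.GroceryStore, d.CustomerID
) AS orders
LEFT JOIN (
    SELECT
        d.CustomerID,
        -- No TO_JSON here: keep the charges as structured data.
        COLLECT_LIST(NAMED_STRUCT(
            'ChargeType',   d.ChargeType,
            'ChargeAmount', CAST(d.ChargeAmount AS STRING),
            'PaidAmount',   CAST(d.PaidAmount AS STRING)
        )) AS ChargeDetails
    FROM ChargesDetail d
    GROUP BY d.CustomerID
) AS nested ON orders.CustomerID = nested.CustomerID
GROUP BY orders.GroceryStore;
```

Because the JSON is generated by TO_JSON rather than assembled by hand, quotes or backslashes inside the data values are escaped correctly. Customers with no rows in ChargesDetail get a NULL ChargesInfo here; how that is rendered depends on the JSON generator settings.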