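We use AWS Athena tables to store order and product information. The orders table has a column named line_items that holds a JSON-like string. The string is not valid JSON as stored, so we first convert it to valid JSON and then parse it, so that each product in the line items appears as a separate row in the final result set.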
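In some cases the JSON parsing fails, because some of the documents stored in the line_items column are still not valid JSON after the conversion. Only a few records are affected, but they cause the whole query to throw an error, so no result set is returned at all. Please suggest an alternative solution in which the parse is skipped for invalid JSON and null is returned for those rows, while the rows that parse successfully are still returned.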
The query is below. It fails at the call to json_parse():
with orders_info AS (
select id as order_id, created_at, substring(created_at, 1, 10) as order_date,
customer_id, customer_email, customer_phone, billing_address_country, line_items, total_price_set_shop_money_amount,
row_number() over(partition by id order by etl_run_date asc) AS rn
from orders where cast(substring(created_at, 1, 10) as date) = CURRENT_DATE - INTERVAL '2' DAY
and id in ('134', '4545')
),
orders_dataset as (
select *, replace(replace(replace(replace(line_items, 'None', '''none'''), 'True', 'true'), 'False', 'false'), '''', '"') as line_items_json
from orders_info where rn = 1
),
line_items_dataset as (
select od.*, json_extract_scalar(m, '$.id') product_id,
json_extract_scalar(m, '$.variant_id') variant_id,
json_extract_scalar(m, '$.price_set.shop_money.amount') price_set_shop_money_amount,
json_extract_scalar(m, '$.price_set.shop_money.currency_code') price_set_shop_money_currency_code
from orders_dataset od,
unnest(cast(json_parse(line_items_json) as array(json))) as t(m)
)
select * from line_items_dataset
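
To make the desired behaviour concrete, here is a minimal sketch of the result we are after. It assumes the Presto/Trino `try()` function is available in the Athena engine in use, and that wrapping the parse and cast in it is enough to turn a failure into null; we have not confirmed this against the real table. The inline `sample_orders` data is made up purely for illustration and is not from our orders table.

```sql
-- Sketch only: sample_orders and its two rows are hypothetical illustration data.
with sample_orders as (
    select *
    from (
        values
            ('134',  '[{"id": 1, "variant_id": 10}, {"id": 2, "variant_id": 20}]'),
            ('4545', '[{"id": 3, "variant_id": ')  -- truncated on purpose: not valid JSON
    ) as v(order_id, line_items_json)
)
select so.order_id,
       json_extract_scalar(m, '$.id')         as product_id,
       json_extract_scalar(m, '$.variant_id') as variant_id
from sample_orders so
cross join unnest(
    coalesce(
        -- try() yields null instead of raising when json_parse or the cast fails
        try(cast(json_parse(so.line_items_json) as array(json))),
        -- fall back to a one-element array holding a null, so the order row is kept
        array[cast(null as json)]
    )
) as t(m)
```

With this shape, order 134 should expand into one row per line item, and order 4545 should come back as a single row with null product fields rather than failing the whole query.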