I have this DataFrame:
df_json = spark.sql("select id, type, json from process.base_json where ...")
df_json.show(truncate=False)
+--------+--------+---------------------------------------------------------------------------------+
|id      |type    |json                                                                             |
+--------+--------+---------------------------------------------------------------------------------+
|23016494|TAX     |{"Id":"253","RESULT":{"DATA":{"response":[{"message":"ID 253 invalid"}]}}}       |
|23020867|WARRANTY|{"Id":"108","RESULT":{"DATA":{"result":[{"message":"Nomatches"}]}},"Type":"ID"}  |
|23021055|WARRANTY|{"Id":"332","RESULT":{"DATA":{"detail":{"cre":"BANK","nid":"332"}]}},"Type":"ID"}|
|23016497|TAX     |{"Id":"643","RESULT":{"DATA":{"registry":[{"dv":"5","st":"ACT","name":"MAY"}]}}} |
+--------+--------+---------------------------------------------------------------------------------+
I want to create a new DataFrame for each row of this DataFrame, so that I can parse each JSON value separately.
df_json1
+--------+--------+---------------------------------------------------------------------------------+
|id      |type    |json                                                                             |
+--------+--------+---------------------------------------------------------------------------------+
|23016494|TAX     |{"Id":"253","RESULT":{"DATA":{"response":[{"message":"ID 253 invalid"}]}}}       |
+--------+--------+---------------------------------------------------------------------------------+

df_json2
+--------+--------+---------------------------------------------------------------------------------+
|id      |type    |json                                                                             |
+--------+--------+---------------------------------------------------------------------------------+
|23020867|WARRANTY|{"Id":"108","RESULT":{"DATA":{"result":[{"message":"Nomatches"}]}},"Type":"ID"}  |
+--------+--------+---------------------------------------------------------------------------------+

df_json3
+--------+--------+---------------------------------------------------------------------------------+
|id      |type    |json                                                                             |
+--------+--------+---------------------------------------------------------------------------------+
|23021055|WARRANTY|{"Id":"332","RESULT":{"DATA":{"detail":{"cre":"BANK","nid":"332"}]}},"Type":"ID"}|
+--------+--------+---------------------------------------------------------------------------------+

df_json4
+--------+--------+---------------------------------------------------------------------------------+
|id      |type    |json                                                                             |
+--------+--------+---------------------------------------------------------------------------------+
|23016497|TAX     |{"Id":"643","RESULT":{"DATA":{"registry":[{"dv":"5","st":"ACT","name":"MAY"}]}}} |
+--------+--------+---------------------------------------------------------------------------------+
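
To make the goal concrete, here is a rough sketch of the kind of split I mean. It assumes id is unique per row and that the DataFrame is small enough to collect its ids to the driver. The helper name split_by_key is made up for illustration and is not part of PySpark; it only relies on the standard DataFrame methods select, distinct, collect and filter.

```python
def split_by_key(df, key="id"):
    """Return a dict mapping each distinct value of `key` to a filtered DataFrame.

    Hypothetical helper: `df` is assumed to be a PySpark DataFrame such as
    df_json, and `key` a column whose values identify one row each.
    """
    # collect() pulls the key values to the driver, so this only suits
    # DataFrames with a modest number of rows.
    values = [row[key] for row in df.select(key).distinct().collect()]
    # One filtered DataFrame per key value; each filter stays lazy
    # until an action such as show() is called on it.
    return {value: df.filter(df[key] == value) for value in values}


# Possible usage with the DataFrame above (requires a live Spark session):
# frames = split_by_key(df_json, key="id")
# for id_value, df_row in frames.items():
#     df_row.show(truncate=False)
```

The dictionary keys have whatever type the id column has, and every per-row DataFrame triggers its own Spark job once an action runs on it, so I am not sure this scales beyond a handful of rows.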