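I have the following data in a column named json_col of a DataFrame called product in Databricks; the DataFrame also has other columns. The data in json_col looks like this: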
html:null ,language:null ,message:null ,product:{"title":"selecteddata","search_alias":{"title":"Home ","value":"kitchen"},,"content":{"all_images":[{"ee":"eeee","name":"front page asdsadaasdasd"},{"dasdas":"sduahdjka","name":"asdsadaasdasd"},{"dasdas":"edkjas","name":"asdsadaasdasd "},{"dasdas":"dakjs","name":"Plumeri asdsadaasdasd Spucktüche"},"dasdas":"dkasjhasdasnd","name":"diasdjaskldnasn"}],"body_text":"dkjasda,"},"climate_pledge_friendly":"No",}
Out of all of this, I only need to select the content data:
{"all_images":[{"ee":"eeee","name":"front page asdsadaasdasd"},{"dasdas":"sduahdjka","name":"asdsadaasdasd"},{"dasdas":"edkjas","name":"asdsadaasdasd "},{"dasdas":"dakjs","name":"Plumeri asdsadaasdasd Spucktüche"},"dasdas":"dkasjhasdasnd","name":"diasdjaskldnasn"}],"body_text":"dkjasda,"}
I am using the following:
from pyspark.sql.functions import col, get_json_object
# Extract the content field as a JSON string
extracted_data = product.withColumn('content_extract', get_json_object('json_col', '$.content'))
# Show the results (optional)
extracted_data.select('content_extract').show()
But it shows null. Can anyone offer expert advice or another way to solve this?
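Use the from_json function with a schema to extract a specific part of the JSON string. Note that content is nested under product, so the schema has to go through product first. Check the code below.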
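from pyspark.sql.functions import expr

# Parse json_col with a schema that keeps product.content as a raw JSON string
(
    product
    .withColumn(
        "content",
        expr("from_json(json_col, 'product struct<content:string>').product.content")
    )
    .select("content")
    .show(2, False)
)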
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|content |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|{"all_images":[{"ee":"eeee","name":"front page asdsadaasdasd"},{"dasdas":"sduahdjka","name":"asdsadaasdasd"},{"dasdas":"edkjas","name":"asdsadaasdasd "},{"dasdas":"dakjs","name":"Plumeri asdsadaasdasd Spucktüche"},{"dasdas":"dkasjhasdasnd","name":"diasdjaskldnasn"}],"body_text":"dkjasda,"}|
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
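
To illustrate why the original attempt returned null, here is a small sketch that uses only Python's standard json module, not Spark. It assumes a shortened, well-formed version of one json_col value; the sample string and the extract helper are hypothetical and only mirror the nesting. The point is that content is not a top-level key, so a path like $.content finds nothing, while a path that goes through product does. Text that is not valid JSON also yields no result, which is similar to how Spark's JSON functions return null on unparseable input.

```python
import json

# Hypothetical, well-formed and shortened version of one json_col value.
sample = json.dumps({
    "html": None,
    "language": None,
    "message": None,
    "product": {
        "title": "selecteddata",
        "content": {
            "all_images": [{"ee": "eeee", "name": "front page asdsadaasdasd"}],
            "body_text": "dkjasda,",
        },
        "climate_pledge_friendly": "No",
    },
})


def extract(json_text, *keys):
    """Walk nested keys; return None if the text is invalid JSON or a key is missing."""
    try:
        node = json.loads(json_text)
    except json.JSONDecodeError:
        return None
    for key in keys:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


# Mirrors the path $.content: there is no such key at the top level.
top_level = extract(sample, "content")

# Mirrors the path $.product.content: the key is found under product.
nested = extract(sample, "product", "content")

# A stray comma makes the text invalid JSON, so nothing can be extracted.
malformed = extract('{"product":{"content":{}},,}', "product", "content")

print(top_level)
print(nested)
print(malformed)
```

If the stored strings are valid JSON, the same idea suggests that get_json_object('json_col', '$.product.content') may also work as an alternative to from_json; that is an assumption about the data, since the sample shown in the question contains stray commas and would need to be well formed first.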