I have a JSON list that I am reading with from_json. How do I convert the resulting Column into a single-column DataFrame?
from pyspark.sql.functions import from_json
from pyspark.sql.types import ArrayType, StringType
jsonlist = "['a','b','c']"
col = from_json(jsonlist , ArrayType(StringType()))
# how do I create a DataFrame from this?
df = ...
Everything I have tried results in
TypeError: Column is not iterable
For example:
spark.createDataFrame([col], ['item'])
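from_json returns a Column expression, which only has meaning inside a DataFrame transformation, so it cannot be passed to createDataFrame as data. Passing the values as a list of rows does the job: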
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType

# Reuse the active session or start a new one
spark = SparkSession.builder \
    .appName("JSON to DataFrame") \
    .getOrCreate()

# Each inner list is one row with a single value
jsonlist = [['a'], ['b'], ['c']]

# One nullable string column
schema = StructType([
    StructField("Column1", StringType(), True)
])

df = spark.createDataFrame(jsonlist, schema=schema)
df.show()
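The snippet above hard-codes the rows. If the list really arrives as a JSON string, one possible approach is to parse it on the driver with Python's json module and wrap each element in a one-element tuple before calling createDataFrame. This is only a sketch: the helper names json_list_to_rows and build_dataframe and the column name "item" are made up for illustration, and it assumes the input is strict JSON (double-quoted strings), which the single-quoted "['a','b','c']" from the question is not.

```python
import json


def json_list_to_rows(text):
    """Parse a JSON array string into one-element tuples, one per row.

    Hypothetical helper, not part of PySpark.
    """
    values = json.loads(text)
    if not isinstance(values, list):
        raise ValueError("expected a JSON array")
    # createDataFrame treats each tuple as a row, so wrap every value
    return [(value,) for value in values]


def build_dataframe(spark, text, column="item"):
    """Build a single-column DataFrame from a JSON array string.

    `spark` is assumed to be an existing SparkSession; the column name
    "item" is an arbitrary choice for this example.
    """
    return spark.createDataFrame(json_list_to_rows(text), [column])
```

With a session available, build_dataframe(spark, '["a", "b", "c"]').show() should display one row per element. Note also that from_json takes a Column or a column name as its first argument, so passing the raw JSON text as a Python string makes Spark look for a column with that name; it is meant for parsing a string column of an existing DataFrame.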