Union two Spark DataFrames

Problem description

I am trying to take the union of two Spark DataFrames in Python, one of which is sometimes empty, so I added an if test to return the full one in that case. Here is a small snippet that raises an error:

>>> from pyspark.sql.types import *
>>> fulldataframe = [StructField("FIELDNAME_1",StringType(), True),StructField("FIELDNAME_2", StringType(), True),StructField("FIELDNAME_3", StringType(), True)]
>>> schema = StructType([])
>>>
>>> dataframeempty = sqlContext.createDataFrame(sc.emptyRDD(), schema)
>>> resultunion = sqlContext.createDataFrame(sc.emptyRDD(), schema)
>>> if (fulldataframe.isEmpty()):
...     resultunion = dataframeempty
... elif (dataframeempty.isEmpty()):
...     resultunion = fulldataframe
... else:
...     resultunion=fulldataframe.union(dataframeempty)
...


Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'list' object has no attribute 'isEmpty'
>>>

Can someone tell me where I went wrong?

python apache-spark
1 Answer

count() takes a long time. In Scala:

dataframe.rdd.isEmpty()