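I am trying to add an empty column between two columns in a DataFrame select statement. With the withColumn function I can only append the new column at the end, but I need the empty columns in the middle (as the 3rd and 6th columns), as shown below.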
val product1 = product.select("_c1","_c2"," ","_c4", "_c5", "_c5", " ", "c6")
I tried calling withColumn inside the select statement, as shown below, and it gives an error:
val product1 = product.select("_c1","_c2",product.withColumn("NewCol",lit(None).cast("string")),"_c4", "_c5", "_c5", " ", "c6")
>error: overloaded method value select with alternatives:
(col: String,cols: String*)org.apache.spark.sql.DataFrame <and>
(cols: org.apache.spark.sql.Column*)org.apache.spark.sql.DataFrame
cannot be applied to (String, String, String, String, String, String, String, String, org.apache.spark.sql.DataFrame, String)
Please let me know if you have any suggestions. Thanks.
To select columns from a DataFrame, you can pass either strings (column names) or columns (values of type Column). From the documentation:
def select(col: String, cols: String*): DataFrame Selects a set of columns.
def select(cols: Column*): DataFrame Selects a set of column based expressions.
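However, the two forms cannot be mixed in a single call. Note also that withColumn returns a DataFrame, not a Column, so its result cannot be passed to select as an argument. In this case, use the select overload that takes Column arguments. To get a column by name, use the col function or the $ interpolator (after importing spark.implicits._).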
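import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.lit
val spark = SparkSession.builder().getOrCreate() // or reuse the session you already have
import spark.implicits._ // enables the $"name" column syntax
val product1 = product.select($"_c1", $"_c2", lit(" ").as("newCol1"), $"_c4", $"_c5", $"_c5", lit(" ").as("newCol2"), $"c6")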
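If you prefer col over $, or want the new column to hold nulls rather than a single space, a minimal sketch could look like the following. The toy DataFrame, the local session settings, the object name EmptyColumnSketch and the helper withMiddleColumn are all assumptions made for illustration; only select, col, lit, cast and as are Spark API calls.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit}

object EmptyColumnSketch {
  // Hypothetical helper: places a null-valued string column between _c2 and _c4.
  // Every argument is a Column, so the select(cols: Column*) overload applies.
  def withMiddleColumn(df: DataFrame): DataFrame =
    df.select(
      col("_c1"),
      col("_c2"),
      lit(null).cast("string").as("newCol1"), // typed null instead of " "
      col("_c4")
    )

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; reuse your own session in practice.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("empty-column-sketch")
      .getOrCreate()
    import spark.implicits._

    // Toy stand-in for the `product` DataFrame from the question.
    val product = Seq(("a", "b", "c"), ("d", "e", "f")).toDF("_c1", "_c2", "_c4")

    val product1 = withMiddleColumn(product)
    product1.printSchema()
    product1.show()

    spark.stop()
  }
}
```

The cast matters because a bare lit(null) has no concrete data type, which some output formats reject; casting to string gives the column a definite type. Whether a space or a null is the better "empty" value depends on what reads the result downstream.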