Data type mismatch in column filter: differing types (array<string> and string)

Problem description (0 votes, 2 answers)

While trying to filter a column to check for an empty dataset, I get the following type mismatch error:

cannot resolve '(`sellers` = '[]')' due to data type mismatch: differing types in '(`sellers` = '[]')' (array<string> and string).;;

I tried the code below, but it doesn't work and throws the error above:

var sellersDFSelectSellers = sellersDF.select("sellers")
sellersDFSelectSellers = sellersDFSelectSellers.filter(col("sellers") === "[]")
Tags: scala apache-spark spark-streaming

2 Answers

0 votes

Try:

sellersDFSelectSellers.filter(col("sellers") === typedLit(Seq()))

I managed to reproduce it:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, typedLit}

val spark = SparkSession.builder
  .master("local")
  .appName("Spark app")
  .getOrCreate()

import spark.implicits._

case class MyClass(sellers: Seq[String])
val sellersDF = Seq(MyClass(Seq("a", "b")), MyClass(Seq("c", "d", "e")), MyClass(Seq())).toDS()
sellersDF.show()
//+---------+
//|  sellers|
//+---------+
//|   [a, b]|
//|[c, d, e]|
//|       []|
//+---------+
var sellersDFSelectSellers = sellersDF.select("sellers")
// The original filter (commented out below) throws:
// org.apache.spark.sql.AnalysisException: cannot resolve '(sellers = '[]')' due to data type mismatch: differing types in '(sellers = '[]')' (array<string> and string)
// sellersDFSelectSellers = sellersDFSelectSellers.filter(col("sellers") === "[]")
sellersDFSelectSellers = sellersDFSelectSellers.filter(col("sellers") === typedLit(Seq()))
sellersDFSelectSellers.show()
//+-------+
//|sellers|
//+-------+
//|     []|
//+-------+



0 votes

This code solved the problem:

import org.apache.spark.sql.functions.{col, size}

val sellersDFSelectSellers = sellersDF.select("sellers")
// size() returns the element count of an array column, so no literal of array type is needed
val emptySellers = sellersDFSelectSellers.filter(size(col("sellers")) === 0)
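For reference, here is a minimal, self-contained sketch of this size()-based approach. The sample data and the local SparkSession setup are assumptions for illustration; the question's real DataFrame only needs an array<string> column named sellers.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, size}

val spark = SparkSession.builder
  .master("local")
  .appName("Spark app")
  .getOrCreate()

import spark.implicits._

// Hypothetical sample data: one non-empty and one empty array<string> row
val sellersDF = Seq(Seq("a", "b"), Seq.empty[String]).toDF("sellers")

// size() counts the elements of the array column, so === 0 keeps only empty arrays
val emptySellers = sellersDF.select("sellers").filter(size(col("sellers")) === 0)
emptySellers.show()
// +-------+
// |sellers|
// +-------+
// |     []|
// +-------+
```

Compared with comparing against typedLit(Seq()), this avoids constructing an array literal entirely, which sidesteps the "Unsupported literal type" pitfalls of lit() on Scala collections.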