How to implement an EXISTS condition on a Spark DataFrame, as in SQL

Question · Votes: -2 · Answers: 1

I am curious how to implement something like SQL's EXISTS clause using the Spark DataFrame API.

apache-spark pyspark apache-spark-sql
1 Answer
0 votes
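A LEFT SEMI JOIN is the Spark equivalent of SQL's EXISTS. It keeps each row of the left DataFrame that has at least one match in the right DataFrame, and it returns only the left DataFrame's columns.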

// toDF on a Seq needs `import spark.implicits._` (already in scope in spark-shell).
val cityDF = Seq(("Delhi", "India"), ("Kolkata", "India"), ("Mumbai", "India"), ("Nairobi", "Kenya"), ("Colombo", "Srilanka")).toDF("City", "Country")

[Image: df1]

val CodeDF = Seq(("011", "Delhi"), ("022", "Mumbai"), ("033", "Kolkata"), ("044", "Chennai")).toDF("Code", "City")

[Image: df2]

// Keep only the cities that have a matching row in CodeDF; the result has cityDF's columns only.
val finalDF = cityDF.join(CodeDF, cityDF("City") === CodeDF("City"), "left_semi")

[Image: df3]
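To make the equivalence concrete, here is a minimal sketch that runs the same data through a SQL EXISTS subquery and also shows the NOT EXISTS counterpart using a "left_anti" join. The object name `ExistsSketch`, the view names `cities` and `codes`, and the local SparkSession setup are assumptions added for illustration; they are not part of the answer above.

```scala
import org.apache.spark.sql.SparkSession

object ExistsSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session for experimenting; in spark-shell `spark` already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("exists-sketch")
      .getOrCreate()
    import spark.implicits._

    val cityDF = Seq(("Delhi", "India"), ("Kolkata", "India"), ("Mumbai", "India"),
      ("Nairobi", "Kenya"), ("Colombo", "Srilanka")).toDF("City", "Country")
    val codeDF = Seq(("011", "Delhi"), ("022", "Mumbai"), ("033", "Kolkata"),
      ("044", "Chennai")).toDF("Code", "City")

    // Hypothetical view names, used only so the SQL below can refer to the data.
    cityDF.createOrReplaceTempView("cities")
    codeDF.createOrReplaceTempView("codes")

    // SQL form of the left semi join: Delhi, Kolkata and Mumbai qualify.
    val existsDF = spark.sql(
      """SELECT * FROM cities c
        |WHERE EXISTS (SELECT 1 FROM codes k WHERE k.City = c.City)""".stripMargin)

    // NOT EXISTS counterpart: a left anti join keeps the unmatched rows
    // (Nairobi and Colombo here).
    val notExistsDF = cityDF.join(codeDF, cityDF("City") === codeDF("City"), "left_anti")

    existsDF.show()
    notExistsDF.show()

    spark.stop()
  }
}
```

Row order in the printed output is not guaranteed. Chennai never appears in either result because semi and anti joins only ever return rows from the left DataFrame.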