How to implement a SQL-like EXISTS condition in a Spark DataFrame

社会主义新天地 submitted on 2020-01-17 17:15:03

Question


I am curious to know how I can implement a SQL-like EXISTS clause the Spark DataFrame way.


Answer 1:


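In the Spark DataFrame API, a LEFT SEMI JOIN plays the role of SQL's EXISTS predicate: it keeps each row of the left DataFrame that has at least one match in the right DataFrame, and it returns only the left DataFrame's columns.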

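// toDF needs these implicits; spark-shell imports them already (assumes a SparkSession named spark).
import spark.implicits._

val cityDF = Seq(("Delhi","India"),("Kolkata","India"),("Mumbai","India"),("Nairobi","Kenya"),("Colombo","Srilanka")).toDF("City","Country")

val CodeDF = Seq(("011","Delhi"),("022","Mumbai"),("033","Kolkata"),("044","Chennai")).toDF("Code","City")

// Keeps the cityDF rows whose City also appears in CodeDF: Delhi, Kolkata and Mumbai.
val finalDF = cityDF.join(CodeDF, cityDF("City") === CodeDF("City"), "left_semi")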
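
To make the correspondence with SQL concrete, the sketch below runs the same data through an explicit EXISTS subquery via spark.sql, and uses a left_anti join as the DataFrame counterpart of NOT EXISTS. It is a minimal sketch, not a definitive implementation: it assumes a local SparkSession, and the object name ExistsSketch, the helper names, and the temp view names cities and codes are all made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical object and helper names, used only for this illustration.
object ExistsSketch {

  // Builds the same sample data as the answer above.
  def sampleData(spark: SparkSession): (DataFrame, DataFrame) = {
    import spark.implicits._
    val cityDF = Seq(("Delhi", "India"), ("Kolkata", "India"), ("Mumbai", "India"),
      ("Nairobi", "Kenya"), ("Colombo", "Srilanka")).toDF("City", "Country")
    val codeDF = Seq(("011", "Delhi"), ("022", "Mumbai"), ("033", "Kolkata"),
      ("044", "Chennai")).toDF("Code", "City")
    (cityDF, codeDF)
  }

  // The SQL form: a correlated EXISTS subquery over two temp views
  // (the view names "cities" and "codes" are assumptions).
  def existsViaSql(spark: SparkSession, cityDF: DataFrame, codeDF: DataFrame): DataFrame = {
    cityDF.createOrReplaceTempView("cities")
    codeDF.createOrReplaceTempView("codes")
    spark.sql(
      """SELECT * FROM cities c
        |WHERE EXISTS (SELECT 1 FROM codes k WHERE k.City = c.City)""".stripMargin)
  }

  // The DataFrame counterpart of NOT EXISTS: keep left rows with no match on the right.
  def notExists(cityDF: DataFrame, codeDF: DataFrame): DataFrame =
    cityDF.join(codeDF, cityDF("City") === codeDF("City"), "left_anti")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("exists-sketch").getOrCreate()
    val (cityDF, codeDF) = sampleData(spark)
    existsViaSql(spark, cityDF, codeDF).show() // Delhi, Kolkata, Mumbai (row order not guaranteed)
    notExists(cityDF, codeDF).show()           // Nairobi, Colombo (row order not guaranteed)
    spark.stop()
  }
}
```

The EXISTS query should return the same rows as the left_semi join above, while left_anti returns the remaining cities.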




Answer 2:


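If the values to compare against fit in a small local collection, such as a list you would otherwise broadcast, you can filter with isin instead (list here is a Scala Seq, expanded as varargs):

import org.apache.spark.sql.functions.col
df.filter(col("columnName").isin(list: _*))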
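
As a rough illustration of this answer, the sketch below filters a DataFrame with isin against a small in-memory list. It also shows a possible alternative for when the small side is itself a DataFrame: a left_semi join with a broadcast hint. It assumes a local SparkSession; the object name IsinSketch, the helper names, and the sample values are hypothetical, and whether Spark actually broadcasts the right side is up to its planner.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{broadcast, col}

// Hypothetical object and helper names, used only for this illustration.
object IsinSketch {

  // Keeps rows whose `column` value is one of `allowed`; the Seq is expanded as varargs.
  def filterByList(df: DataFrame, column: String, allowed: Seq[String]): DataFrame =
    df.filter(col(column).isin(allowed: _*))

  // Alternative when the small side is a DataFrame: a semi join with a broadcast hint.
  def semiJoinBroadcast(left: DataFrame, small: DataFrame, key: String): DataFrame =
    left.join(broadcast(small), Seq(key), "left_semi")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("isin-sketch").getOrCreate()
    import spark.implicits._

    val cityDF = Seq(("Delhi", "India"), ("Nairobi", "Kenya"), ("Colombo", "Srilanka"))
      .toDF("City", "Country")
    val wanted = Seq("Delhi", "Chennai") // small local list of values to match

    filterByList(cityDF, "City", wanted).show()                      // only the Delhi row
    semiJoinBroadcast(cityDF, wanted.toDF("City"), "City").show()    // same row via a semi join
    spark.stop()
  }
}
```

isin suits a handful of literal values because they are embedded in the query itself; for larger lookup sets, the join form is usually the safer choice.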



Source: https://stackoverflow.com/questions/59626430/how-to-implement-exists-condition-as-like-sql-in-spark-dataframe
