How to use a NOT IN clause in a filter condition in Spark

一整个雨季 2021-02-04 06:09

I want to filter a column of an RDD source:

val source = sql("SELECT * from sample.source").rdd.map(_.mkString(","))
val destination = sql("select * from


        
3 answers
  •  青春惊慌失措
    2021-02-04 07:05

    You can do something similar in Java:

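    ds = ds.filter(functions.not(functions.col(COLUMN_NAME).isin(exclusionSet.toArray())));
    

    where exclusionSet is a java.util.Set of the values to exclude from your dataset. Because isin takes varargs (Object...), the set is passed as an array via toArray(); passing the Set itself would be treated as a single value rather than as a list of values.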
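
For the RDD of comma-joined strings built in the question, the same NOT IN idea can also be written as an ordinary predicate on each row string. The following is a minimal, Spark-free sketch in plain Java, assuming the column to test sits at a known index and that values contain no embedded commas. The class name NotInFilterSketch, the helpers notIn and filterRows, and the sample rows are hypothetical and do not come from the original post.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class NotInFilterSketch {

    // Hypothetical helper: keeps a comma-joined row only when the value at
    // columnIndex is NOT contained in exclusionSet.
    static Predicate<String> notIn(int columnIndex, Set<String> exclusionSet) {
        return row -> {
            // The -1 limit keeps trailing empty fields, so indexes stay stable.
            String[] fields = row.split(",", -1);
            return !exclusionSet.contains(fields[columnIndex]);
        };
    }

    // Hypothetical helper: applies the NOT IN predicate to a list of rows.
    static List<String> filterRows(List<String> rows, int columnIndex, Set<String> exclusionSet) {
        return rows.stream()
                .filter(notIn(columnIndex, exclusionSet))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Sample data, made up for illustration only.
        List<String> rows = Arrays.asList("1,alice", "2,bob", "3,carol");
        Set<String> exclusionSet = new HashSet<>(Arrays.asList("2"));

        // Drops every row whose first column is in the exclusion set.
        System.out.println(filterRows(rows, 0, exclusionSet));
    }
}
```

With Spark on the classpath, the same check could be written as the lambda given to JavaRDD.filter; that wiring is left out here so the sketch runs with the JDK alone.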
