I have two tables, each with a single key column. The keys in table a are a subset of the keys in table b. I need to select the keys from table b that are NOT in table a.
Here is a
I tried a LEFT SEMI JOIN as a substitute for an IN subquery on CDH 5.7.0 with Spark 1.6.
The left semi join gives wrong results: its output does not match what an IN subquery would return.
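
To make the goal concrete, here is a minimal sketch of the two query shapes on made-up data. It runs the SQL through Python's built-in sqlite3 only so that it can be tried without a cluster; the sample keys are invented, and it is an assumption that the same LEFT OUTER JOIN ... IS NULL form behaves the same way in Spark 1.6 SQL. A semi join keeps the keys of b that do appear in a (the IN case), whereas NOT IN needs an anti join.

```python
import sqlite3

# In-memory database standing in for the two single-column tables.
# The sample keys are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (key INTEGER)")
conn.execute("CREATE TABLE b (key INTEGER)")
conn.executemany("INSERT INTO a VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO b VALUES (?)", [(1,), (2,), (3,), (4,)])

# Semi-join semantics (what IN gives): keys of b that DO appear in a.
in_a = [row[0] for row in conn.execute(
    "SELECT b.key FROM b "
    "WHERE EXISTS (SELECT 1 FROM a WHERE a.key = b.key) "
    "ORDER BY b.key"
)]

# Anti-join semantics (what NOT IN gives for non-null keys):
# keep the rows of b for which the outer join found no match in a.
not_in_a = [row[0] for row in conn.execute(
    "SELECT b.key FROM b "
    "LEFT OUTER JOIN a ON b.key = a.key "
    "WHERE a.key IS NULL "
    "ORDER BY b.key"
)]

print(in_a)      # [1, 2]
print(not_in_a)  # [3, 4]
```

If the keys can be NULL, NOT IN and the outer-join form can differ, so this sketch only covers non-null keys.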