I have a data frame with four fields. one of the field name is Status and i am trying to use a OR condition in .filter for a dataframe . I tried below queries but no luck.>
This question has been answered but for future reference, I would like to mention that, in the context of this question, the where and filter methods in Dataset/Dataframe supports two syntaxes:
The SQL string parameters:
df2 = df1.filter(("Status = 2 or Status = 3"))
and Col based parameters (mentioned by @David ):
df2 = df1.filter($"Status" === 2 || $"Status" === 3)
It seems the OP'd combined these two syntaxes. Personally, I prefer the first syntax because it's cleaner and more generic.