I am using Spark 1.5.
I have two DataFrames of the form:
scala> libriFirstTable50Plus3DF
res1: org.apache.spark.sql.DataFrame = [basket_id: string
In my case, the error was caused by broadcasting a large dataframe:
df.join(broadcast(largeDF))
So, based on the previous answers, I fixed it by removing the broadcast:
df.join(largeDF)
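If the broadcast is actually wanted (for instance, the table only barely exceeds the timeout), an alternative to dropping the hint is to adjust the relevant Spark SQL settings. A minimal sketch, assuming a `sqlContext` as in Spark 1.5 and the `df`/`largeDF` names from above; `spark.sql.broadcastTimeout` and `spark.sql.autoBroadcastJoinThreshold` are standard Spark SQL configuration keys, and the values here are only illustrative:

```scala
import org.apache.spark.sql.functions.broadcast

// Option 1: keep the explicit broadcast but raise the timeout
// (spark.sql.broadcastTimeout defaults to 300 seconds).
sqlContext.setConf("spark.sql.broadcastTimeout", "1200")
df.join(broadcast(largeDF))

// Option 2: disable automatic broadcast joins entirely, so Spark
// falls back to a shuffle join regardless of table size.
sqlContext.setConf("spark.sql.autoBroadcastJoinThreshold", "-1")
df.join(largeDF)
```

Which option is right depends on whether `largeDF` genuinely fits in each executor's memory; if it does not, removing the broadcast, as above, is the correct fix.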