Why does join fail with “java.util.concurrent.TimeoutException: Futures timed out after [300 seconds]”?

Happy的楠姐 2020-11-30 19:33

I am using Spark 1.5.

I have two dataframes of the form:

scala> libriFirstTable50Plus3DF
res1: org.apache.spark.sql.DataFrame = [basket_id: string

4 Answers
    小蘑菇 2020-11-30 19:57

    In my case, it was caused by a broadcast over a large dataframe:

    df.join(broadcast(largeDF))
    

    So, based on the previous answers, I fixed it by removing the broadcast:

    df.join(largeDF)
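
    For reference, here is a minimal, self-contained sketch of the two snippets above as they would run in a recent spark-shell (Spark 2.x or later, where a spark session is predefined; the question itself mentions Spark 1.5). The sample rows and the basket_id join key are illustrative assumptions, not taken from the question. Note that the 300 seconds in the error message matches the default value of spark.sql.broadcastTimeout.

    import org.apache.spark.sql.functions.broadcast
    import spark.implicits._

    // Two toy dataframes standing in for df and largeDF above.
    val df      = Seq(("b1", 1), ("b2", 2)).toDF("basket_id", "qty")
    val largeDF = Seq(("b1", "x"), ("b2", "y")).toDF("basket_id", "item")

    // Problematic pattern: forcing a broadcast of a dataframe that is too big
    // to ship to every executor before spark.sql.broadcastTimeout expires:
    //   df.join(broadcast(largeDF), "basket_id")

    // Fix described in this answer: drop the broadcast hint so Spark can fall
    // back to a shuffle-based join.
    val joined = df.join(largeDF, "basket_id")
    joined.show()

    // Alternative knobs (assumptions, not part of this answer):
    //   spark.conf.set("spark.sql.broadcastTimeout", "600")          // raise the timeout
    //   spark.conf.set("spark.sql.autoBroadcastJoinThreshold", "-1") // disable auto-broadcast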
    
