I am using Spark 1.5.
I have two dataframes of the form:
```scala
scala> libriFirstTable50Plus3DF
res1: org.apache.spark.sql.DataFrame = [basket_id: string
```
This happens because Spark tries to do a Broadcast Hash Join, and one of the DataFrames is very large, so broadcasting it takes longer than the configured timeout.
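If you want to confirm that a broadcast is what Spark is planning, you can print the physical plan with `explain()`. A minimal sketch, where `secondDF` is a placeholder for the second DataFrame (the question only shows `libriFirstTable50Plus3DF`) and `"basket_id"` is assumed from the schema above:

```scala
// Look for a "BroadcastHashJoin" operator in the printed physical plan;
// its presence confirms Spark intends to broadcast one side of the join.
libriFirstTable50Plus3DF.join(secondDF, "basket_id").explain()
```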
You can either:

1. Increase the timeout via `spark.sql.broadcastTimeout`, e.g. `spark.conf.set("spark.sql.broadcastTimeout", 36000)`, or
2. `persist()` both DataFrames, in which case Spark will use a shuffle join instead (reference from here); see the Scala sketch after the PySpark example below.

In PySpark, you can set the config when you build the Spark session, in the following manner:
```python
from pyspark.sql import SparkSession

spark = SparkSession \
    .builder \
    .appName("Your App") \
    .config("spark.sql.broadcastTimeout", "36000") \
    .getOrCreate()
```
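And for the Scala side of the question, a minimal sketch of the `persist()` route (option 2). Again, `secondDF` and the `"basket_id"` join key are placeholders, since the question only shows `libriFirstTable50Plus3DF` and the start of its schema:

```scala
// Persist both sides before joining; once both are materialized,
// Spark uses a shuffle join instead of trying to broadcast one side.
libriFirstTable50Plus3DF.persist()
secondDF.persist()

// Trigger an action so the persisted data is actually cached
libriFirstTable50Plus3DF.count()
secondDF.count()

// Join as usual; "basket_id" is assumed from the schema shown above
val joined = libriFirstTable50Plus3DF.join(secondDF, "basket_id")
```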