How does computing table stats in Hive or Impala speed up queries in Spark SQL?

心在旅途 2020-12-28 22:48
To improve performance (e.g. for joins), it is recommended to compute table statistics first.
In Hive I can do:
analyze table  c         

        
3 answers

星月不相逢 (original poster) 2020-12-28 23:29
              

            
            
                        
From what I understand, COMPUTE STATS in Impala is the newer implementation and frees you from tuning Hive settings.

From the official documentation:


  If you use the Hive-based methods of gathering statistics, see the
  Hive wiki for information about the required configuration on the Hive
  side. Cloudera recommends using the Impala COMPUTE STATS statement to
  avoid potential configuration and scalability issues with the
  statistics-gathering process.
  
  If you run the Hive statement ANALYZE TABLE COMPUTE STATISTICS FOR
  COLUMNS, Impala can only use the resulting column statistics if the
  table is unpartitioned. Impala cannot use Hive-generated column
  statistics for a partitioned table.
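
For reference, the statements named in the quote look roughly like this. This is only a sketch: the table name `sales` is made up for illustration, and the exact options available depend on your Hive and Impala versions.

```sql
-- Hypothetical table name: sales

-- Hive: gather table-level statistics, then column-level statistics
ANALYZE TABLE sales COMPUTE STATISTICS;
ANALYZE TABLE sales COMPUTE STATISTICS FOR COLUMNS;

-- Impala: gather table and column statistics in one statement
COMPUTE STATS sales;

-- Impala: inspect what was gathered
SHOW TABLE STATS sales;
SHOW COLUMN STATS sales;
```

Running the two `SHOW` statements after `COMPUTE STATS` is a quick way to check that row counts and column statistics were actually populated.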


Useful link:
https://www.cloudera.com/documentation/enterprise/5-5-x/topics/impala_perf_stats.html
    
             
                                                        
            
            
              
                