HBase: quickly count number of rows

轮回少年 2020-12-04 13:25

Right now I implement the row count over a ResultScanner like this:

for (Result rs = scanner.next(); rs != null; rs = scanner.next()) {
    number++;
}


        
12 answers
  • 2020-12-04 13:52

    If you're using a scanner, try to have it return as few qualifiers as possible. In fact, the qualifier(s) you do return should be the smallest (in byte size) you have available. This will speed up your scan tremendously.

    Unfortunately this will only scale so far (millions to billions?). To take it further, you can do this in real time, but you will first need to run a MapReduce job to count all the existing rows.

    Store the MapReduce output in a cell in HBase. Every time you add a row, increment the counter by 1. Every time you delete a row, decrement the counter (see the sketch below).

    When you need to access the number of rows in real time, you read that field in HBase.

    Otherwise there is no fast way to count the rows that scales. You can only count so fast.
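    A minimal sketch of that counter-cell idea, using the standard Table.incrementColumnValue API. The table name "counters", column family "c", qualifier "rowcount", and the data-table name "mytable" are made-up placeholders, not anything this answer prescribes:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class RowCountCounter {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table counters = conn.getTable(TableName.valueOf("counters"))) {

                byte[] row = Bytes.toBytes("mytable");   // one counter row per data table
                byte[] cf  = Bytes.toBytes("c");
                byte[] q   = Bytes.toBytes("rowcount");

                // The counter is seeded once with the MapReduce total; afterwards:
                // increment after every Put of a new row into the data table
                counters.incrementColumnValue(row, cf, q, 1L);

                // decrement after every Delete of a row from the data table
                counters.incrementColumnValue(row, cf, q, -1L);

                // read the current count in real time
                long count = Bytes.toLong(counters.get(new Get(row)).getValue(cf, q));
                System.out.println("rows: " + count);
            }
        }
    }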

  • 2020-12-04 13:53

    You can use coprocessors, which are available since HBase 0.92. See Coprocessor and AggregateProtocol and example.
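    A hedged sketch of the coprocessor-based count using AggregationClient (HBase 1.x-era API; the class has moved between modules across releases, so check your version). It assumes the AggregateImplementation coprocessor is already enabled on the table's regions; "mytable" and "cf" are placeholder names:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.client.coprocessor.AggregationClient;
    import org.apache.hadoop.hbase.client.coprocessor.LongColumnInterpreter;
    import org.apache.hadoop.hbase.util.Bytes;

    public class CoprocessorRowCount {
        public static void main(String[] args) throws Throwable {
            // Prerequisite: org.apache.hadoop.hbase.coprocessor.AggregateImplementation
            // must be loaded on the table's regions, e.g. via hbase.coprocessor.region.classes
            // in hbase-site.xml or as a table coprocessor attribute.
            Configuration conf = HBaseConfiguration.create();
            AggregationClient aggregationClient = new AggregationClient(conf);

            // Restrict the scan to a single (small) column family to keep it cheap
            Scan scan = new Scan();
            scan.addFamily(Bytes.toBytes("cf"));

            long rowCount = aggregationClient.rowCount(
                    TableName.valueOf("mytable"), new LongColumnInterpreter(), scan);
            System.out.println("rows: " + rowCount);
        }
    }

    The count is computed region by region on the servers, so no row data has to cross the network to the client.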

  • 2020-12-04 13:53

    Go to the HBase home directory and run this command:

    ./bin/hbase org.apache.hadoop.hbase.mapreduce.RowCounter 'namespace:tablename'

    This will launch a MapReduce job, and the output will show the number of records in the HBase table.

  • 2020-12-04 13:54

    Use RowCounter in HBase. RowCounter is a MapReduce job to count all the rows of a table. This is a good utility to use as a sanity check to ensure that HBase can read all the blocks of a table if there are any concerns of metadata inconsistency. It will run the MapReduce all in a single process, but it will run faster if you have a MapReduce cluster in place for it to exploit.


    $ hbase org.apache.hadoop.hbase.mapreduce.RowCounter <tablename>
    
    Usage: RowCounter [options] 
        <tablename> [          
            --starttime=[start] 
            --endtime=[end] 
            [--range=[startKey],[endKey]] 
            [<column1> <column2>...]
        ]
    
  • 2020-12-04 13:54

    Two ways worked for me to get the row count of an HBase table quickly.

    Scenario #1

    If the HBase table is small, log in to the HBase shell with a valid user and execute:

    >count '<tablename>'
    

    Example

    >count 'employee'
    
    6 row(s) in 0.1110 seconds
    

    Scenario #2

    If the HBase table is large, execute the built-in RowCounter MapReduce job. Log in to the Hadoop machine with a valid user and execute:

    $HBASE_HOME/bin/hbase org.apache.hadoop.hbase.mapreduce.RowCounter '<tablename>'
    

    Example:

    $HBASE_HOME/bin/hbase org.apache.hadoop.hbase.mapreduce.RowCounter 'employee'
    
         ....
         ....
         ....
         Virtual memory (bytes) snapshot=22594633728
                    Total committed heap usage (bytes)=5093457920
            org.apache.hadoop.hbase.mapreduce.RowCounter$RowCounterMapper$Counters
                    ROWS=6
            File Input Format Counters
                    Bytes Read=0
            File Output Format Counters
                    Bytes Written=0
    
  • 2020-12-04 13:55

    Use the HBase RowCounter MapReduce job that's included with HBase.
