HBase: quickly count the number of rows

轮回少年 2020-12-04 13:25

Right now I count rows by iterating over a ResultScanner like this:

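// number is a counter declared earlier, e.g. long number = 0;
// Pull one Result at a time until the scanner is exhausted.
for (Result rs = scanner.next(); rs != null; rs = scanner.next()) {
    number++;
}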
        
12 answers
  •  夕颜 (original poster)
    2020-12-04 13:51

    You can use the count command in the HBase shell to count the number of rows. But yes, counting the rows of a large table can be slow.

    count 'tablename' [interval]

    Return value is the number of rows.

    This operation may take a LONG time (Run ‘$HADOOP_HOME/bin/hadoop jar hbase.jar rowcount’ to run a counting mapreduce job). Current count is shown every 1000 rows by default. Count interval may be optionally specified. Scan caching is enabled on count scans by default. Default cache size is 10 rows. If your rows are small in size, you may want to increase this parameter.

    Examples:

    hbase> count 't1'
    
    hbase> count 't1', INTERVAL => 100000
    
    hbase> count 't1', CACHE => 1000
    
    hbase> count 't1', INTERVAL => 10, CACHE => 1000
    

    The same commands can also be run on a table reference. Suppose you had a reference t to table 't1'; the corresponding commands would be:

    hbase> t.count
    
    hbase> t.count INTERVAL => 100000
    
    hbase> t.count CACHE => 1000
    
    hbase> t.count INTERVAL => 10, CACHE => 1000
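
To get a feel for what INTERVAL and CACHE do in the commands above, here is a rough sketch in plain Java. It does not use any HBase classes: RowCountSketch, CountResult and count are hypothetical names, and the "fetch" is only a simplified model of one round trip that returns up to CACHE rows. It ignores the final empty fetch a real scanner would need and everything else the real client does.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: models a count scan with plain Java, no HBase classes involved.
final class RowCountSketch {

    /** Outcome of a simulated count: rows seen, fetches made, progress lines recorded. */
    static final class CountResult {
        final long rows;
        final long fetches;
        final List<String> progress;

        CountResult(long rows, long fetches, List<String> progress) {
            this.rows = rows;
            this.fetches = fetches;
            this.progress = progress;
        }
    }

    /**
     * Counts totalRows rows, pulling at most cacheSize rows per simulated fetch
     * and recording one progress line every `interval` rows.
     */
    static CountResult count(long totalRows, int cacheSize, long interval) {
        if (totalRows < 0 || cacheSize <= 0 || interval <= 0) {
            throw new IllegalArgumentException("totalRows >= 0, cacheSize > 0, interval > 0 required");
        }
        long rows = 0;
        long fetches = 0;
        List<String> progress = new ArrayList<>();
        while (rows < totalRows) {
            // One simulated round trip returns up to cacheSize rows.
            long batch = Math.min(cacheSize, totalRows - rows);
            fetches++;
            for (long i = 0; i < batch; i++) {
                rows++;
                if (rows % interval == 0) {
                    progress.add("Current count: " + rows);
                }
            }
        }
        return new CountResult(rows, fetches, progress);
    }

    public static void main(String[] args) {
        CountResult small = count(25_000, 10, 1_000);    // cache size of 10 rows
        CountResult large = count(25_000, 1_000, 1_000); // like CACHE => 1000
        System.out.println(small.rows + " rows in " + small.fetches + " fetches"); // 25000 rows in 2500 fetches
        System.out.println(large.rows + " rows in " + large.fetches + " fetches"); // 25000 rows in 25 fetches
    }
}
```

In this model the row count is the same either way; only the number of fetches changes, which is why raising CACHE helps when rows are small. INTERVAL only affects how often progress is reported.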
    
