HBase: quickly count the number of rows

轮回少年 2020-12-04 13:25

Right now I count rows by iterating over a ResultScanner, like this:

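long number = 0; // assumed declaration; not shown in the original snippet
// Count rows by pulling every Result from the scanner until it is exhausted.
for (Result rs = scanner.next(); rs != null; rs = scanner.next()) {
    number++;
}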
        
12 Answers
  •  鱼传尺愫
    2020-12-04 13:52

    If you're using a scanner, have it return as few qualifiers as possible. Ideally, the qualifier(s) you do return should be the smallest (in bytes) that you have available. This will speed up your scan tremendously.
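
To make the point about narrow scans concrete, here is a minimal sketch in plain Java. It does not call HBase: the table is modeled as an in-memory map, and the class and method names (`NarrowScanSketch`, `fullScan`, `narrowScan`, `sampleTable`) are made up for illustration. It only compares how many value bytes a count would have to pull back when every qualifier is returned versus only the smallest one per row. In HBase itself, `FirstKeyOnlyFilter` is documented as a way to make row counting cheaper along the same lines.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// In-memory stand-in for a table: row key -> (qualifier -> value bytes).
// This models the idea only; it is not HBase client code.
class NarrowScanSketch {

    /** Rows seen by a simulated scan and the value bytes it had to return. */
    static final class ScanStats {
        final long rows;
        final long bytesReturned;

        ScanStats(long rows, long bytesReturned) {
            this.rows = rows;
            this.bytesReturned = bytesReturned;
        }
    }

    /** Simulated scan that returns every qualifier of every row. */
    static ScanStats fullScan(Map<String, Map<String, byte[]>> table) {
        long rows = 0;
        long bytes = 0;
        for (Map<String, byte[]> row : table.values()) {
            rows++;
            for (byte[] value : row.values()) {
                bytes += value.length;
            }
        }
        return new ScanStats(rows, bytes);
    }

    /** Simulated scan that returns only the smallest cell of each row. */
    static ScanStats narrowScan(Map<String, Map<String, byte[]>> table) {
        long rows = 0;
        long bytes = 0;
        for (Map<String, byte[]> row : table.values()) {
            rows++;
            int smallest = Integer.MAX_VALUE;
            for (byte[] value : row.values()) {
                smallest = Math.min(smallest, value.length);
            }
            if (!row.isEmpty()) {
                bytes += smallest;
            }
        }
        return new ScanStats(rows, bytes);
    }

    /** Hypothetical sample data: each row has a 1-byte flag and a 1024-byte payload. */
    static Map<String, Map<String, byte[]>> sampleTable(int rowCount) {
        Map<String, Map<String, byte[]>> table = new LinkedHashMap<>();
        for (int i = 0; i < rowCount; i++) {
            Map<String, byte[]> row = new LinkedHashMap<>();
            row.put("flag", new byte[1]);
            row.put("payload", new byte[1024]);
            table.put("row-" + i, row);
        }
        return table;
    }

    public static void main(String[] args) {
        Map<String, Map<String, byte[]>> table = sampleTable(3);
        ScanStats full = fullScan(table);
        ScanStats narrow = narrowScan(table);
        // Both scans see the same 3 rows; only the returned bytes differ.
        System.out.println(full.rows + " rows, " + full.bytesReturned + " bytes");     // 3 rows, 3075 bytes
        System.out.println(narrow.rows + " rows, " + narrow.bytesReturned + " bytes"); // 3 rows, 3 bytes
    }
}
```

The row count is identical in both cases, which is why a count never needs more than one small cell per row.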

    Unfortunately, this will only scale so far (millions to billions of rows?). To take it further, you can serve the count in real time, but you will first need to run a MapReduce job to count all rows.

    Store the MapReduce output in a cell in HBase. Every time you add a row, increment the counter by 1. Every time you delete a row, decrement it by 1.

    When you need the number of rows in real time, read that cell from HBase.
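
The counter approach described above can be sketched roughly as follows. This is a plain-Java model under stated assumptions, not HBase client code: the set stands in for the table, the `rowCount` field stands in for the counter cell, and `CountedTableSketch` with its methods is a hypothetical name chosen for illustration. Against a real cluster, the counter cell could be maintained with something like `Table.incrementColumnValue`.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Model of "count once, then keep a counter in step with writes".
class CountedTableSketch {
    private final Set<String> rows = new HashSet<>(); // stand-in for the table
    private long rowCount;                            // stand-in for the counter cell

    /** Seeds the counter with a one-off full count (the batch job in the answer). */
    CountedTableSketch(Collection<String> existingRowKeys) {
        rows.addAll(existingRowKeys);
        rowCount = rows.size();
    }

    /** Adds a row; the counter moves only if the row key is genuinely new. */
    synchronized void putRow(String rowKey) {
        if (rows.add(rowKey)) {
            rowCount++;
        }
    }

    /** Deletes a row; the counter moves only if the row actually existed. */
    synchronized void deleteRow(String rowKey) {
        if (rows.remove(rowKey)) {
            rowCount--;
        }
    }

    /** Reads the maintained count without scanning any rows. */
    synchronized long count() {
        return rowCount;
    }

    public static void main(String[] args) {
        CountedTableSketch table = new CountedTableSketch(Arrays.asList("a", "b"));
        table.putRow("c");       // new row: count becomes 3
        table.putRow("c");       // same key again: count stays 3
        table.deleteRow("a");    // existing row: count becomes 2
        table.deleteRow("zzz");  // unknown key: count stays 2
        System.out.println(table.count()); // prints 2
    }
}
```

One design point the sketch makes explicit: a write to an existing row key is an update rather than a new row, so the application has to know whether a row already existed before it adjusts the counter, otherwise the stored count drifts.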

    Otherwise, there is no fast way to count rows that also scales. You can only count so fast.
