hbase

Count number of records in a column family in an HBase table

Submitted by 喜你入骨 on 2019-12-23 14:02:13
Question: I'm looking for an HBase shell command that will count the number of records in a specified column family. I know I can run:

echo "scan 'table_name'" | hbase shell | grep column_family_name | wc -l

However, this runs much slower than the standard counting command count 'table_name', CACHE => 50000 (because of the CACHE => 50000), and worse, it doesn't return the real number of records but something like the total number of cells (if I'm not mistaken?) in the specified column family.
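
One alternative worth sketching (hedged: based on the RowCounter usage documented for HBase 1.x; the table and family names below are placeholders): the RowCounter MapReduce job that ships with HBase accepts optional column arguments, so it can count rows that have at least one cell in a given family, with MapReduce parallelism instead of a single client-side scan:

# Count rows that contain at least one cell in family 'column_family_name'
hbase org.apache.hadoop.hbase.mapreduce.RowCounter 'table_name' 'column_family_name'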

Thrift HBase client - support for filters and coprocessors

Submitted by 走远了吗. on 2019-12-23 13:05:32
Question: Sadly, my HBase client language is Python; I am using happybase for now, which is based on Thrift AFAIK. I know Thrift so far still does not support filters or coprocessors (correct me if I am wrong here). Can someone point me to any Jira items where I can track the plan/progress, if there is one? The only ones I can find are from "HBase in Action": "Thrift server to match the new Java API": https://issues.apache.org/jira/browse/HBASE-1744 "Make Endpoint Coprocessors Available from Thrift": https:/

Encrypting HBase at-rest data in the cloud

Submitted by ╄→гoц情女王★ on 2019-12-23 12:14:45
Question: I am pretty new to HBase and have been assigned the task of moving our infrastructure to the cloud. Our HBase data contains some customer information and hence needs to be encrypted at rest. I am already reading this: Transparent Encryption of Data At Rest (http://hbase.apache.org/book/ch08s03.html#hbase.encryption.server). It looks like a good solution except for the fact that we have to store the password as plain text on each node. Is there a way to avoid this? Like storing the password at just one
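
For context on where that plain-text password comes from, a minimal sketch of the server-side setup the referenced chapter describes (the alias, keystore path, and password handling are placeholders; check the exact property names against your HBase version):

# Create a JCEKS keystore holding the cluster master key
keytool -genseckey -keyalg AES -keysize 128 -storetype jceks \
    -alias hbase -keystore /etc/hbase/conf/hbase.jks

# hbase-site.xml then points hbase.crypto.keyprovider at
# KeyStoreKeyProvider, and hbase.crypto.keyprovider.parameters embeds
# the keystore URI, including the store password in plain text,
# which is exactly the part the question wants to avoid.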

How to get a region in HBase which is stuck in the FAILED_OPEN state?

Submitted by 我是研究僧i on 2019-12-23 12:07:49
Question: HBase hbck runs successfully and reports no inconsistency, but three regions are stuck in transition: 2 of the 3 are in CLOSED state and 1 is in FAILED_OPEN state (all three regions belong to a single table). Since HBase is in a consistent state there is no issue with normal HBase operation, but I am not able to run the balancer while regions are stuck in transition. How do I remove/move these regions out of transition? I tried the command below before posting this question. hbase hbck
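
A commonly tried follow-up (hedged: these hbck1 options apply to HBase 1.x, and the encoded region name below is a placeholder taken from the master UI):

# Ask hbck to repair region assignments
hbase hbck -fixAssignments

# Or manually assign a single stuck region from the HBase shell
echo "assign 'ENCODED_REGION_NAME'" | hbase shell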

Why is an exported HBase table 4 times bigger than the original?

Submitted by 孤者浪人 on 2019-12-23 12:01:22
Question: I need to back up an HBase table before updating to a newer version. I decided to export the table to HDFS with the standard Export tool and then move it to the local file system. For some reason the exported table is 4 times larger than the original one:

hdfs dfs -du -h
1.4T  backup-my-table

hdfs dfs -du -h /hbase/data/default/
417G  my-table

What can be the reason? Is it somehow related to compression? P.S. Maybe the way I made the backup matters. First I made a snapshot from the target table, then cloned it to a copy
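
One plausible explanation (an assumption, not confirmed by the question): on-disk HFiles are usually block-compressed and encoded, while Export writes row data out to SequenceFiles uncompressed by default, so the blow-up can come from losing that compression. A hedged sketch of re-enabling compression on the Export job via standard MapReduce output properties (table name and backup path are placeholders):

# Export with compressed SequenceFile output
hbase org.apache.hadoop.hbase.mapreduce.Export \
    -D mapreduce.output.fileoutputformat.compress=true \
    -D mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec \
    my-table /backup/my-table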

Bulk-loading data from HDFS into HBase with BulkLoad

Submitted by 老子叫甜甜 on 2019-12-23 11:46:32
When writing data to HBase, the common approaches are the HBase API and MapReduce batch imports. With these approaches, a record written to HBase roughly follows the flow shown in the figure: the data is first written to the write-ahead log (WAL), then to the in-memory MemStore, and finally flushed to an HFile. Writing this way does not lose data and guarantees ordering, but when a large volume of data has to be written, throughput is hard to sustain. This post therefore introduces a higher-performance write path: BulkLoad.

Bulk-writing data with BulkLoad consists of two parts (a command sketch for the second part follows the pom snippet below):

1. Use HFileOutputFormat2 in a hand-written MapReduce job to write HFiles into an HDFS directory. Since data written to HBase must be sorted, HFileOutputFormat2's configureIncrementalLoad() performs the required configuration.

2. Move the HFiles from HDFS into the HBase table, roughly as shown in the figure.

Example pom dependencies:

<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-server</artifactId>
    <version>1.4.0</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop<
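
A hedged sketch of step 2 (the output directory and table name are placeholders; LoadIncrementalHFiles is the bulk-load tool bundled with HBase 1.x):

# Move the HFiles produced by HFileOutputFormat2 into the target table
hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles /tmp/hfile-output my_table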

Creating an HBase table gives "xxxxx is disabled."

Submitted by 一笑奈何 on 2019-12-23 10:08:39
When using the hbase shell, the error "SearchCount is disabled" appears:

hbase(main):002:0> count 'SearchCount'

ERROR: org.apache.hadoop.hbase.DoNotRetryIOException: SearchCount is disabled.

Here is some help for this command:
Count the number of rows in a table. This operation may take a LONG time (Run '$HADOOP_HOME/bin/hadoop jar hbase.jar rowcount' to run a counting mapreduce job). Current count is shown every 1000 rows by default. Count interval may be optionally specified. Scan caching is enabled on count scans by default. Default cache size is 10 rows. If your rows are small in size, you
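
The exception says the table is disabled, so the likely fix (a sketch using standard hbase shell commands) is to enable the table before counting:

hbase(main):003:0> enable 'SearchCount'
hbase(main):004:0> count 'SearchCount'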

The HBase HRegionServer process fails to start (java.lang.RuntimeException: HRegionServer Aborted)

Submitted by 冷暖自知 on 2019-12-23 09:45:58
Error description: after the HBase cluster starts, the HRegionServer processes on the slave nodes fail to start.

Cause: the cluster clocks are not synchronized.

Resolution steps:
1. Check the HBase startup log on the node where the failure occurred. The exception is java.lang.RuntimeException: HRegionServer Aborted, caused by the unsynchronized cluster time. (The original post compared log screenshots of a healthy node and the failing node.)
2. Synchronize the cluster time by running the following command on all three nodes: ntpdate ntp4.aliyun.com. Alternatively, use a crontab scheduled job to sync the nodes that cannot reach the internet against one server that can.
3. Restart the HBase cluster.

Source: CSDN. Author: 辛Lay. Link: https://blog.csdn.net/weixin_38097878/article/details/103659664
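
A hedged sketch of step 2 (the NTP host in the crontab line is a placeholder for your internal time server):

# Run on every node to sync against the Aliyun NTP server
ntpdate ntp4.aliyun.com

# Or, for nodes without internet access, sync every 10 minutes
# against an internal reachable server via crontab:
*/10 * * * * /usr/sbin/ntpdate internal-ntp-host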

Importing data from SQL Server to HBase

Submitted by 一笑奈何 on 2019-12-23 05:17:08
Question: I know that Sqoop allows us to import data from an RDBMS into HDFS. I was wondering whether the SQL Server connector in Sqoop also allows us to import directly into HBase? I know we can do this with MySQL; can the same be done with SQL Server too?

Answer 1: I am working in the Hortonworks Sandbox, and I was able to pull data from a SQL Server instance into an HBase table by doing the following steps: Get the SQL Server JDBC driver onto the Hadoop box. curl -L 'http://download
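
For reference, a hedged sketch of the kind of Sqoop invocation that imports straight into HBase (host, database, credentials, and table/family names are all placeholders; --hbase-table, --column-family, and --hbase-create-table are standard Sqoop import flags):

sqoop import \
    --connect 'jdbc:sqlserver://db-host:1433;databaseName=mydb' \
    --username sqoop_user --password '***' \
    --table customers \
    --hbase-table customers \
    --column-family cf \
    --hbase-create-table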