hbase | 易学教程

How can I suppress INFO logs in an HBase client application?

Question: I'm writing a Java console application that accesses HBase, and I can't figure out how to get rid of all the annoying INFO messages:
13/05/24 11:01:12 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
13/05/24 11:01:12 INFO zookeeper.ZooKeeper: Client environment:host.name=10.1.0.110
13/05/24 11:01:12 INFO zookeeper.ZooKeeper: Client environment:java.version=1.7.0_15
13/05/24 11:01:12 INFO zookeeper.ZooKeeper: Client environment:java

Getting Started with Apache Kylin

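This article introduces Apache Kylin in detail from four angles: concepts, working mechanism, data backup, and strengths and weaknesses.

Introduction to Apache Kylin
1. Apache Kylin is an open-source distributed preprocessing engine for massive data sets. Through an ANSI SQL interface, it provides multidimensional analysis (OLAP) over very large Hadoop-based data sets (TB to PB scale).
2. Kylin enables sub-second query latency on very large data sets.
   1) Identify a star-schema data set on Hadoop.
   2) Build a data cube.
   3) Query with sub-second latency through interfaces such as ODBC, JDBC, and the RESTful API

Core concepts in Apache Kylin
1. Table: tables are defined in Hive and are the data source for a data cube. They must be synchronized into Kylin before a cube is built.
2. Model: a model describes a star-schema data structure. It defines the join and filter relationships between one fact table and multiple lookup tables.
3. Cube: a cube defines the model it uses, the dimensions of the tables in that model, the measures (typically aggregate functions such as sum, count, and average), how segments are partitioned, and how segments are merged (segments auto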
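To make the idea of a cube with dimensions and measures concrete, here is a minimal sketch in plain Java. It is not Kylin code: it assumes a tiny in-memory fact table with two hypothetical dimensions (`city`, `year`) and one hypothetical measure (a sales amount), and it precomputes a `sum` for every combination of dimensions so that a later lookup is a single map access. The class name `CubeSketch` and all data values are made up for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy illustration of cube precomputation; not Kylin's implementation.
class CubeSketch {

    // Precompute SUM(measure) for every subset of dimensions.
    // Keys look like "city=Beijing;year=2019;"; the empty key "" is the grand total.
    static Map<String, Long> buildCube(String[] dimNames, String[][] rows, long[] measure) {
        Map<String, Long> cube = new TreeMap<>();
        int n = dimNames.length;
        for (int mask = 0; mask < (1 << n); mask++) {       // one bit mask per dimension subset
            for (int r = 0; r < rows.length; r++) {
                StringBuilder key = new StringBuilder();
                for (int d = 0; d < n; d++) {
                    if ((mask & (1 << d)) != 0) {           // dimension d is part of this subset
                        key.append(dimNames[d]).append('=').append(rows[r][d]).append(';');
                    }
                }
                cube.merge(key.toString(), measure[r], Long::sum);
            }
        }
        return cube;
    }

    public static void main(String[] args) {
        String[] dimNames = {"city", "year"};               // hypothetical dimensions
        String[][] rows = {                                  // hypothetical fact rows
            {"Beijing", "2019"}, {"Beijing", "2020"}, {"Shanghai", "2019"}
        };
        long[] sales = {10, 20, 5};                          // hypothetical measure values

        Map<String, Long> cube = buildCube(dimNames, rows, sales);
        System.out.println(cube.get("city=Beijing;"));      // total for one city
        System.out.println(cube.get(""));                   // grand total
    }
}
```

The trade-off this illustrates is storage for latency: the number of precomputed combinations grows exponentially with the number of dimensions, while each aggregate query becomes a lookup.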

Get output from scans in hbase shell

Question: Is there any way I can output the results from a scan in the hbase shell to a file? I'm assuming this is easy, but I haven't been able to find anything in the documentation.

Answer 1: I know that this post is quite old, but I was searching for something about HBase myself and came across it. I don't know if this is the best way to do it, but you can definitely use the scripting option HBase gives you. Just open a shell (preferably in HBase's bin directory) and run echo "scan 'foo'" | .

How to connect to remote HBase in Java?

Question: I have a standalone HBase server. This is my hbase-site.xml:
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///hbase_data</value>
  </property>
</configuration>
I am trying to write a Java program to manipulate the data in HBase. If I run the program on the HBase server, it works fine, but I don't know how to configure it for remote access.
Configuration config = HBaseConfiguration.create();
HTable table = new HTable(config, "test");
Scan s = new Scan();
I have tried adding

Hbase auto increment any column/row-key

Question: I am new to HBase. Is it possible to auto-increment the row key in HBase, and if so, how (so that each insert gets the next row key automatically)? Or is it possible to auto-increment some other column (so that each insert increases that column by 1)?

Answer 1: Monotonically increasing row keys are not recommended in HBase; see this for reference: http://hbase.apache.org/book/rowkey.design.html, p.6.3.2. In fact, using globally ordered row keys would cause all instances of your
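One commonly described way to keep a sequence number in the row key without making the key monotonically increasing is to prefix it with a small "salt" bucket. The sketch below is plain Java, not taken from the answer above and not an HBase API; the bucket count of 4, the round-robin bucket choice, and the key layout are all assumptions made for illustration.

```java
// Toy illustration of salted row keys; the layout is hypothetical.
class SaltedRowKeySketch {

    static final long BUCKETS = 4; // hypothetical number of salt buckets

    // Turn a sequence number into "<bucket>-<zero-padded sequence>".
    // Consecutive sequence numbers land in different buckets, so their keys
    // do not sort next to each other.
    static String saltedKey(long sequence) {
        long bucket = Math.floorMod(sequence, BUCKETS);
        return bucket + "-" + String.format("%010d", sequence);
    }

    public static void main(String[] args) {
        for (long seq = 1; seq <= 5; seq++) {
            System.out.println(saltedKey(seq));
        }
    }
}
```

The cost of this layout is on the read side: a scan over a range of sequence numbers has to be issued once per bucket and the results merged.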

org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server is not running yet

The following error was reported while performing Java API operations on HBase:
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=32, exceptions:
Tue Dec 17 10:21:47 CST 2019, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=68473: row 'myUser,,' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hadoop01,60020,1576455547324, seqNum=0
Caused by: org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server is not running yet
Error details:
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=32,

CDH4 Hbase using Pig ERROR 2998 java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/filter/Filter

Question: I am using CDH4 in pseudo-distributed mode and I am having trouble getting HBase and Pig to work together (both work fine on their own). I am following this nice tutorial step by step: http://blog.whitepages.com/2011/10/27/hbase-storage-and-pig/ So my Pig script looks like this:
register /usr/lib/zookeeper/zookeeper-3.4.3-cdh4.1.2.jar
register /usr/lib/hbase/hbase-0.92.1-cdh4.1.2-security.jar
register /usr/lib/hbase/lib/guava-11.0.2.jar
raw_data = LOAD 'input.csv' USING PigStorage( ',' ) AS (

HBase's Main Operating Mechanisms

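Physical storage in HBase

All rows in an HBase table are sorted lexicographically by row key. A table can contain a very large number of rows, sometimes hundreds of millions, so it has to be distributed across multiple servers. When a table has too many rows, HBase therefore partitions them by row key value; each row range forms a Region, which holds all the data whose row keys fall within that range, as shown in Figure 1.

Figure 1: HBase Region storage model

Regions are split by size. By default each table starts with a single Region. As data is inserted the Region grows, and once it reaches a threshold it is split into two new Regions of roughly equal size. As the number of rows in the table keeps growing, there are more and more Regions, as shown in Figure 2.

Figure 2: HBase Region splitting

A Region is the smallest unit of data distribution and load balancing in HBase, nominally 100 MB to 200 MB in size (in current HBase releases the split threshold, hbase.hregion.max.filesize, defaults to 10 GB). Different Regions can be placed on different Region Servers, but a single Region is never split across multiple Region Servers. Each Region Server manages a set of Regions, as shown in Figure 3.

Figure 3: HBase Region distribution model

A Region is the smallest unit of data distribution across Region Servers, but it is not the smallest unit of storage. In fact, each Region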
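The splitting behaviour described above can be sketched with a small in-memory model. This is a toy in plain Java, not HBase code: it assumes a hypothetical threshold expressed as a row count (real HBase decides on stored data size), starts from one Region covering the whole key space, and splits a Region at its middle row key once it grows past the threshold. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of range-partitioned Regions that split when they grow too large.
class RegionSplitSketch {

    private final int maxRowsPerRegion; // hypothetical threshold, counted in rows

    // Regions indexed by their start row key; "" is the start of the key space.
    private final TreeMap<String, TreeMap<String, String>> regions = new TreeMap<>();

    RegionSplitSketch(int maxRowsPerRegion) {
        this.maxRowsPerRegion = maxRowsPerRegion;
        regions.put("", new TreeMap<>()); // this sketch starts with one Region
    }

    void put(String rowKey, String value) {
        // The owning Region is the one with the greatest start key <= rowKey.
        Map.Entry<String, TreeMap<String, String>> owner = regions.floorEntry(rowKey);
        TreeMap<String, String> region = owner.getValue();
        region.put(rowKey, value);
        if (region.size() > maxRowsPerRegion) {
            split(region);
        }
    }

    // Move the upper half of the rows into a new Region starting at the middle key.
    private void split(TreeMap<String, String> region) {
        List<String> keys = new ArrayList<>(region.keySet());
        String midKey = keys.get(keys.size() / 2);
        TreeMap<String, String> upper = new TreeMap<>(region.tailMap(midKey, true));
        region.tailMap(midKey, true).clear();
        regions.put(midKey, upper);
    }

    int regionCount() {
        return regions.size();
    }

    List<String> startKeys() {
        return new ArrayList<>(regions.keySet());
    }

    public static void main(String[] args) {
        RegionSplitSketch table = new RegionSplitSketch(4);
        for (int i = 1; i <= 10; i++) {
            table.put(String.format("row%02d", i), "v" + i);
        }
        System.out.println(table.regionCount() + " regions, start keys " + table.startKeys());
    }
}
```

Because every lookup goes through `floorEntry`, each row key maps to exactly one Region, which mirrors the point that a Region is never spread across servers.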

Basic HBase Operations

1. A column family must be specified when creating a table (student is the table, info is the column family):
   create 'student', 'info'
2. Add data to the table (1001 is the row key):
   put 'student', '1001', 'info:name', 'x'
   put 'student', '1001', 'info:sex', 'male'
   put 'student', '1001', 'info:age', 18
   put 'student', '1002', 'info:name', 'y'
   put 'student', '1002', 'info:sex', 'female'
   put 'student', '1003', 'info:name', 'z'
   put 'student', '1003', 'info:sex', 'male'
3. View all data in the table:
   scan 'student'
4. View data in a given row range (start row inclusive, stop row exclusive):
   scan 'student', {STARTROW => '1001', STOPROW => '1003'}
6. Get a specific column of a given row:
   get 'student', '1001', 'info:name'
7. Delete all data for row 1003:
   deleteall 'student', '1003'
8.
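The semantics of these commands, in particular the half-open row range of a scan, can be mimicked with a sorted map. The following is a toy model in plain Java, not the HBase client API; the class `ScanRangeSketch` and its methods are hypothetical names chosen to mirror the shell commands above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy in-memory table: row key -> (column -> value), rows kept in sorted order.
class ScanRangeSketch {

    private final TreeMap<String, TreeMap<String, String>> rows = new TreeMap<>();

    // Like: put 'student', row, column, value
    void put(String row, String column, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>()).put(column, value);
    }

    // Like: get 'student', row, column (returns null if the cell is absent)
    String get(String row, String column) {
        TreeMap<String, String> columns = rows.get(row);
        return columns == null ? null : columns.get(column);
    }

    // Like: scan with STARTROW (inclusive) and STOPROW (exclusive); returns matching row keys
    List<String> scan(String startRow, String stopRow) {
        return new ArrayList<>(rows.subMap(startRow, true, stopRow, false).keySet());
    }

    // Like: deleteall 'student', row
    void deleteAll(String row) {
        rows.remove(row);
    }

    public static void main(String[] args) {
        ScanRangeSketch student = new ScanRangeSketch();
        student.put("1001", "info:name", "x");
        student.put("1002", "info:name", "y");
        student.put("1003", "info:name", "z");

        // The stop row itself is not part of the result.
        System.out.println(student.scan("1001", "1003"));
        System.out.println(student.get("1001", "info:name"));
    }
}
```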

Commonly Used HBase Java APIs

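Commonly used HBase Java APIs

HBase operations fall into five main categories: HBase configuration, table management, column family management, column management, and data operations.

1) org.apache.hadoop.hbase.HBaseConfiguration
The HBaseConfiguration class manages HBase configuration information. For example:
static Configuration cfg = HBaseConfiguration.create();

2) org.apache.hadoop.hbase.client.Admin
Admin is a Java interface, so it cannot be instantiated directly. Instead, obtain an instance by calling Connection.getAdmin() and invoke methods on the returned object. The interface manages table information in an HBase database. Its methods cover creating tables, deleting tables, listing tables, enabling or disabling tables, and adding or removing column families. An example of creating a table:
Configuration configuration = HBaseConfiguration.create();
Connection connection = ConnectionFactory.createConnection(configuration);
Admin admin = connection.getAdmin();
if(admin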
