hbase

Spark connection to HBase in Kerberos environment failing

送分小仙女 · Submitted on 2019-12-24 18:44:54
Question: I am using Spark 1.6.0 (spark-1.2.0-cdh5.10.2) on a Cloudera VM, HBase (1.2.0 from Cloudera), Scala 2.10, with Kerberos enabled. The steps I am running are:

1. kinit (so that my user ticket will be in place)
2. spark-shell --master yarn --executor-memory 256m --jars /opt/cloudera/parcels/CDH/lib/hbase/lib/hbase-spark-1.2.0-cdh5.10.2.jar
3. In the shell:

```
import org.apache.hadoop.hbase.spark.HBaseContext
import org.apache.spark.SparkContext
import org.apache.hadoop.hbase.{ CellUtil, TableName,
```
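A frequent cause of failures like this is that the process talking to HBase has no valid Kerberos credentials at the moment the connection is opened, because the kinit ticket cache is not visible to it. As a point of comparison, here is a minimal Java sketch that logs in explicitly from a keytab before connecting; the principal, keytab path, and quorum host are hypothetical placeholders, not values from the question.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.security.UserGroupInformation;

public class KerberosHBaseLogin {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Hypothetical values; take the real ones from the cluster's hbase-site.xml.
        conf.set("hbase.zookeeper.quorum", "quickstart.cloudera");
        conf.set("hadoop.security.authentication", "kerberos");
        conf.set("hbase.security.authentication", "kerberos");

        // Log in from a keytab instead of relying on the kinit ticket cache,
        // so the login survives long-running and non-interactive contexts.
        UserGroupInformation.setConfiguration(conf);
        UserGroupInformation.loginUserFromKeytab("user@EXAMPLE.COM", "/home/user/user.keytab");

        try (Connection connection = ConnectionFactory.createConnection(conf)) {
            System.out.println("Connected as: " + UserGroupInformation.getLoginUser());
        }
    }
}
```

With spark-shell on YARN specifically, the usual equivalent is passing --principal and --keytab so the executors can obtain and renew credentials themselves.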

Continuous data migration from MySQL to HBase

好久不见 · Submitted on 2019-12-24 18:05:12
Question: I have installed Hadoop and HBase for real-time analytics purposes. The problem I face is migrating data online from MySQL to HBase. The Sqoop tool is useful for bulk data migration, but is there any way that data can be transferred from MySQL to HBase online (then and there, whenever an insert/update/delete happens), so that real-time analytics can be achieved, not near-real-time? Please help me in this regard.

Answer 1: I think you are facing the task of setting up replication between different DBMSs. It
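The answer is cut off, but short of true change-data-capture (reading the MySQL binlog, which is what replication-style tooling does), a simple stop-gap is to poll MySQL for rows modified since the last run and mirror them into HBase as Puts. A rough Java sketch under that assumption follows; the table schema, column family, and connection strings are all hypothetical.

```java
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class MysqlToHBasePoller {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection hbase = ConnectionFactory.createConnection(conf);
             Table table = hbase.getTable(TableName.valueOf("orders"));
             java.sql.Connection mysql = DriverManager.getConnection(
                     "jdbc:mysql://localhost:3306/shop", "user", "password")) {

            long lastRun = System.currentTimeMillis() - 60_000L; // watermark from the previous poll
            PreparedStatement ps = mysql.prepareStatement(
                    "SELECT id, customer, amount FROM orders WHERE updated_at > ?");
            ps.setTimestamp(1, new java.sql.Timestamp(lastRun));

            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    // Row key = MySQL primary key; column family "cf" assumed to exist.
                    Put put = new Put(Bytes.toBytes(rs.getString("id")));
                    put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("customer"),
                            Bytes.toBytes(rs.getString("customer")));
                    put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("amount"),
                            Bytes.toBytes(rs.getString("amount")));
                    table.put(put);
                }
            }
        }
    }
}
```

Note the limitation: polling an updated_at column can never observe deletes, so genuinely online replication ends up reading the binlog one way or another.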

How does Hive store the data (loaded from HDFS)?

折月煮酒 · Submitted on 2019-12-24 17:24:49
Question: I am fairly new to Hadoop (HDFS and HBase) and the Hadoop ecosystem (Hive, Pig, Impala, etc.). I have a good understanding of Hadoop components such as the NameNode, DataNode, JobTracker, and TaskTracker, and how they work in tandem to store data efficiently. While trying to understand the fundamentals of a data-access layer such as Hive, I need to understand where exactly a table's data (created in Hive) gets stored. We can create external and internal tables in Hive. As external tables can
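One concrete way to see the answer for any particular table is to ask Hive directly: DESCRIBE FORMATTED reports the HDFS path backing the table (by default somewhere under the warehouse directory, e.g. /user/hive/warehouse). Below is a minimal sketch using the Hive JDBC driver, assuming a HiveServer2 on its default port and a hypothetical table name.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveTableLocation {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:hive2://localhost:10000/default");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("DESCRIBE FORMATTED mytable")) {
            while (rs.next()) {
                // DESCRIBE FORMATTED emits a "Location:" row whose second
                // column holds the HDFS directory storing the table's files.
                String name = rs.getString(1);
                if (name != null && name.trim().startsWith("Location")) {
                    System.out.println("Table data lives at: " + rs.getString(2));
                }
            }
        }
    }
}
```

For an internal (managed) table this points under the warehouse directory; for an external table it is whatever location the CREATE EXTERNAL TABLE statement named.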

Installing HBase

旧街凉风 · Submitted on 2019-12-24 15:29:20
Lab environment: Linux Ubuntu 14.04, jdk-7u75-linux-x64, hadoop-2.6.0-cdh5.4.5

Lab content: Install and configure HBase on top of an already-installed Hadoop environment.

Lab steps:

1. First, on the local Linux machine, create the /data/hbase1 directory to hold the required files.

```
mkdir -p /data/hbase1
```

Change into /data/hbase1 and use wget to download the HBase installation package hbase-1.0.0-cdh5.4.5.tar.gz.

```
cd /data/hbase1
wget http://192.168.1.100:60000/allfiles/hbase1/hbase-1.0.0-cdh5.4.5.tar.gz
```

2. Extract the HBase package hbase-1.0.0-cdh5.4.5.tar.gz from /data/hbase1 into the /apps directory.

```
tar -xzvf /data/hbase1/hbase-1.0.0-cdh5.4.5.tar.gz -C /apps
```

Then switch to /apps and rename /apps/hbase-1.0.0-cdh5.4.5/ to hbase.

```
cd /apps
mv /apps/hbase-1.0.0-cdh5.4.5/ /apps/hbase
```

3. Add the HBase environment variables. First open the user environment variable file with vim.

Not able to connect to remote HBase

不打扰是莪最后的温柔 · Submitted on 2019-12-24 12:57:13
Question: I have an HBase installation in distributed mode. The database is working fine, and I am able to connect to it if my webapp (Spring + DataNucleus JDO) is deployed on the same machine as the HBase master. But if I run the same webapp on a different machine, I am not able to connect to the HBase server. There are no exceptions at all; the webapp just stalls and times out after a few minutes. My config files are as follows: hbase-site.xml ->

```
<configuration>
  <property>
    <name>hbase
```
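Silent stalls followed by a timeout, with no exception, very often mean the remote client cannot reach (or cannot resolve) the ZooKeeper quorum or the hostnames the region servers registered with. A bare-bones Java client that sets the quorum explicitly and fails fast can isolate this; the hostname below is a hypothetical placeholder.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class RemoteHBaseCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Must match what the cluster advertises; an unresolvable hostname
        // here typically hangs until the retry budget is exhausted.
        conf.set("hbase.zookeeper.quorum", "hbase-master.example.com");
        conf.set("hbase.zookeeper.property.clientPort", "2181");
        // Fail fast instead of stalling for minutes.
        conf.set("hbase.client.retries.number", "3");
        conf.set("zookeeper.recovery.retry", "1");

        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin()) {
            System.out.println("Tables: "
                    + java.util.Arrays.toString(admin.listTableNames()));
        }
    }
}
```

If this also hangs from the remote machine, check that the master and region servers register hostnames the client can resolve (e.g. entries in /etc/hosts); mismatched name resolution is a classic cause of exactly this symptom.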

HBase in fully distributed mode

て烟熏妆下的殇ゞ · Submitted on 2019-12-24 11:24:42
0. Install Hadoop and the JDK.
1. Download the HBase package from the official website.
2. Extract it under /soft/ and create a symbolic link: ln -s hbase-xxx hbase
3. Configure the environment variables:

```
vi /etc/environment
HBASE_HOME=/soft/hbase
path=.....:/soft/hbase
```

4. Configure /soft/hbase/conf/hbase-site.xml:

```
<!-- HBase root directory on HDFS -->
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://s101:8020/hbase</value>
</property>
<!-- Whether to run in cluster (distributed) mode -->
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
<!-- ZooKeeper data directory (the property name is case-sensitive: dataDir) -->
<property>
  <name>hbase.zookeeper.property.dataDir</name>
  <value>/home/hadoop/hive/master/zk</value>
</property>
<!-- ZooKeeper quorum hosts -->
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>s102,s103,s104</value>
</property>
```

HBase Java development (querying data from a table)

与世无争的帅哥 · Submitted on 2019-12-24 09:16:46
Step 1: Create a Maven project and add the jar dependencies.

```
<repositories>
  <repository>
    <id>cloudera</id>
    <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
  </repository>
</repositories>

<dependencies>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.6.0-mr1-cdh5.14.0</version>
  </dependency>
  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-client</artifactId>
    <version>1.2.0-cdh5.14.0</version>
  </dependency>
  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-server</artifactId>
    <version>1.2.0-cdh5.14.0</version>
  </dependency>
```
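The excerpt stops at the dependency list, so here is a minimal sketch of the "query data from a table" step the title refers to, using the hbase-client API declared above; the quorum hosts, table, row key, and column names are hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class QueryTableDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "node01,node02,node03"); // hypothetical hosts

        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("myuser"))) {

            // Point lookup by row key.
            Result row = table.get(new Get(Bytes.toBytes("0001")));
            System.out.println(Bytes.toString(
                    row.getValue(Bytes.toBytes("f1"), Bytes.toBytes("name"))));

            // Scan; in practice bound it with start/stop rows or filters.
            try (ResultScanner scanner = table.getScanner(new Scan())) {
                for (Result r : scanner) {
                    System.out.println(Bytes.toString(r.getRow()));
                }
            }
        }
    }
}
```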

HBase setup configuration: HMaster is not running

风格不统一 · Submitted on 2019-12-24 09:04:04
Question: I am trying to set up HBase in fully distributed mode, consisting of 1 master and 2 region servers. I have set HBASE_MANAGES_ZK = true in hbase-env.sh. The Hadoop cluster is running with the following layout:

Master: node-master
RegionServer1: node1
RegionServer2: node2

When I start HBase, I can see that the RegionServers start, and HQuorumPeer on the master as well, but HMaster is not showing. Please find the logs below. Master hbase-site.xml:

<configuration>

HBase commands not working in script executed via crontab

烈酒焚心 · Submitted on 2019-12-24 08:31:18
Question: I was trying to list the set of tables present in HBase using the script below:

```
#!/bin/bash
/home/user/hbase-1.2.4/bin/hbase shell << eof > /home/user/myfile.txt
list 'RAW_5_.*'
eof
```

I am able to get the table list when I run the script in a bash terminal with sh script.sh, but it creates a 0 KB file when run via crontab. I have given the absolute path for hbase. Can anyone help with this bottleneck, please?

Answer 1: Since it is executing properly from the terminal and not in

Exception while doing HBase scan

心不动则不痛 · Submitted on 2019-12-24 07:59:22
Question: I was trying out the HBase Spark distributed scan example. My simple code looks like this:

```java
public class DistributedHBaseScanToRddDemo {
    public static void main(String[] args) {
        JavaSparkContext jsc = getJavaSparkContext("hbasetable1");
        Configuration hbaseConf = getHbaseConf(0, "", "");
        JavaHBaseContext javaHbaseContext = new JavaHBaseContext(jsc, hbaseConf);

        Scan scan = new Scan();
        scan.setCaching(100);

        JavaRDD<Tuple2<ImmutableBytesWritable, Result>> javaRdd = javaHbaseContext.hbaseRDD(TableName
```
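The excerpt cuts off mid-call; going by the hbase-spark JavaHBaseContext.hbaseRDD(TableName, Scan) signature, the truncated line presumably finishes along these lines (the table name is inferred from the getJavaSparkContext argument above, so this completion is an assumption):

```java
// Hypothetical completion of the truncated call:
JavaRDD<Tuple2<ImmutableBytesWritable, Result>> javaRdd =
        javaHbaseContext.hbaseRDD(TableName.valueOf("hbasetable1"), scan);
```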