hbase

Column Value Range Filter in Hbase 0.94

我是研究僧i submitted on 2019-12-23 02:01:12
Question: I want to use a range filter in HBase on more than one column. I know we can use SingleColumnValueFilter with AND/OR conditions, but I want to run the same filter condition against two different columns. Example: my hbase table has rowkey, cf:bidprice, cf:askprice, cf:product. I want to filter all the rows with (cf:bidprice>10 and cf:bidprice<20) or (cf:askprice>10 and cf:askprice<20). Answer 1: I think I figured it out. The code snippet below is an example implementation. byte[] startRow=Bytes.toBytes
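The answer is cut off by the excerpt. For reference, a minimal sketch of the FilterList approach against the 0.94-era client API, using the family and qualifiers from the question. Note that SingleColumnValueFilter compares raw bytes lexicographically, so this only behaves numerically if the prices were written as fixed-width values such as Bytes.toBytes(long):

    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
    import org.apache.hadoop.hbase.filter.FilterList;
    import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
    import org.apache.hadoop.hbase.util.Bytes;

    public static Scan bidOrAskInRange() {
        byte[] cf = Bytes.toBytes("cf");

        // (cf:bidprice > 10 AND cf:bidprice < 20)
        SingleColumnValueFilter bidLow = new SingleColumnValueFilter(
                cf, Bytes.toBytes("bidprice"), CompareOp.GREATER, Bytes.toBytes(10L));
        SingleColumnValueFilter bidHigh = new SingleColumnValueFilter(
                cf, Bytes.toBytes("bidprice"), CompareOp.LESS, Bytes.toBytes(20L));
        bidLow.setFilterIfMissing(true);   // drop rows that lack the column entirely
        bidHigh.setFilterIfMissing(true);
        FilterList bidRange = new FilterList(FilterList.Operator.MUST_PASS_ALL);
        bidRange.addFilter(bidLow);
        bidRange.addFilter(bidHigh);

        // (cf:askprice > 10 AND cf:askprice < 20)
        SingleColumnValueFilter askLow = new SingleColumnValueFilter(
                cf, Bytes.toBytes("askprice"), CompareOp.GREATER, Bytes.toBytes(10L));
        SingleColumnValueFilter askHigh = new SingleColumnValueFilter(
                cf, Bytes.toBytes("askprice"), CompareOp.LESS, Bytes.toBytes(20L));
        askLow.setFilterIfMissing(true);
        askHigh.setFilterIfMissing(true);
        FilterList askRange = new FilterList(FilterList.Operator.MUST_PASS_ALL);
        askRange.addFilter(askLow);
        askRange.addFilter(askHigh);

        // OR the two AND-groups together
        FilterList either = new FilterList(FilterList.Operator.MUST_PASS_ONE);
        either.addFilter(bidRange);
        either.addFilter(askRange);

        Scan scan = new Scan();
        scan.setFilter(either);
        return scan;
    }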

[Hbase] HBase optimization: disabling the WAL, and HFile …

两盒软妹~` submitted on 2019-12-23 01:30:45
1. WAL (write-ahead log): used for disaster recovery. If a server crashes, replaying the log recovers the data that was still in memory and had not yet been flushed to disk; if the write to the WAL fails, the whole operation is treated as failed. Hence the trade-off: with the WAL enabled, write performance drops; with it disabled, data still in memory is lost if the server goes down before the MemStore is flushed to disk. Workaround: disable the WAL and flush the MemStore to disk manually.

    // Disable the WAL for this write
    put.setDurability(Durability.SKIP_WAL)

After the write operations, call a flush on the table in place of the WAL:

    def flushTable(table: String, conf: Configuration): Unit = {
      var connection: Connection = null
      var admin: Admin = null
      connection = ConnectionFactory.createConnection(conf)
      try {
        admin = connection.getAdmin
        // Flush the data from the MemStore to disk
        admin.flush(TableName.valueOf(table))
      } catch {
        case e: Exception => e.printStackTrace()
      } finally {
        if (null != admin) admin.close()
        if (null != connection) connection.close()
      }
    }
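A short usage sketch of the pattern above, assuming the 1.x client API used in the post, an already-open Table handle named table, and placeholder row/column names:

    val put = new Put(Bytes.toBytes("row1"))
    put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v"))
    put.setDurability(Durability.SKIP_WAL) // skip the WAL for this write
    table.put(put)
    // once the batch of writes is done, persist the MemStore manually:
    flushTable("mytable", conf)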

spark-submit no class found - htrace

末鹿安然 submitted on 2019-12-23 01:03:23
Question: I am trying to run the Spark example code HBaseTest from the command line using spark-submit instead of run-example, so that I can learn more about how to run Spark code in general. However, it reports CLASS_NOT_FOUND for htrace, since I am using CDH 5.4. I successfully located the htrace jar file, but I am having a hard time adding it to the path. This is the final spark-submit command I have, but I still get the class-not-found error. Can anyone help me with this?

    #!/bin/bash
    export SPARK_HOME=/opt
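The command itself is cut off above. The usual fix is to hand the htrace jar to spark-submit explicitly, via --jars (shipped to executors) and --driver-class-path (for the driver JVM). A sketch follows; the parcel path, jar version, and examples-jar location are assumptions for a CDH install:

    spark-submit \
      --class org.apache.spark.examples.HBaseTest \
      --master yarn-client \
      --jars /opt/cloudera/parcels/CDH/jars/htrace-core-3.1.0-incubating.jar \
      --driver-class-path /opt/cloudera/parcels/CDH/jars/htrace-core-3.1.0-incubating.jar \
      "$SPARK_HOME/lib/spark-examples.jar" mytable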

My cdh5.2 cluster gets FileNotFoundException when running hbase MR jobs

纵饮孤独 submitted on 2019-12-22 22:25:09
Question: My CDH 5.2 cluster has a problem running HBase MR jobs. For example, I added the HBase classpath into the Hadoop classpath with vi /etc/hadoop/conf/hadoop-env.sh, adding the line:

    export HADOOP_CLASSPATH="/usr/lib/hbase/bin/hbase classpath:$HADOOP_CLASSPATH"

And when I run:

    hadoop jar /usr/lib/hbase/hbase-server-0.98.6-cdh5.2.1.jar rowcounter "mytable"

I get the following exception:

    14/12/09 03:44:02 WARN security.UserGroupInformation: PriviledgedActionException as:root (auth:SIMPLE) cause:java
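One thing worth noting about the snippet as quoted: the export line embeds the literal text /usr/lib/hbase/bin/hbase classpath instead of executing it, so the HBase jars never actually land on the classpath. A sketch of the presumably intended form, using shell command substitution:

    # Run `hbase classpath` and append its output, rather than embedding the literal string
    export HADOOP_CLASSPATH="$(/usr/lib/hbase/bin/hbase classpath):$HADOOP_CLASSPATH"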

Pseudo-distributed HBase installation and basic usage

元气小坏坏 submitted on 2019-12-22 21:18:29
HBase is Hadoop's database and runs on top of Hadoop; it is a NoSQL database. Characteristics: a distributed, multi-version, column-oriented storage model; it supports real-time random reads and writes at large scale, and can also use the local file system directly. Where it does not fit: compared with a relational database the model is simple and the API is small, and it is not a good match for small data sets. Data lives in cells; a cell can hold several versions of a value, distinguished by timestamp.

Installation:

    tar xfz hbase-0.94.18.tar.gz
    cd hbase*
    cd conf
    vi hbase-env.sh
    export JAVA_HOME=/usr/jdk1.6.0_45
    vi hbase-site.xml

    <configuration>
      <property>
        <name>hbase.rootdir</name>
        <value>hdfs://localhost:9000/hbase</value>
        <description>Where the data is stored.</description>
      </property>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
        <description>Replication factor of 1, since this is pseudo-distributed.</description>
      </property>
    </configuration>

With the configuration above done, start Hadoop, cd ..
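The title promises "basic usage" but the excerpt stops at the configuration. A minimal sketch of the typical next steps, assuming the pseudo-distributed setup above ('test', 'cf', and the sample row are placeholder names):

    # start HBase once Hadoop (HDFS) is running
    bin/start-hbase.sh
    # open the interactive shell
    bin/hbase shell
    # inside the shell: create a table with one column family, write and read a cell
    create 'test', 'cf'
    put 'test', 'row1', 'cf:a', 'value1'
    get 'test', 'row1'
    scan 'test'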

Setting up ZooKeeper and HBase clusters on a Hadoop cluster

一世执手 submitted on 2019-12-22 20:17:54
Contents: 1. Software versions and system environment; 2. ZooKeeper installation: (1) upload the ZooKeeper archive with Xftp and unpack it, (2) enter the unpacked directory and edit the configuration under conf, (3) configure the environment variables, (4) start ZooKeeper; 3. HBase cluster setup: (1) upload the HBase archive with Xftp and unpack it, (2) configure the environment variables, (3) start HBase.

1. Software versions and system environment: ① HBase 1.2.0 (download link) ② ZooKeeper 3.4.5 (download link)

2. ZooKeeper installation. (1) Upload the ZooKeeper archive with Xftp and unpack it:

    cd /opt/soft
    ls
    tar -zxvf zookeeper-3.4.5-cdh5.14.2.tar.gz
    mv zookeeper-3.4.5-cdh5.14.2 zookeeper345

(2) Enter the unpacked directory and edit the configuration under conf:

    cd zookeeper345/
    cd conf
    ls
    cp zoo_sample.cfg zoo.cfg
    vi zoo.cfg

Set the data directory, and add ZooKeeper's data-exchange port and leader-election port:

    dataDir=/opt/soft/zookeeper345/data
    server.1=192.168.56.122:2287:3387

Create the data folder [
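The excerpt stops right at "create the data folder". A sketch of the usual follow-up for this node, assuming the dataDir and server.1 settings above (the number written into myid must match the server.N id):

    mkdir -p /opt/soft/zookeeper345/data
    echo 1 > /opt/soft/zookeeper345/data/myid   # this node is server.1
    /opt/soft/zookeeper345/bin/zkServer.sh start
    /opt/soft/zookeeper345/bin/zkServer.sh status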

Error starting Hive: java.lang.NoClassDefFoundError: org/apache/hadoop/hive/conf/HiveConf

末鹿安然 submitted on 2019-12-22 17:47:30
Question: I have downloaded the latest stable release of Hive; when I start /usr/local/hive/bin/hive it gives me this error:

    Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hive/conf/HiveConf
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:264)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
    Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java
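The excerpt stops inside the stack trace and no answer is included. HiveConf lives in hive-common-*.jar under $HIVE_HOME/lib, so this error usually means the Hive jars are not on the classpath of the launching script, commonly because HIVE_HOME or HADOOP_HOME is not set for a tarball install. A hedged sketch of the environment to check (paths are assumptions):

    export HIVE_HOME=/usr/local/hive
    export PATH="$HIVE_HOME/bin:$PATH"
    export HADOOP_HOME=/usr/local/hadoop   # hive is launched through Hadoop's RunJar
    # verify the jar that actually provides HiveConf is present
    ls "$HIVE_HOME"/lib/hive-common-*.jar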

Migrate java code from hbase 0.92 to 0.98.0-hadoop2

会有一股神秘感。 submitted on 2019-12-22 13:08:11
Question: I have some code written against HBase 0.92:

    /**
     * Writes the given scan into a Base64 encoded string.
     *
     * @param scan The scan to write out.
     * @return The scan saved in a Base64 encoded string.
     * @throws IOException When writing the scan fails.
     */
    public static String convertScanToString(Scan scan) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        DataOutputStream dos = new DataOutputStream(out);
        scan.write(dos);
        return Base64.encodeBytes(out.toByteArray());
    }
    /** *
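The answer is not included in the excerpt. In 0.96+ Scan no longer implements Writable, so scan.write(dos) is gone; the serialization goes through protobuf instead, which is also what TableMapReduceUtil.convertScanToString does in those versions. A sketch of the equivalent helper against the 0.98 client:

    import java.io.IOException;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.protobuf.ProtobufUtil;
    import org.apache.hadoop.hbase.util.Base64;

    /** Writes the given scan into a Base64 encoded string (0.98 version). */
    public static String convertScanToString(Scan scan) throws IOException {
        // The Scan is converted to its protobuf message, then Base64-encoded,
        // matching the contract of the old Writable-based 0.92 helper.
        return Base64.encodeBytes(ProtobufUtil.toScan(scan).toByteArray());
    }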

Spark serialization error: When I insert Spark Stream data into HBase

主宰稳场 submitted on 2019-12-22 10:03:54
Question: I'm confused about how Spark interacts with HBase in terms of data format. For instance, when I omit the 'ERROR' line in the following code snippet, it runs well... but when I add the line, I get an error saying 'Task not serializable', i.e. a serialization issue. How do I change the code, and why does the error happen? My code is the following:

    // HBase
    Configuration hconfig = HBaseConfiguration.create();
    hconfig.set("hbase.zookeeper.property.clientPort", "2222");
    hconfig.set
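The excerpt is truncated, but 'Task not serializable' in this setting typically means an object that cannot be serialized (the Configuration, a connection, a table handle) was created on the driver and then captured by a closure that Spark ships to the executors. A common workaround, sketched below against the HBase 1.x client API with hypothetical names (lines is assumed to be a JavaDStream<String> of row keys), is to create the HBase client objects inside foreachPartition so they never leave the executor:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    lines.foreachRDD(rdd -> rdd.foreachPartition(rows -> {
        // Built on the executor, so nothing non-serializable crosses the closure boundary
        Configuration hconfig = HBaseConfiguration.create();
        hconfig.set("hbase.zookeeper.property.clientPort", "2222");
        try (Connection conn = ConnectionFactory.createConnection(hconfig);
             Table table = conn.getTable(TableName.valueOf("mytable"))) {
            while (rows.hasNext()) {
                byte[] row = Bytes.toBytes(rows.next());
                Put put = new Put(row);
                put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), row);
                table.put(put);  // one Put per record; a BufferedMutator could batch these
            }
        }
    }));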