hbase

Hands-on with the HBase shell

ぐ巨炮叔叔 submitted on 2020-01-23 18:58:32
1: Start the Hadoop cluster. 2: Start the HBase cluster. 3: Run hbase shell. How to learn the shell: 1: use help to see the commands HBase supports; typing help 'xxxxx' shows how that command is used. 2: press the Tab key to auto-complete commands. 3: learn from the official site; there is no need to buy any book, the official site is the best resource: http://hbase.apache.org. Getting started: COMMAND GROUPS: Group name: general Commands: status, table_help, version, whoami Group name: ddl Commands: alter, alter_async, alter_status, create, describe, disable, disable_all, drop, drop_all, enable, enable_all, exists, get_table, is_disabled, is_enabled, list, show_filters Group name: namespace Commands: alter_namespace, create_namespace, describe_namespace, drop_namespace, list_namespace, list_namespace_tables Group
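The shell commands listed above map onto the HBase Java client API. Below is a minimal sketch, not from the original post, assuming an HBase 2.x client and a local ZooKeeper quorum, of the Java equivalents of the shell's status, create and list commands; the table name t1 and column family cf1 are made-up examples.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.TableDescriptorBuilder;

    public class ShellEquivalents {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            conf.set("hbase.zookeeper.quorum", "localhost"); // assumed quorum
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Admin admin = conn.getAdmin()) {
                // shell: status
                System.out.println(admin.getClusterMetrics());
                // shell: create 't1', 'cf1'
                TableName t1 = TableName.valueOf("t1");
                if (!admin.tableExists(t1)) {
                    admin.createTable(TableDescriptorBuilder.newBuilder(t1)
                            .setColumnFamily(ColumnFamilyDescriptorBuilder.of("cf1"))
                            .build());
                }
                // shell: list
                for (TableName name : admin.listTableNames()) {
                    System.out.println(name);
                }
            }
        }
    }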

Cell versioning with Cassandra

◇◆丶佛笑我妖孽 submitted on 2020-01-23 11:49:32
Question: My application uses an AbstractFactory for the DAO layer, so once the HBase DAO family has been implemented it would be very useful for me to create the Cassandra DAO family and compare the two from several points of view. However, while trying to do that, I saw that Cassandra doesn't support cell versioning the way HBase does (and my application makes heavy use of it), so I was wondering whether there is some table-design trick (or something else) to "emulate" this behaviour in Cassandra. Answer 1: One common
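The answer above is cut off; a common table-design trick for this (my sketch, not the text of the original answer) is to make the write timestamp a clustering column, so every write becomes a new row and "the latest N versions of a cell" becomes an ordered LIMIT query. The sketch below uses the DataStax Java driver 4.x; the keyspace ks, table versioned and the column names are made-up examples.

    import com.datastax.oss.driver.api.core.CqlSession;
    import com.datastax.oss.driver.api.core.cql.ResultSet;
    import com.datastax.oss.driver.api.core.cql.Row;

    public class VersionedCellDao {
        public static void main(String[] args) {
            // With no explicit contact point the driver connects to 127.0.0.1:9042.
            try (CqlSession session = CqlSession.builder().build()) {
                session.execute("CREATE KEYSPACE IF NOT EXISTS ks WITH replication = "
                    + "{'class': 'SimpleStrategy', 'replication_factor': 1}");
                // One partition per (row key, column name); versions ordered newest first.
                session.execute("CREATE TABLE IF NOT EXISTS ks.versioned ("
                    + " row_key text, col_name text, write_time timestamp, value blob,"
                    + " PRIMARY KEY ((row_key, col_name), write_time))"
                    + " WITH CLUSTERING ORDER BY (write_time DESC)");

                // Each write adds a new version instead of overwriting the cell.
                session.execute("INSERT INTO ks.versioned (row_key, col_name, write_time, value)"
                    + " VALUES ('r1', 'cf:q1', toTimestamp(now()), textAsBlob('v1'))");

                // Latest 3 versions of the cell, roughly what Get#readVersions(3) gives in HBase.
                ResultSet rs = session.execute("SELECT write_time, value FROM ks.versioned"
                    + " WHERE row_key = 'r1' AND col_name = 'cf:q1' LIMIT 3");
                for (Row r : rs) {
                    System.out.println(r.getInstant("write_time") + " -> " + r.getObject("value"));
                }
            }
        }
    }

Unlike HBase's built-in VERSIONS setting, old versions are not trimmed automatically in this layout; you would cap them with a TTL or a periodic cleanup job.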

HBase Mapreduce Dependency Issue when using TableMapper

空扰寡人 submitted on 2020-01-23 11:41:46
Question: I am using CDH 5.3 and I am trying to write a MapReduce program that scans a table and does some processing. I have created a mapper that extends TableMapper, and the exception I am getting is: java.io.FileNotFoundException: File does not exist: hdfs://localhost:54310/usr/local/hadoop-2.5-cdh-3.0/share/hadoop/common/lib/protobuf-java-2.5.0.jar at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1093) at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall
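This exception usually means the job is looking for a dependency jar (here protobuf-java) at a local-filesystem path that gets resolved against HDFS. A common fix is to let TableMapReduceUtil ship the HBase dependency jars with the job via the distributed cache. Below is a hedged driver sketch, not the asker's code; the job name, the table name my_table and the trivial mapper are placeholders.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

    public class ScanDriver {

        // Placeholder mapper: emits each row key it sees.
        static class MyMapper extends TableMapper<Text, NullWritable> {
            @Override
            protected void map(ImmutableBytesWritable rowKey, Result result, Context context)
                    throws java.io.IOException, InterruptedException {
                context.write(new Text(rowKey.get()), NullWritable.get());
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            Job job = Job.getInstance(conf, "scan-my-table");
            job.setJarByClass(ScanDriver.class);

            Scan scan = new Scan();
            scan.setCaching(500);        // larger scanner batches for MapReduce
            scan.setCacheBlocks(false);  // don't pollute the block cache

            // The trailing "true" asks TableMapReduceUtil to add the HBase and
            // protobuf jars to the job's distributed cache instead of expecting
            // them at an HDFS path.
            TableMapReduceUtil.initTableMapperJob(
                    "my_table", scan, MyMapper.class,
                    Text.class, NullWritable.class, job, true);

            job.setOutputFormatClass(NullOutputFormat.class);
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }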

How to filter out rows with a given column (not null)?

旧时模样 submitted on 2020-01-23 06:08:13
Question: I want to do an HBase scan with filters. For example, my table has column families A, B and C, and A has a column X. Some rows have column X and some do not. How can I implement the filter to filter out all the rows with column X? Answer 1: I guess you are looking for SingleColumnValueFilter in HBase. As mentioned in the API, to prevent the entire row from being emitted if the column is not found on a row, use setFilterIfMissing(boolean) on the Filter object. Otherwise, if the column is found, the entire
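A minimal sketch of that suggestion, assuming the HBase 2.x client API: a SingleColumnValueFilter on A:X combined with setFilterIfMissing(true), so rows without column X are not emitted. The table name my_table is a placeholder, and comparing with NOT_EQUAL against an empty value (treating an empty value as "null") is my assumption, not code from the original answer.

    import org.apache.hadoop.hbase.CompareOperator;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ScanRowsHavingColumnX {
        public static void main(String[] args) throws Exception {
            try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
                 Table table = conn.getTable(TableName.valueOf("my_table"))) {

                // Keep rows whose A:X value is present and non-empty ...
                SingleColumnValueFilter filter = new SingleColumnValueFilter(
                        Bytes.toBytes("A"), Bytes.toBytes("X"),
                        CompareOperator.NOT_EQUAL, Bytes.toBytes(""));
                // ... and drop the whole row when A:X is missing.
                filter.setFilterIfMissing(true);

                Scan scan = new Scan();
                scan.setFilter(filter);
                try (ResultScanner scanner = table.getScanner(scan)) {
                    for (Result r : scanner) {
                        System.out.println(Bytes.toString(r.getRow()));
                    }
                }
            }
        }
    }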

Trying out Phoenix as a way to connect to HBase

六月ゝ 毕业季﹏ submitted on 2020-01-22 17:29:48
1. What is Phoenix? Phoenix is a SQL layer built on top of HBase that lets us use standard JDBC APIs, instead of the HBase client APIs, to create tables, insert data and query data stored in HBase. Phoenix is written entirely in Java and ships as an embedded JDBC driver for HBase. The Phoenix query engine converts a SQL query into one or more HBase scans and orchestrates their execution to produce a standard JDBC result set. By using the HBase API, coprocessors and custom filters directly, it achieves millisecond-level performance for simple queries and second-level performance for row counts in the millions. Phoenix lets us write less code than hand-rolled HBase client code, with better performance, by compiling SQL into native HBase scans, determining the optimal start and stop keys for each scan, and running scans in parallel. Basic prerequisites: HBase 1.2.6, whose matching Phoenix version is 4.14.1. Download apache-phoenix-4.14.1-HBase-1.2-bin and extract it to get …; next, copy … into HBase's lib directory, then copy HBase's hbase-site.xml into Phoenix's bin directory. Start ZooKeeper, start Hadoop, start HBase, then go into Phoenix's bin directory and run ./sqlline.py master:2181 (the ZooKeeper master node). After that some statements are printed; (1
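Once sqlline.py connects, the same ZooKeeper quorum can be used from Java through Phoenix's JDBC driver. A minimal sketch, assuming the Phoenix client jar is on the classpath; the URL reuses master:2181 from the post, while the demo table and its columns are made-up examples.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class PhoenixJdbcDemo {
        public static void main(String[] args) throws Exception {
            // Phoenix JDBC URL: jdbc:phoenix:<zookeeper quorum>[:<port>]
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:master:2181")) {
                try (Statement st = conn.createStatement()) {
                    st.execute("CREATE TABLE IF NOT EXISTS demo "
                        + "(id BIGINT NOT NULL PRIMARY KEY, name VARCHAR)");
                }
                // Phoenix uses UPSERT instead of INSERT and does not auto-commit by default.
                try (PreparedStatement ps = conn.prepareStatement("UPSERT INTO demo VALUES (?, ?)")) {
                    ps.setLong(1, 1L);
                    ps.setString(2, "hello");
                    ps.executeUpdate();
                }
                conn.commit();
                try (Statement st = conn.createStatement();
                     ResultSet rs = st.executeQuery("SELECT id, name FROM demo")) {
                    while (rs.next()) {
                        System.out.println(rs.getLong(1) + " " + rs.getString(2));
                    }
                }
            }
        }
    }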

Learning and applying Flume and Sqoop

依然范特西╮ submitted on 2020-01-22 10:17:51
Contents: 1. What is Flume? 2. How to set up Flume 3. Flume in practice 4. What is Sqoop? 5. Using Sqoop to process HBase data and import it into MySQL. Reference documentation: http://flume.apache.org/releases/content/1.9.0/FlumeUserGuide.html 1. What is Flume? In short, Flume is a log-collection tool: it can collect logs through API calls, RPC and certain web operations. It is a distributed, open-source project written in Java and maintained by Apache. 2. How to set up Flume. Prerequisites. 2.1 Download and extract to a chosen directory. In the spirit of teaching people to fish rather than handing out fish, I will just describe how to download it instead of posting a direct link: go to the official site http://flume.apache.org; the Download button is usually at the top or on the left of the page (here it is on the left), then click through and download the version you want. The download can be slow; if that bothers you, use a mirror instead: search for software mirror sites and you will find plenty to download from, and if what you want belongs to Apache there is usually a dedicated Apache directory holding all Apache products. You can download locally and then upload via FTP, or download directly on the server. After downloading, I extracted it under /opt on the server and renamed the directory to flume (you can also leave it

The HBase database

 ̄綄美尐妖づ submitted on 2020-01-21 17:36:02
1. What is HBase? Apache HBase is the Hadoop database: a distributed, scalable, versioned, non-relational database modeled after Google's Bigtable. It provides random, real-time read/write access to big data. It scales to billions of rows and millions of columns and can run on commodity hardware. Data can be stored on HDFS. It is column-oriented. 2. The current state of data storage. 1. RDBMS (Relational Database Management System): MySQL, Oracle, SQL Server, ...; typed, structured data; an entity class maps to a table in the database; each record maps to a row in a table; queries: groupBy, Join, ... Drawbacks: with big data the volume is huge and many operations are needed to extract meaningful results: (1) real-time querying (2) high cluster cost (3) the gains from scaling an RDBMS horizontally are limited (4)

Distributed deployment of HBase on CentOS

孤者浪人 submitted on 2020-01-21 12:28:51
Following the previous post, "CentOS分布式部署Hadoop" (distributed deployment of Hadoop on CentOS), which covered deploying Hadoop 2.8.5, this post builds on that setup and describes the distributed deployment of HBase 2.2.3 on CentOS 7. 1. Preparation. Hadoop 2.8.5 is already deployed on the following nodes: 192.168.23.211 hadoop.master NameNode, DataNode, ResourceManager, NodeManager; 192.168.23.212 hadoop.slaver1 SecondaryNameNode, DataNode, NodeManager; 192.168.23.213 hadoop.slaver2 DataNode, NodeManager. The planned HBase deployment is: 192.168.23.211 hadoop.master Zookeeper, HMaster (active), HRegionServer; 192.168.23.212 hadoop.slaver1 Zookeeper, HRegionServer; 192.168.23.213 hadoop.slaver2 Zookeeper, HMaster (standby), HRegionServer. 2. Distributed deployment of Zookeeper. HBase can use its built-in Zookeeper or an independently deployed one; here we use an independently deployed Zookeeper. Download the stable release apache-zookeeper