
HBase: How to write custom skip filter after 0.96.0?

↘锁芯ラ submitted on 2020-01-13 17:02:18
Question: I am new to HBase. I want to write a custom fuzzy filter in HBase, but I have had great difficulty finding any resources that explain the proper way to do so in Java. The only examples I've found seem to use a version of HBase (0.94.0, I think) in which FilterBase supplies different functions, as does all the source code for existing filters that I can find. More specifically, I found this code for FuzzyRowFilter, which I would like to modify slightly. However, as seen here, functions like
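
A minimal sketch of what a post-0.96 custom filter can look like. This is not based on the FuzzyRowFilter code the asker links to; the class name and matching logic are hypothetical, and the exact Filter API (filterKeyValue vs. the later filterCell) should be checked against your HBase version. Since 0.96, filters are serialized with toByteArray()/parseFrom() (protobuf-based) instead of the old Writable methods, and the per-cell hook receives a Cell:

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.exceptions.DeserializationException;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.FilterBase;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical example: skip every row whose key does not contain a given fragment.
public class ContainsFragmentFilter extends FilterBase {

    private final byte[] fragment;

    public ContainsFragmentFilter(byte[] fragment) {
        this.fragment = fragment;
    }

    // Since 0.96 the per-cell hook takes a Cell instead of a KeyValue.
    @Override
    public ReturnCode filterKeyValue(Cell cell) {
        byte[] row = CellUtil.cloneRow(cell);
        // Keep the cell if the fragment occurs anywhere in the row key; otherwise jump to the next row.
        return contains(row, fragment) ? ReturnCode.INCLUDE : ReturnCode.NEXT_ROW;
    }

    // Since 0.96 filters are (de)serialized via toByteArray()/parseFrom(); a real filter
    // would use a protobuf message here rather than the raw fragment bytes.
    @Override
    public byte[] toByteArray() {
        return fragment;
    }

    public static Filter parseFrom(byte[] bytes) throws DeserializationException {
        return new ContainsFragmentFilter(bytes);
    }

    private static boolean contains(byte[] haystack, byte[] needle) {
        for (int i = 0; i + needle.length <= haystack.length; i++) {
            if (Bytes.equals(haystack, i, needle.length, needle, 0, needle.length)) {
                return true;
            }
        }
        return false;
    }
}

Remember that a custom filter executes on the region servers, so the jar containing it must be on every region server's classpath before clients can reference it in scans.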

What does 2n + 1 quorum mean?

旧街凉风 submitted on 2020-01-13 10:22:16
Question: I came across this term while reading about the ZooKeeper configuration for HBase, and I'm unfamiliar with it. Does the 'n' have anything to do with the number of nodes in my HBase cluster? Or the number of nodes I should use in my ZooKeeper cluster? Answer 1: 2f+1 refers to the level of reliability/availability you require; in general it is not related to performance. ZooKeeper ensembles (serving clusters) are made up of one or more servers which "vote" on each change. A majority of the original
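
A quick worked example (not from the quoted answer): a 2f+1 = 5-server ZooKeeper ensemble has f = 2, so writes keep succeeding as long as any 3 of the 5 servers (a majority) are up, while a 3-server ensemble (f = 1) tolerates only a single failure. The formula sizes the ZooKeeper ensemble, not the HBase cluster itself.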

When using HBase as a source for MapReduce, can I extend TableInputFormatBase to create multiple splits and multiple mappers for each region?

淺唱寂寞╮ submitted on 2020-01-13 08:25:12
Question: I'm thinking about using HBase as a source for one of my MapReduce jobs. I know that TableInputFormat specifies one input split (and thus one mapper) per region. However, this seems inefficient. I'd really like to have multiple mappers working on a given region at once. Can I achieve this by extending TableInputFormatBase? Can you please point me to an example? Furthermore, is this even a good idea? Thanks for the help. Answer 1: You need a custom input format that extends InputFormat. You can get
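
One possible sketch (my own, not the accepted answer): subclass TableInputFormat (which extends TableInputFormatBase and reads the table/scan from the job configuration), let the parent compute the usual one-split-per-region list, then cut each region's key range in half with Bytes.split so two mappers cover every region. The TableSplit constructor and accessors below match the newer mapreduce API; older releases use byte[] table names instead of TableName, so verify against your version:

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.hadoop.hbase.mapreduce.TableSplit;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;

public class TwoMappersPerRegionInputFormat extends TableInputFormat {

    @Override
    public List<InputSplit> getSplits(JobContext context) throws IOException {
        List<InputSplit> regionSplits = super.getSplits(context); // one split per region
        List<InputSplit> result = new ArrayList<>(regionSplits.size() * 2);

        for (InputSplit s : regionSplits) {
            TableSplit ts = (TableSplit) s;
            byte[] start = ts.getStartRow();
            byte[] end = ts.getEndRow();

            // The first/last region has an empty boundary, which Bytes.split cannot
            // handle; keep those regions as a single split.
            if (start.length == 0 || end.length == 0) {
                result.add(ts);
                continue;
            }

            byte[][] points = Bytes.split(start, end, 1); // {start, midpoint, end}, or null
            if (points == null) {
                result.add(ts);
                continue;
            }
            result.add(new TableSplit(ts.getTable(), start, points[1], ts.getRegionLocation()));
            result.add(new TableSplit(ts.getTable(), points[1], end, ts.getRegionLocation()));
        }
        return result;
    }
}

Whether this actually helps depends on where the bottleneck is: two mappers scanning halves of the same region still talk to the same region server.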

Analysis of HBase Configuration & Startup Scripts

醉酒当歌 submitted on 2020-01-13 03:01:52
This document is based on hbase-0.96.1.1-cdh5.0.2 and analyzes the HBase configuration and startup scripts. date: 2016/8/4, author: wangxl

Analysis of HBase Configuration & Startup Scripts

Leaving aside the Windows-related scripts, we mainly analyze the configuration files and the .sh files.

1 File overview

conf
├── hadoop-metrics2-hbase.properties
├── hbase-env.sh
├── hbase-policy.xml
├── hbase-site.xml
├── log4j.properties
└── regionservers

bin
├── graceful_stop.sh
├── hbase
├── hbase-cleanup.sh
├── hbase-common.sh
├── hbase-config.sh
├── hbase-daemon.sh
├── hbase-daemons.sh
├── local-master-backup.sh
├── local-regionservers.sh
├── master-backup.sh
├── regionservers.sh
├── rolling-restart.sh
├── start-hbase.sh
├── stop-hbase.sh
└── zookeepers.sh

2 Analysis

Following the steps used to build the cluster

Can we get all the column names from an HBase table?

好久不见. submitted on 2020-01-13 02:30:18
Question: Setup: I have an HBase table with 100M+ rows and 1 million+ columns. Every row has data in only 2 to 5 columns, and there is just one column family. Problem: I want to find all the distinct qualifiers (columns) in this column family. Is there a quick way to do that? I can think of scanning the whole table, getting the familyMap for each row, extracting the qualifiers, and adding them to a Set. But that would be awfully slow, as there are 100M+ rows. Can we do any better? Answer 1: You can use a
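
For reference, a sketch of the brute-force variant the asker describes, trimmed down with a server-side KeyOnlyFilter so only keys (not values) travel over the wire. The table name "my_table" is a placeholder, and this is not necessarily what the cut-off answer recommends; it is still a full scan of 100M+ rows, so a MapReduce job or coprocessor that aggregates qualifiers per region would scale better:

import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.KeyOnlyFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class DistinctQualifiers {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        Set<String> qualifiers = new HashSet<>();

        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("my_table"))) {
            Scan scan = new Scan();
            scan.setFilter(new KeyOnlyFilter()); // strip cell values server-side
            scan.setCaching(1000);               // fetch rows in larger batches

            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result row : scanner) {
                    for (Cell cell : row.rawCells()) {
                        qualifiers.add(Bytes.toString(CellUtil.cloneQualifier(cell)));
                    }
                }
            }
        }
        System.out.println(qualifiers);
    }
}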

Installing and Configuring HBase on Ubuntu 16.04

我的未来我决定 submitted on 2020-01-13 02:06:22
1. Environment

os: Ubuntu 16.04 LTS 64bit
jdk: 1.8.0_161
hadoop: 2.6.4
mysql: 5.7.21
hive: 2.1.0
hbase: 0.98.22-hadoop2

Before installing HBase, hadoop and hive must already be installed on the system.

2. Installation steps

1) Install hbase. Download hbase-0.98.22-hadoop2-bin.tar.gz and extract it into /usr/local with the following commands:

~/Downloads$ sudo tar -xzf hbase-0.98.22-hadoop2-bin.tar.gz -C /usr/local
~/Downloads$ cd /usr/local
/usr/local$ sudo mv hbase-0.98.22-hadoop2/ hbase
/usr/local$ sudo chown -R hadoop hbase/    # hadoop is my username; replace it with your own

Edit ~/.bashrc and add the following lines:

export HBASE_HOME=/usr/local/hbase
export HBASE_CONF_DIR=$HBASE_HOME/conf
export PATH=$PATH:$HBASE_HOME/bin

Then run source ~/.bashrc to make the environment variables take effect. Use hbase

Hive and HBase Compatibility Configuration

♀尐吖头ヾ submitted on 2020-01-13 02:04:59
Hive and HBase Compatibility Configuration

1. Create symbolic links

ln -s $HBASE_HOME/lib/hbase-common-1.4.8.jar $HIVE_HOME/lib/hbase-common-1.4.8.jar
ln -s $HBASE_HOME/lib/hbase-server-1.4.8.jar $HIVE_HOME/lib/hbase-server-1.4.8.jar
ln -s $HBASE_HOME/lib/hbase-client-1.4.8.jar $HIVE_HOME/lib/hbase-client-1.4.8.jar
ln -s $HBASE_HOME/lib/hbase-protocol-1.4.8.jar $HIVE_HOME/lib/hbase-protocol-1.4.8.jar
ln -s $HBASE_HOME/lib/hbase-it-1.4.8.jar $HIVE_HOME/lib/hbase-it-1.4.8.jar
ln -s $HBASE_HOME/lib/htrace-core-3.1.0-incubating.jar $HIVE_HOME/lib/htrace-core-3.1.0-incubating.jar
ln -s $HBASE_HOME/lib/hbase-hadoop2-compat-1.4.8.jar $HIVE_HOME/lib

Efficient way to delete multiple rows in HBase

爷,独闯天下 submitted on 2020-01-12 17:23:26
Question: Is there an efficient way to delete multiple rows in HBase, or does my use case sound like one that isn't suitable for HBase? There is a table, say 'chart', which contains items that appear in charts. Row keys are in the following format: chart|date_reversed|ranked_attribute_value_reversed|content_id. Sometimes I want to regenerate the chart for a given date, so I want to delete all rows from 'chart|date_reversed_1' to 'chart|date_reversed_2'. Is there a better way than to issue a Delete for each row
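
For illustration, a sketch of the straightforward client-side approach: scan the key range, collect the row keys, and send the Deletes in batches rather than one RPC per row. The table and key layout come from the question; the batch size and filter choice are my assumptions, and this is not necessarily what the truncated answer proposes. HBase's hbase-examples module also ships a BulkDeleteEndpoint example coprocessor that deletes a scanned range entirely server-side:

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;

public class RangeDelete {
    public static void deleteRange(String tableName, byte[] startRow, byte[] stopRow)
            throws IOException {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf(tableName))) {

            Scan scan = new Scan(startRow, stopRow);
            scan.setFilter(new FirstKeyOnlyFilter()); // only the row keys are needed
            scan.setCaching(1000);

            List<Delete> batch = new ArrayList<>();
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result r : scanner) {
                    batch.add(new Delete(r.getRow()));
                    if (batch.size() >= 1000) {       // flush deletes in chunks
                        table.delete(batch);
                        batch.clear();
                    }
                }
            }
            if (!batch.isEmpty()) {
                table.delete(batch);
            }
        }
    }
}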

HBase: get(…) vs scan and in-memory table

徘徊边缘 submitted on 2020-01-12 15:31:26
Question: I'm running MapReduce over HBase. The business logic in the reducer heavily accesses two tables, say T1 (40k rows) and T2 (90k rows). Currently I take the following steps:

1. In the constructor of the reducer class, do something like this:

HBaseCRUD hbaseCRUD = new HBaseCRUD();
HTableInterface t1 = hbaseCRUD.getTable("T1", "CF1", null, "C1", "C2");
HTableInterface t2 = hbaseCRUD.getTable("T2", "CF1", null, "C1", "C2");

In reduce(...):

String lowercase = ....;
/* Start : HBase code */ /*
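
One common alternative, sketched under assumptions (generic Text keys, a single cached column; the question's HBaseCRUD helper is replaced by the plain client API): because T1 (~40k rows) and T2 (~90k rows) are small, load them into in-memory maps once in setup() and do plain HashMap lookups in reduce(), instead of issuing an HBase get() per record:

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class LookupReducer extends Reducer<Text, Text, Text, NullWritable> {

    private final Map<String, String> t1Cache = new HashMap<>();

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // Load T1 once per reducer task; repeat the same idea for T2.
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table t1 = conn.getTable(TableName.valueOf("T1"));
             ResultScanner scanner = t1.getScanner(
                 new Scan().addColumn(Bytes.toBytes("CF1"), Bytes.toBytes("C1")))) {
            for (Result r : scanner) {
                t1Cache.put(Bytes.toString(r.getRow()),
                    Bytes.toString(r.getValue(Bytes.toBytes("CF1"), Bytes.toBytes("C1"))));
            }
        }
    }

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        String lowercase = key.toString().toLowerCase();
        String c1 = t1Cache.get(lowercase);   // in-memory lookup, no RPC per record
        if (c1 != null) {
            context.write(new Text(c1), NullWritable.get());
        }
    }
}

Maps of that size usually fit comfortably in a reducer's heap; if they did not, the fallback the question's title hints at is keeping the tables hot in the region servers' block cache (e.g. marking the column family IN_MEMORY) and accepting one get() per lookup.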

Table is neither enabled nor disabled in HBase

两盒软妹~` submitted on 2020-01-12 14:53:10
Question: I am facing a weird problem. I was accessing my HBase tables through an API. Midway through execution I got a RegionNotServing error for my table 'x', but my HRegionServers were working fine. When I tried to list the tables from the HBase shell I could not find my table 'x'. When I tried to disable my table 'x' it threw a TableNotEnabledException, and when I tried to enable my table 'x' it threw a TableNotDisabledException. Attached is the exception I got: hbase(main):002:0> disable 'x' ERROR: org