hbase

HBase: storing 2 or more values for the same row key in a particular column via the Scala/Java API

Submitted by 主宰稳场 on 2019-12-11 07:29:45
Question: I have a file with the following contents:

```
UserID  Email
1001    abc@yahoo.com
1001    def@gmail.com
1002    gft@gmail.com
1002    rtf@yahoo.com
```

I want to store the data like this:

```
ROW   COLUMN+CELL
1001  column=cf:Email, timestamp=1487917201278, value=abc@yahoo.com
1001  column=cf:Email, timestamp=1487917201279, value=def@gmail.com
1002  column=cf:Email, timestamp=1487917201286, value=gft@gmail.com
1002  column=cf:Email, timestamp=1487917201287, value=rtf@yahoo.com
```

I am using Put, for example: put 'table', '1001', …
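The snippet is cut off, but the layout the asker wants (several emails under one row key in the single cf:Email column) maps naturally onto HBase cell versions. A minimal Scala sketch under the assumption that the column family allows more than one version (e.g. alter 'table', {NAME => 'cf', VERSIONS => 5} in the shell); table, family, and qualifier names are taken from the desired output above:

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Get, Put}
import org.apache.hadoop.hbase.util.Bytes

val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
val table = connection.getTable(TableName.valueOf("table"))

// Two puts to the same row key and column become two versions of the same cell.
val put1 = new Put(Bytes.toBytes("1001"))
put1.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("Email"), Bytes.toBytes("abc@yahoo.com"))
val put2 = new Put(Bytes.toBytes("1001"))
put2.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("Email"), Bytes.toBytes("def@gmail.com"))
table.put(put1)
table.put(put2)

// Reading back: ask for more than one version, otherwise only the latest is returned.
val get = new Get(Bytes.toBytes("1001"))
get.setMaxVersions(5) // readVersions(5) on newer client versions
val result = table.get(get)

connection.close()
```

If relying on versions is undesirable, the alternative is to use distinct column qualifiers per email (e.g. cf:Email1, cf:Email2) instead.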

HBase Stargate returns scrambled values

Submitted by 元气小坏坏 on 2019-12-11 07:18:39
Question: I'm trying out HBase Stargate, the REST server bundled with my HBase installation. It's simple to get up and running, but I'm wondering how to view the actual row data. When I perform a GET request in my REST client, the returned values look scrambled:

```
GET localhost:8282/article/row1/

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<CellSet>
  <Row key="cm93MQ==">
    <Cell column="Y2Y6QXJ0aWNsZUlE" timestamp="1357592601561">MQ==</Cell>
    <Cell column="Y2Y6Q2FwRGF0ZQ==" timestamp=…
```
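Those values are not corrupted: the Stargate/REST gateway Base64-encodes row keys, column names, and cell values in its XML (and JSON) output. A minimal sketch of decoding them client-side, using the sample strings from the response above:

```scala
import java.nio.charset.StandardCharsets
import java.util.Base64

def decode(b64: String): String =
  new String(Base64.getDecoder.decode(b64), StandardCharsets.UTF_8)

// Values taken from the response above
println(decode("cm93MQ=="))         // row1
println(decode("Y2Y6QXJ0aWNsZUlE")) // cf:ArticleID
println(decode("MQ=="))             // 1
```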

Overriding TableMapper splits

Submitted by 我的未来我决定 on 2019-12-11 06:58:30
Question: I am using the following code to read from a table whose row keys have the format "epoch_meter", where epoch is the long representation of the date-time in seconds and meter is a meter number:

```java
Job jobCalcDFT = Job.getInstance(confCalcIndDeviation);
jobCalcDFT.setJarByClass(CalculateIndividualDeviation.class);
Scan scan = new Scan(Bytes.toBytes(String.valueOf(startSeconds) + "_"),
    Bytes.toBytes(String.valueOf(endSeconds + 1) + "_"));
scan.setCaching(500);
scan.setCacheBlocks(false);
…
```
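The snippet stops before it reaches the split handling, but one common way to take control of the splits is to subclass TableInputFormat and post-process the region-based splits it computes. The class name and logic below are illustrative assumptions, not the asker's code:

```scala
import java.util.{List => JList}
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.hadoop.mapreduce.{InputSplit, JobContext}

class EpochMeterInputFormat extends TableInputFormat {
  override def getSplits(context: JobContext): JList[InputSplit] = {
    // Start from the region-based splits HBase computes, then merge,
    // drop, or re-slice them before handing them back to MapReduce.
    val regionSplits = super.getSplits(context)
    regionSplits
  }
}
```

The custom format would then replace TableInputFormat when the job's input format class is set.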

Read HBase table with where clause using Spark

Submitted by ε祈祈猫儿з on 2019-12-11 06:49:54
Question: I am trying to read an HBase table using the Spark Scala API. Sample code:

```scala
conf.set("hbase.master", "localhost:60000")
conf.set("hbase.zookeeper.quorum", "localhost")
conf.set(TableInputFormat.INPUT_TABLE, tableName)
val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
  classOf[ImmutableBytesWritable], classOf[Result])
println("Number of Records found : " + hBaseRDD.count())
```

How do I add a where clause if I use newAPIHadoopRDD? Or do we need to use a Spark-HBase connector to achieve this?
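newAPIHadoopRDD itself has no where clause, but a predicate can be pushed down to the region servers by putting an HBase filter on a Scan and serializing that Scan into the same configuration. A minimal sketch building on the question's conf and sc; the column family, qualifier, and compared value are assumptions (CompareOp is the 1.x-style API, newer clients use CompareOperator):

```scala
import org.apache.hadoop.hbase.client.{Result, Scan}
import org.apache.hadoop.hbase.filter.{CompareFilter, SingleColumnValueFilter}
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.{TableInputFormat, TableMapReduceUtil}
import org.apache.hadoop.hbase.util.Bytes

// Roughly: WHERE cf:status = 'active', expressed as a server-side filter
val scan = new Scan()
scan.setFilter(new SingleColumnValueFilter(
  Bytes.toBytes("cf"), Bytes.toBytes("status"),
  CompareFilter.CompareOp.EQUAL, Bytes.toBytes("active")))

// Hand the Scan to TableInputFormat through the configuration
conf.set(TableInputFormat.SCAN, TableMapReduceUtil.convertScanToString(scan))

val filteredRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
  classOf[ImmutableBytesWritable], classOf[Result])
println("Number of matching records: " + filteredRDD.count())
```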

Writing to multiple HBase tables, how do I use context.write(hkey, put)?

Submitted by て烟熏妆下的殇ゞ on 2019-12-11 06:46:41
Question: I am new to Hadoop MapReduce. I would like to write to multiple tables from my reducer function: whenever something is written to Table1, I want the same content written to Table2 as well. I have gone through posts like "Write to multiple tables in HBASE" and looked at MultiTableOutputFormat. What I don't understand there is that, according to the post, in my reducer function I should just use context.write(new ImmutableBytesWritable(Bytes.toBytes( …
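With MultiTableOutputFormat the output key names the destination table, so writing the same Put under two different keys sends the same content to both tables. A rough sketch of that idea; the reducer's input types, table names, and Put contents are assumptions, not the asker's code:

```scala
import org.apache.hadoop.hbase.client.Put
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.io.Text
import org.apache.hadoop.mapreduce.Reducer

class TwoTableReducer extends Reducer[Text, Text, ImmutableBytesWritable, Put] {
  override def reduce(key: Text, values: java.lang.Iterable[Text],
      context: Reducer[Text, Text, ImmutableBytesWritable, Put]#Context): Unit = {
    val it = values.iterator()
    while (it.hasNext) {
      val put = new Put(Bytes.toBytes(key.toString))
      put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes(it.next().toString))
      // The output key is the table name, so the same Put lands in both tables.
      context.write(new ImmutableBytesWritable(Bytes.toBytes("Table1")), put)
      context.write(new ImmutableBytesWritable(Bytes.toBytes("Table2")), put)
    }
  }
}
```

The driver would also need the job's output format set to MultiTableOutputFormat rather than a single-table reducer setup.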

HBase: delete records based on a portion of the row id

Submitted by 霸气de小男生 on 2019-12-11 06:24:21
Question: I would like to know whether it is possible to delete records from a table based on the id of the row. For example, I created a table named 'hbase_test' with the family 'cmmnttest' and column 'cmmntpost', with ids created as follows:

```
'99.abcdefghijkil'
'99.oiuerwrerwwre'
```

I need to find all rows whose id starts with '99' and delete them. The id is a combination of a client id ('99') and the value of the record. I found the following but am not sure whether it applies here: "To delete a cell from 't1' at …
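The question is cut off at the shell example, but one common approach is a scan restricted by a PrefixFilter followed by a batched delete of the matching row keys. A minimal sketch under that assumption, reusing the table name from the question:

```scala
import scala.collection.JavaConverters._
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Delete, Scan}
import org.apache.hadoop.hbase.filter.PrefixFilter
import org.apache.hadoop.hbase.util.Bytes

val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
val table = connection.getTable(TableName.valueOf("hbase_test"))

// Only rows whose key starts with "99." come back from the scan.
val scan = new Scan()
scan.setFilter(new PrefixFilter(Bytes.toBytes("99.")))

val deletes = new java.util.ArrayList[Delete]()
val scanner = table.getScanner(scan)
scanner.asScala.foreach(result => deletes.add(new Delete(result.getRow)))
scanner.close()

table.delete(deletes) // batch delete of every matching row
connection.close()
```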

Cloudera CDH 5.7.2 / HBase: How to Set hfile.format.version?

Submitted by 假如想象 on 2019-12-11 06:21:09
Question: With CDH 5.7.2-1.cdh5.7.2.po.18, I am trying to use Cloudera Manager to configure HBase to use visibility labels and authorizations, as described in the Cloudera Community post "Cloudera Manager Hbase Visibility Labels". Using Cloudera Manager, I have successfully updated the values of the following properties:

- hbase.coprocessor.region.classes: set to org.apache.hadoop.hbase.security.visibility.VisibilityController
- hbase.coprocessor.master.classes: set to org.apache.hadoop.hbase.security…
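As for the title question: visibility labels require HFile format version 3, and Cloudera Manager has no dedicated field for hfile.format.version, so it is normally added through the "HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml". A minimal snippet under that assumption:

```xml
<property>
  <name>hfile.format.version</name>
  <value>3</value>
</property>
```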

Phoenix UDF not working

Submitted by 烂漫一生 on 2019-12-11 06:17:25
Question: I am trying to run a custom UDF in Apache Phoenix but am getting an error. Please help me figure out the issue. Following is my function class:

```java
package co.abc.phoenix.customudfs;

import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.phoenix.expression.Expression;
import org.apache.phoenix.expression.function.ScalarFunction;
import org.apache.phoenix.parse.FunctionParseNode.Argument;
import org.apache.phoenix.parse.FunctionParseNode.BuiltInFunction;
import org.apache.phoenix…
```
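The class is cut off after its imports, but for reference, a Phoenix scalar UDF generally needs the two constructors plus getName, getDataType, and evaluate. The skeleton below is an illustrative identity pass-through, not the asker's code, and the names are made up:

```scala
import java.util.{List => JList}
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.phoenix.expression.Expression
import org.apache.phoenix.expression.function.ScalarFunction
import org.apache.phoenix.schema.tuple.Tuple

class IdentityUdf(children: JList[Expression]) extends ScalarFunction(children) {
  def this() = this(new java.util.ArrayList[Expression]())

  override def getName: String = "IDENTITY_UDF"

  // Result type and value simply mirror the single argument expression.
  override def getDataType = getChildren.get(0).getDataType

  override def evaluate(tuple: Tuple, ptr: ImmutableBytesWritable): Boolean =
    getChildren.get(0).evaluate(tuple, ptr)
}
```

The jar containing the class typically also has to be placed in the directory configured as hbase.dynamic.jars.dir and registered with CREATE FUNCTION before Phoenix will resolve it.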

Output directory not set exception when saving an RDD to HBase with Spark

Submitted by 最后都变了- on 2019-12-11 06:07:33
Question: I have a job that retrieves data from HBase with Spark as an RDD, applies a filter, and then saves the result back to HBase as sample data, like this:

```scala
object FilterData {
  def main(args: Array[String]) {
    filterData()
  }

  def filterData() = {
    val sparkConf = new SparkConf().setAppName("filterData").setMaster("spark://spark:7077")
    val sc = new SparkContext(sparkConf)
    val conf = HBaseConfiguration.create()
    conf.set("hbase.zookeeper.quorum", "172.16.1.10,172.16.1.11,172.16.1.12")
    conf.setInt("timeout", 120000)
    conf.set…
```
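The snippet ends before the save call, but the "output directory not set" error usually means the Hadoop output format was never pointed at HBase. A minimal sketch of the usual fix, reusing the question's conf; the destination table, column names, and filteredRdd (assumed to be an RDD of String key/value pairs) are assumptions:

```scala
import org.apache.hadoop.hbase.client.Put
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableOutputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.mapreduce.Job

// Configure TableOutputFormat (including the destination table) through a Job,
// so saveAsNewAPIHadoopDataset no longer expects a filesystem output directory.
val job = Job.getInstance(conf)
job.setOutputFormatClass(classOf[TableOutputFormat[ImmutableBytesWritable]])
job.getConfiguration.set(TableOutputFormat.OUTPUT_TABLE, "sample_data")

val puts = filteredRdd.map { case (rowKey: String, value: String) =>
  val put = new Put(Bytes.toBytes(rowKey))
  put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("value"), Bytes.toBytes(value))
  (new ImmutableBytesWritable(Bytes.toBytes(rowKey)), put)
}

puts.saveAsNewAPIHadoopDataset(job.getConfiguration)
```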

Could not find an appropriate constructor for com.google.cloud.bigtable.hbase1_x.BigtableConnection

Submitted by 心不动则不痛 on 2019-12-11 04:47:23
Question: I tried to create a simple class that creates a table and adds some columns with HBase and Google App Engine. I have already created a project and an instance in Google Cloud Platform. I cloned this repository: https://github.com/GoogleCloudPlatform/cloud-bigtable-examples/blob/master/java/hello-world/src/main/java/com/example/cloud/bigtable/helloworld/HelloWorld.java. It works like a charm for creating a table in my instance. But when I try to create a new Maven project with the same config, …
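The constructor error usually comes down to dependency or connection-setup differences between the cloned sample and the new project; the HelloWorld sample itself obtains its connection via BigtableConfiguration rather than constructing BigtableConnection directly. A minimal sketch of that pattern, where the project id, instance id, and table name are placeholders:

```scala
import com.google.cloud.bigtable.hbase.BigtableConfiguration
import org.apache.hadoop.hbase.TableName

// Requires a matching bigtable-hbase-1.x artifact on the classpath.
val connection = BigtableConfiguration.connect("my-project-id", "my-instance-id")
val admin = connection.getAdmin
try {
  println("Table exists: " + admin.tableExists(TableName.valueOf("Hello-Bigtable")))
} finally {
  admin.close()
  connection.close()
}
```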