hbase

How to put values into an HBase table through happybase?

旧巷老猫 submitted on 2019-12-12 04:45:19
Question: My development environment is CentOS 7, HBase 1.2.5, happybase 1.1.0, Python 2.7, PyCharm, Hadoop 2.7.3, and Spark 2.1. I am developing big data software and need to put values into an HBase table. The values come from a Spark RDD. Here is the code so far:

import happybase
from pyspark import SparkContext, SparkConf

connection = happybase.Connection('localhost')
table = connection.table('tablename')
conf = SparkConf().setAppName("myFirstSparkApp").setMaster("local")
sc = SparkContext(conf=conf)

How do I get the number of HFiles of an HBase table in Java?

喜夏-厌秋 submitted on 2019-12-12 04:36:25
Question: I have an HBase table and I executed a major compaction on it. How do I get the number of HFiles for the table dynamically in Java?

Answer 1: You don't mention which version of HBase you are using, but you may be able to use Hannibal for that.

Answer 2: Found this code sample in Kylin - you can get the number of store files from a RegionLoad instance, i.e. int storeFiles = regionLoad.getStorefiles()

/** Constructor for unit testing */
HBaseRegionSizeCalculator(HTable table, HBaseAdmin
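Expanding on Answer 2 with a minimal sketch that is not from the original post (it assumes the HBase 1.x client API and a caller-supplied table name): the per-table HFile count can be obtained by summing RegionLoad.getStorefiles() over every region the cluster reports for that table.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.ClusterStatus;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.RegionLoad;
import org.apache.hadoop.hbase.ServerName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class StoreFileCounter {
    /** Sums the store (HFile) count of every region belonging to the given table. */
    public static int countStoreFiles(String tableName) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
            ClusterStatus status = admin.getClusterStatus();
            int total = 0;
            // Walk every region server and add up the store files of this table's regions.
            for (ServerName server : status.getServers()) {
                for (RegionLoad load : status.getLoad(server).getRegionsLoad().values()) {
                    // Region names are prefixed with "<tableName>," so filter on that.
                    if (load.getNameAsString().startsWith(tableName + ",")) {
                        total += load.getStorefiles();
                    }
                }
            }
            return total;
        }
    }
}
```

After a major compaction this total should drop to roughly one store file per region per column family.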

HBase - How to add a super column family?

前提是你 submitted on 2019-12-12 03:59:59
Question: I am trying to create a Java application that converts a MySQL database to a NoSQL HBase database. So far it reads the data from MySQL and inserts it into HBase correctly, but now I am trying to handle the relationships between the MySQL tables, and I understand that if there is a relationship you should add one of the tables as a super column family. I looked in the Apache documentation and couldn't find anything. Any ideas?

Answer 1: Column families have nothing to do with relationships. In contrast, you have to correctly

Accessing a kerberized remote HBase cluster from Spark

不问归期 submitted on 2019-12-12 03:47:48
Question: I'm attempting to read data from a kerberized HBase instance from Spark using the Hortonworks SPARK-ON-HBASE connector. My cluster configuration essentially looks like this: I am submitting my Spark jobs from a client machine to a remote Spark standalone cluster, and that job attempts to read data from a separate HBase cluster. If I bypass the standalone cluster by running Spark with master=local[*] directly on my client, I can access the remote HBase cluster no problem as long as I

HBase MapReduce: write into HBase in Reducer

夙愿已清 submitted on 2019-12-12 03:16:07
Question: I am learning HBase. I know how to write a Java program using Hadoop MapReduce and write the output into HDFS; but now I want to write the same output into HBase instead of HDFS. It should use code similar to what I had for the HDFS case: context.write(key, value); Could anyone show me an example of how to achieve this?

Answer 1: Here's one way to do this:

public static class MyMapper extends TableMapper<ImmutableBytesWritable, Put> {
    public void map(ImmutableBytesWritable row, Result value,
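The answer above is cut off; as a sketch of the general pattern it points at (the table, family, and qualifier names are placeholders rather than anything from the original answer), a job writes to HBase by having the reducer extend TableReducer and emit Put objects, with TableMapReduceUtil wiring up TableOutputFormat:

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

public class HBaseWriteExample {

    // Reducer that emits Put mutations; TableOutputFormat persists them to HBase.
    public static class MyReducer
            extends TableReducer<Text, IntWritable, ImmutableBytesWritable> {
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            Put put = new Put(Bytes.toBytes(key.toString()));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("count"), Bytes.toBytes(sum));
            // Instead of a plain value, write the Put keyed by its row key.
            context.write(new ImmutableBytesWritable(put.getRow()), put);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = Job.getInstance(conf, "write-to-hbase");
        job.setJarByClass(HBaseWriteExample.class);
        // ... configure the mapper and input format as usual ...
        // Sets TableOutputFormat and the target table for the reducer output.
        TableMapReduceUtil.initTableReducerJob("output_table", MyReducer.class, job);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

The context.write() call then takes a row key and a Put instead of plain key/value pairs, which is the HBase counterpart of the HDFS write the question describes.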

java.lang.ClassCastException: org.apache.hadoop.hbase.client.Result cannot be cast to org.apache.hadoop.hbase.client.Mutation

匆匆过客 submitted on 2019-12-12 02:58:28
Question: I am getting an error while transferring values from one HBase table to another:

INFO mapreduce.Job: Task Id : attempt_1410946588060_0019_r_000000_2, Status : FAILED
Error: java.lang.ClassCastException: org.apache.hadoop.hbase.client.Result cannot be cast to org.apache.hadoop.hbase.client.Mutation
    at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:87)
    at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:576)
    at org.apache
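The exception means TableOutputFormat was handed a Result where it expects a Mutation such as Put or Delete, so whatever task feeds the HBase output (here the reduce phase, judging by the stack trace) must write a Put rather than the scanned Result. A hedged sketch of that conversion, shown as a map-only copy for brevity with an illustrative class name; the same rule applies to reducer output:

```java
import java.io.IOException;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapper;

// Copies every cell of the source row into a Put, which TableOutputFormat can write.
public class CopyTableMapper extends TableMapper<ImmutableBytesWritable, Put> {
    @Override
    public void map(ImmutableBytesWritable row, Result value, Context context)
            throws IOException, InterruptedException {
        Put put = new Put(row.get());
        for (Cell cell : value.rawCells()) {
            put.add(cell); // re-emit the cell under the same row key
        }
        // Writing a Put (a Mutation) avoids the ClassCastException seen with Result.
        context.write(row, put);
    }
}
```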

Are HBase Batch Operations Atomic?

僤鯓⒐⒋嵵緔 submitted on 2019-12-12 02:49:35
Question:

// Create a list of Puts, and save them in HBase via HTable's batch method
HTable table = new HTable(HBaseConfiguration.create(), "table_name");
List<Row> actions = new ArrayList<Row>();
actions.add(put1); // rowkey = 'abc'
actions.add(put2); // rowkey = 'def'
actions.add(put3); // rowkey = 'ghi'
Object[] results = table.batch(actions);

Is it possible that this snippet could result in at least one, but not all, of the puts failing to save to HBase? In other words, is batch guaranteed to happen
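No answer is included in this excerpt. As a hedged note: an HBase batch call is not atomic across different rows, so some of these puts can succeed while others fail, and a partial failure typically surfaces as a RetriesExhaustedWithDetailsException from the client. A minimal sketch (assuming an HBase 1.x client and the same actions list as above) of detecting which actions went through:

```java
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException;
import org.apache.hadoop.hbase.client.Row;

public class BatchOutcomeCheck {
    public static void runBatch(HTable table, List<Row> actions)
            throws IOException, InterruptedException {
        Object[] results = new Object[actions.size()];
        try {
            // Submits all actions; operations touching different rows are not atomic.
            table.batch(actions, results);
        } catch (RetriesExhaustedWithDetailsException e) {
            // Some (but possibly not all) actions failed; e lists the failing rows.
            System.err.println("Failed actions: " + e.getNumExceptions());
        }
        for (int i = 0; i < results.length; i++) {
            // A null entry typically indicates that the corresponding action failed.
            System.out.println("action " + i + " -> "
                    + (results[i] == null ? "failed" : "succeeded"));
        }
    }
}
```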

How to Create a Column family in a Selected Cluster in HBase

China☆狼群 submitted on 2019-12-12 01:58:47
Question: In Cassandra, the Hector API allows creating a table on a selected cluster as follows. I want to do the same thing using HBase; can someone please help me out? This is how it can be done using Cassandra:

public void createColumnFamily(Cluster cluster, String tableName,
        String columnFamilyName, StreamDefinition streamDefinition) {
}

Answer 1: There is no such concept of a cluster namespace in HBase. You can simply create a table in HBase using the methods of the HBaseAdmin class. HBaseConfiguration conf =
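The answer's code is truncated above; a minimal sketch of the HBaseAdmin pattern it refers to (the table, family, and ZooKeeper host names are placeholders), where the "selected cluster" is chosen by the ZooKeeper quorum in the client configuration:

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class CreateTableExample {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        // Point the client at a specific cluster by setting its ZooKeeper quorum.
        conf.set("hbase.zookeeper.quorum", "zk-host-1,zk-host-2");

        HBaseAdmin admin = new HBaseAdmin(conf);
        try {
            HTableDescriptor table = new HTableDescriptor(TableName.valueOf("my_table"));
            table.addFamily(new HColumnDescriptor("cf1")); // the column family
            admin.createTable(table);
        } finally {
            admin.close();
        }
    }
}
```

Each call to addFamily adds one column family to the table descriptor, which is the closest HBase analogue to creating a Cassandra column family.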

TSocket read 0 bytes - happybase version 0.8

核能气质少年 submitted on 2019-12-12 01:15:30
Question: I'm trying to connect to HBase through the happybase framework, version 0.8. I've started the Thrift daemon with /usr/hdp/current/hbase-master/bin/hbase-daemon.sh start thrift -p 9090

from happybase.connection import Connection

DEFAULT_HOST = '10.128.121.13'
DEFAULT_PORT = 9090
DEFAULT_TRANSPORT = 'framed'
DEFAULT_COMPAT = '0.96'
cc = Connection(DEFAULT_HOST, DEFAULT_PORT, None, True, None, '_', DEFAULT_COMPAT, DEFAULT_TRANSPORT)
print(cc.tables())

Do I need to start the Thrift service on all nodes, the HBase master and

Pig - exception on simple load

筅森魡賤 submitted on 2019-12-12 00:22:56
Question: I just started learning Pig and am trying to do something with it, so I enter the Pig console and simply type a = load 'sample_data.csv'; (I have a file named sample_data.csv). I received the following exception:

Pig Stack Trace
---------------
ERROR 2998: Unhandled internal error. name

java.lang.NoSuchFieldError: name
    at org.apache.pig.parser.QueryParserStringStream.<init>(QueryParserStringStream.java:32)
    at org.apache.pig.parser.QueryParserDriver.tokenize(QueryParserDriver.java:207)
    at org