hbase

Apache Phoenix: create Phoenix table that maps to existing HBase table

孤人 submitted on 2021-01-29 17:12:43
Question: I have an existing HBase table, and for SQL support I am exploring whether I can create an Apache Phoenix table over it. If I create a Phoenix table on an existing HBase table, does it replicate (copy) the data already present in HBase, or does the Phoenix table simply link to the existing data? My Phoenix version is < 4.12.0, so this error still applies to my version and hence I can't create a view on top of the existing HBase table. Answer 1: We can create a Phoenix table on top of the existing HBase
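A hedged sketch of what such a mapping can look like: a Phoenix CREATE TABLE whose primary key maps to the HBase row key and whose columns mirror the existing column family and qualifiers. The DDL only registers metadata and wires up Phoenix; it does not copy the rows already stored in HBase. The JDBC URL, table name, column family "CF", and column names below are placeholders, not values from the question.

import java.sql.DriverManager

object PhoenixMapping {
  def main(args: Array[String]): Unit = {
    // Assumed ZooKeeper quorum for the Phoenix JDBC URL; adjust for the cluster.
    val conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181")
    val stmt = conn.createStatement()
    // The quoted names must match the existing HBase table name, row key layout,
    // and column family/qualifier names exactly, otherwise Phoenix sees no data.
    stmt.execute(
      """CREATE TABLE IF NOT EXISTS "MY_TABLE" (
        |  "ROWKEY" VARCHAR PRIMARY KEY,
        |  "CF"."HOST" VARCHAR,
        |  "CF"."USAGE" VARCHAR
        |)""".stripMargin)
    stmt.close()
    conn.close()
  }
}

On Phoenix versions that enable column encoding by default, the table may also need COLUMN_ENCODED_BYTES = 0 so the Phoenix column names line up with the raw HBase qualifiers; whether that applies depends on the exact 4.x version in use.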

Spark HBase/BigTable - Wide/sparse dataframe persistence

不羁的心 submitted on 2021-01-28 08:03:36
Question: I want to persist a very wide Spark DataFrame (>100,000 columns) that is sparsely populated (>99% of the values are null) to BigTable, while keeping only the non-null values (to avoid storage cost). Is there a way to tell Spark to ignore nulls when writing? Thanks! Source: https://stackoverflow.com/questions/65647574/spark-hbase-bigtable-wide-sparse-dataframe-persistence
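One hedged way to get this behaviour without relying on connector support is to flatten each row into (row key, qualifier, value) cells and skip the null cells before writing with the plain HBase client, as in the sketch below. The table name "wide_table", column family "cf", and the assumptions that the first column is a string row key and the remaining columns are strings are placeholders, not details from the question.

import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.sql.DataFrame

def writeSparse(df: DataFrame): Unit = {
  val cols = df.columns.drop(1)                            // first column assumed to be the row key
  df.rdd.foreachPartition { rows =>
    val conn  = ConnectionFactory.createConnection(HBaseConfiguration.create())
    val table = conn.getTable(TableName.valueOf("wide_table"))
    rows.foreach { row =>
      val put = new Put(Bytes.toBytes(row.getString(0)))
      cols.zipWithIndex.foreach { case (c, i) =>
        if (!row.isNullAt(i + 1))                          // only non-null cells become part of the Put
          put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes(c), Bytes.toBytes(row.getString(i + 1)))
      }
      if (!put.isEmpty) table.put(put)                     // skip rows that are entirely null
    }
    table.close(); conn.close()
  }
}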

Set HBase properties for a Spark job using spark-submit

一曲冷凌霜 submitted on 2021-01-28 05:26:16
Question: During an HBase data migration I encountered a java.lang.IllegalArgumentException: KeyValue size too large. Long term: I need to increase the property hbase.client.keyvalue.maxsize (from 1048576 to 10485760) in /etc/hbase/conf/hbase-site.xml, but I can't change this file right now (I need validation). Short term: I succeeded in importing the data with the command: hbase org.apache.hadoop.hbase.mapreduce.Import \ -Dhbase.client.keyvalue.maxsize=10485760 \ myTable \ myBackupFile Now I need to
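The question is cut off before it says how the Spark job builds its HBase configuration, so the following is only a sketch under that assumption: when the job creates its own HBase Configuration, the limit can be raised per-job in code, without touching /etc/hbase/conf/hbase-site.xml.

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.ConnectionFactory
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("hbase-import").getOrCreate()

// Merge Spark's Hadoop configuration in, then override the client-side limit explicitly.
val hbaseConf = HBaseConfiguration.create(spark.sparkContext.hadoopConfiguration)
hbaseConf.set("hbase.client.keyvalue.maxsize", "10485760")   // 10 MiB instead of the default 1 MiB
val connection = ConnectionFactory.createConnection(hbaseConf)

Because the sketch merges spark.sparkContext.hadoopConfiguration, passing --conf spark.hadoop.hbase.client.keyvalue.maxsize=10485760 on spark-submit would also reach this code path; that only holds if the job actually builds its HBase configuration this way.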

HBase batch get and SocketTimeoutException

…衆ロ難τιáo~ submitted on 2021-01-27 13:24:19
Question: I use Java and I want to do a batch get like this: final List<Get> gets = uids.stream().map(uid -> new Get(toBytes(uid))).collect(Collectors.toList()); Configuration conf = HBaseConfiguration.create(); conf.set("hbase.zookeeper.quorum", quorum); conf.set("hbase.zookeeper.property.clientPort", properties.getString("HBASE_CONFIGURATION_ZOOKEEPER_CLIENTPORT")); conf.set("zookeeper.znode.parent", properties.getString("HBASE_CONFIGURATION_ZOOKEEPER_ZNODE_PARENT")); HTable table = new
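A hedged sketch (in Scala rather than the question's Java) of the same batch get through the non-deprecated Table API, with the gets split into smaller chunks and the client timeouts raised, which are the usual first levers against SocketTimeoutException on large batches. The quorum, table name, chunk size, and timeout values are illustrative placeholders.

import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Get, Result}
import org.apache.hadoop.hbase.util.Bytes
import scala.collection.JavaConverters._

val conf = HBaseConfiguration.create()
conf.set("hbase.zookeeper.quorum", "zk-host")            // assumed quorum
conf.set("hbase.rpc.timeout", "120000")                  // per-RPC timeout, in ms
conf.set("hbase.client.operation.timeout", "300000")     // whole-operation timeout, in ms

val connection = ConnectionFactory.createConnection(conf)
val table = connection.getTable(TableName.valueOf("myTable"))

val uids: Seq[String] = Seq("uid-1", "uid-2", "uid-3")   // placeholder ids
val results: Seq[Result] = uids
  .map(uid => new Get(Bytes.toBytes(uid)))
  .grouped(500)                                          // keep each RPC batch small
  .flatMap(chunk => table.get(chunk.asJava))
  .toSeq

table.close(); connection.close()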

Spark-HBase - GCP template (2/3) - Version issue of json4s?

♀尐吖头ヾ submitted on 2021-01-20 07:27:37
Question: I'm trying to test the Spark-HBase connector in the GCP context and tried to follow [1], which asks to locally package the connector [2] using Maven (I tried Maven 3.6.3) for Spark 2.4, and I get the following error when submitting the job on Dataproc (after having completed [3]). Any idea? Thanks for your support. References: [1] https://github.com/GoogleCloudPlatform/cloud-bigtable-examples/tree/master/scala/bigtable-shc [2] https://github.com/hortonworks-spark/shc/tree/branch-2.4 [3] Spark-HBase -

Spark-HBase - GCP template (3/3) - Missing libraries?

不羁的心 submitted on 2021-01-15 19:44:42
Question: I'm trying to test the Spark-HBase connector in the GCP context and tried to follow the instructions, which ask to locally package the connector, and I get the following error when submitting the job on Dataproc (after having completed these steps). Command: (base) gcloud dataproc jobs submit spark --cluster $SPARK_CLUSTER --class com.example.bigtable.spark.shc.BigtableSource --jars target/scala-2.11/cloud-bigtable-dataproc-spark-shc-assembly-0.1.jar --region us-east1 -- $BIGTABLE_TABLE Error
