google-cloud-bigtable

Spark-HBase - GCP template (1/3) - How to locally package the Hortonworks connector?

此生再无相见时 submitted on 2021-02-17 06:30:36
Question: I'm trying to test the Spark-HBase connector in the GCP context and tried to follow [1], which asks you to locally package the connector [2] with Maven (I tried Maven 3.6.3) for Spark 2.4, and this leads to the following issue.
Error on "branch-2.4":
    [ERROR] Failed to execute goal net.alchim31.maven:scala-maven-plugin:3.2.2:compile (scala-compile-first) on project shc-core: Execution scala-compile-first of goal net.alchim31.maven:scala-maven-plugin:3.2.2:compile failed.: NullPointerException -> [Help 1]
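A frequently reported cause of this NullPointerException is running scala-maven-plugin 3.2.2 under a JDK newer than 8. The sketch below is a workaround under that assumption; the JDK path is an example and is not from the original post:

    # Point Maven at a JDK 8 installation before packaging the connector.
    # scala-maven-plugin 3.2.2 is known to fail with an NPE on newer JDKs;
    # alternatively, bump the plugin to a 4.x release in shc's pom.xml.
    export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64   # example path
    cd shc          # checkout of hortonworks-spark/shc, branch-2.4
    mvn clean package -DskipTests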

Connect from a Java app to Google Cloud Bigtable running in Docker

送分小仙女 submitted on 2021-02-07 10:18:03
Question: I want to connect to Google Cloud Bigtable running in Docker:
    docker run --rm -it -p 8086:8086 -v ~/.config/:/root/.config \
        bigtruedata/gcloud-bigtable-emulator
It starts without any problems:
    [bigtable] Cloud Bigtable emulator running on 127.0.0.1:8086
~/.config holds my application default credentials, which I configured with: gcloud auth application-default login
I used the Java code from the official HelloWorld sample and changed the connection configuration like this: Configuration conf =
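The excerpt is cut off at the configuration. For context, a minimal sketch of pointing the HelloWorld code at an emulator, assuming the bigtable-hbase client, which honors the BIGTABLE_EMULATOR_HOST environment variable (e.g. export BIGTABLE_EMULATOR_HOST=localhost:8086); the class name and IDs below are illustrative:

    // Sketch: with BIGTABLE_EMULATOR_HOST set, the bigtable-hbase client
    // routes to the emulator and no real credentials are needed.
    // Project and instance IDs can be arbitrary placeholders.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.client.Connection;
    import com.google.cloud.bigtable.hbase.BigtableConfiguration;

    public class EmulatorConnect {
        public static void main(String[] args) throws Exception {
            Configuration conf = BigtableConfiguration.configure("test-project", "test-instance");
            try (Connection connection = BigtableConfiguration.connect(conf)) {
                System.out.println("Tables: " + connection.getAdmin().listTableNames().length);
            }
        }
    }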

Time-series data schema design for Google Bigtable (or any Google offering)

删除回忆录丶 submitted on 2021-01-29 14:34:00
Question: I am working on a project where I have to store user-activity events, per user, per day, for later analysis. I will be receiving a stream of timestamped events and will later run Dataflow jobs on this data to compute per-user stats. I am exploring Bigtable for storing this data, with the timestamp acting as the row key, so that I can later run a range query to fetch a single day's data and process it. But after going through a couple of resources I figured that with
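The excerpt is truncated, but the concern it is heading toward is well known: a row key that starts with a timestamp sends all writes to a single node (hotspotting). A common mitigation, sketched below under the assumption that queries are per user; the helper is illustrative, not from the post:

    // Sketch: row key of the form userId#reversedTimestamp.
    // The userId prefix spreads writes across nodes; reversing the
    // timestamp sorts a user's newest events first, and a prefix scan
    // on "userId#" plus a timestamp range still fetches one day's data.
    static String rowKey(String userId, long epochMillis) {
        long reversed = Long.MAX_VALUE - epochMillis;
        return userId + "#" + String.format("%019d", reversed); // zero-pad so keys sort correctly
    }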

Facing OutOfMemoryException while exporting Bigtable tables to Google Cloud Storage

半世苍凉 submitted on 2021-01-29 11:27:05
Question: I am exporting a table in Cloud Bigtable to Cloud Storage by following https://cloud.google.com/bigtable/docs/exporting-sequence-files#exporting_sequence_files_2 The table is ~300 GB, and the Dataflow pipeline fails with this error:
    An OutOfMemoryException occurred. Consider specifying higher memory instances in PipelineOptions.
    java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:3236)
        at java.io.ByteArrayOutputStream.grow
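The message itself points at the usual remedy: larger-memory Dataflow workers. A sketch of the rerun, assuming the bigtable-beam-import export job from the linked page; the jar version and machine type are examples, and --workerMachineType is a standard Dataflow pipeline option rather than something specific to this job:

    # Rerun the export with high-memory workers.
    java -jar bigtable-beam-import-1.14.0-shaded.jar export \
        --runner=dataflow \
        --project=$PROJECT_ID \
        --bigtableInstanceId=$INSTANCE_ID \
        --bigtableTableId=$TABLE_ID \
        --destinationPath=gs://$BUCKET/export/ \
        --tempLocation=gs://$BUCKET/temp/ \
        --maxNumWorkers=10 \
        --workerMachineType=n1-highmem-8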

Spark HBase/BigTable - Wide/sparse dataframe persistence

不羁的心 submitted on 2021-01-28 08:03:36
Question: I want to persist a very wide Spark DataFrame (>100,000 columns) that is sparsely populated (>99% of values are null) to Bigtable, while keeping only the non-null values (to avoid storage cost). Is there a way to tell Spark to ignore nulls when writing? Thanks!
Source: https://stackoverflow.com/questions/65647574/spark-hbase-bigtable-wide-sparse-dataframe-persistence
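HBase and Bigtable are sparse by design: a cell that is never written costs nothing, so the usual approach is to skip null columns while building the mutations rather than asking the writer to filter them. A sketch using the plain HBase client API (not shc; class and parameter names are illustrative), assuming column 0 holds the row key:

    // Sketch: one Put per Row, adding only non-null cells, so nulls
    // are never materialized in Bigtable.
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.spark.sql.Row;

    public class SparseWriter {
        static Put toSparsePut(Row row, byte[] family) {
            Put put = new Put(Bytes.toBytes(row.getString(0))); // column 0 = row key
            String[] fields = row.schema().fieldNames();
            for (int i = 1; i < fields.length; i++) {
                if (!row.isNullAt(i)) { // null columns are simply never written
                    put.addColumn(family, Bytes.toBytes(fields[i]),
                            Bytes.toBytes(row.get(i).toString()));
                }
            }
            return put;
        }
    }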

Spark-HBase - GCP template (2/3) - Version issue of json4s?

♀尐吖头ヾ submitted on 2021-01-20 07:27:37
Question: I'm trying to test the Spark-HBase connector in the GCP context and tried to follow [1], which asks you to locally package the connector [2] with Maven (I tried Maven 3.6.3) for Spark 2.4, and I get the following error when submitting the job on Dataproc (after having completed [3]). Any idea? Thanks for your support.
References
[1] https://github.com/GoogleCloudPlatform/cloud-bigtable-examples/tree/master/scala/bigtable-shc
[2] https://github.com/hortonworks-spark/shc/tree/branch-2.4
[3] Spark-HBase -
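For context on the title: shc's branch-2.4 depends on an older json4s than the Spark 2.4 runtime on Dataproc (Spark 2.4.x ships json4s 3.5.3), and such mismatches typically surface as a NoSuchMethodError at runtime. A sketch of the usual fix, pinning json4s in the packaging project's pom.xml; verify the exact version against your cluster's Spark:

    <!-- Pin json4s to the version bundled with Spark 2.4 -->
    <dependencyManagement>
      <dependencies>
        <dependency>
          <groupId>org.json4s</groupId>
          <artifactId>json4s-jackson_2.11</artifactId>
          <version>3.5.3</version>
        </dependency>
      </dependencies>
    </dependencyManagement>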

Spark-HBase - GCP template (3/3) - Missing libraries?

不羁的心 submitted on 2021-01-15 19:44:42
Question: I'm trying to test the Spark-HBase connector in the GCP context and tried to follow the instructions, which ask you to locally package the connector, and I get the following error when submitting the job on Dataproc (after having completed these steps).
Command
    (base) gcloud dataproc jobs submit spark --cluster $SPARK_CLUSTER --class com.example.bigtable.spark.shc.BigtableSource --jars target/scala-2.11/cloud-bigtable-dataproc-spark-shc-assembly-0.1.jar --region us-east1 -- $BIGTABLE_TABLE
Error
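The error text is truncated above; when it is a ClassNotFoundException for one of the connector's transitive dependencies, the usual fix is to pass the missing jars alongside the assembly. A sketch of the same submit command with an extra jar; the shc-core path and version are illustrative:

    # Add the missing dependency jars to --jars (comma-separated).
    gcloud dataproc jobs submit spark \
        --cluster $SPARK_CLUSTER \
        --class com.example.bigtable.spark.shc.BigtableSource \
        --jars target/scala-2.11/cloud-bigtable-dataproc-spark-shc-assembly-0.1.jar,libs/shc-core-1.1.3-2.4-s_2.11.jar \
        --region us-east1 \
        -- $BIGTABLE_TABLE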
