google-cloud-pubsub

Monitoring the status of a job submitted via Google Pub/Sub

Question: I am new to the Google Compute Engine/Google App Engine platform. I am currently migrating a Python Flask application that uses Celery for async tasks to this platform, but the docs say I should use Google Pub/Sub instead of Celery. Whenever I run an async task, my application provides a page to monitor the status of the job, following the same principle as http://blog.miguelgrinberg.com/post/using-celery-with-flask. I have checked the documentation for Google Pub/Sub, but …
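
One way to approximate Celery-style status tracking on top of Pub/Sub is to have the worker publish progress messages tagged with a job ID, and have whatever consumes them persist the latest state for the status page to poll. A minimal sketch of the publishing side, assuming hypothetical project, topic, and attribute names that are not from the question:

```python
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
# Hypothetical names; substitute your own project and topic.
topic_path = publisher.topic_path("my-project", "job-status")

def report_status(job_id: str, state: str) -> None:
    # A subscriber can persist the latest state per job_id so the
    # Flask status endpoint has something to poll, as in the Celery setup.
    future = publisher.publish(
        topic_path,
        data=state.encode("utf-8"),
        job_id=job_id,  # message attribute used to correlate updates
    )
    future.result()  # block until Pub/Sub accepts the message

report_status("job-42", "PROGRESS:50")
```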

Why isn't the message redelivered?

Question: The acknowledgment deadline is 10 seconds. When I use the asynchronous-pull way to process a message, I call neither message.ack() nor message.nack(), wait past the message's ack deadline, and expect Pub/Sub to redeliver the message. After waiting more than 10 seconds, the subscriber does not receive the message again. Here is my code: subscriber: import { pubsubClient, IMessage, parseMessageData } from '../../googlePubsub'; import { logger } from '../../utils'; const topicName = 'asynchronous-pull-test'; …
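
A likely explanation is that the high-level client libraries perform lease management: while the streaming-pull connection is open they keep extending the ack deadline of outstanding messages, so simply not acking does not trigger redelivery at the 10-second mark. Calling nack() requests immediate redelivery instead. A minimal Python sketch of this behavior, with placeholder project and subscription names:

```python
from concurrent.futures import TimeoutError

from google.cloud import pubsub_v1

subscriber = pubsub_v1.SubscriberClient()
# Placeholder names; substitute your own.
subscription_path = subscriber.subscription_path("my-project", "asynchronous-pull-test")

def callback(message: pubsub_v1.subscriber.message.Message) -> None:
    print(f"Received: {message.data!r}")
    # nack() asks Pub/Sub to redeliver now; without it, the client
    # keeps extending the lease and the message stays outstanding.
    message.nack()

streaming_pull = subscriber.subscribe(subscription_path, callback=callback)
try:
    streaming_pull.result(timeout=60)
except TimeoutError:
    streaming_pull.cancel()
```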

Google Python cloud-dataflow instances broke without a new deployment (failed pubsub import)

Question: I have defined a few different Cloud Dataflow jobs for Python in the Google App Engine flexible environment. I defined my requirements in a requirements.txt file, included my setup.py file, and everything was working just fine. My last deployment was on May 3rd, 2018. Looking through the logs, I see that one of my jobs began failing on May 22nd, 2018. The job fails with a stack trace resulting from a bad import, seen below. Traceback (most recent call last): File "/usr/local/lib/python2.7/dist …
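
When a job breaks with no new deployment, the usual suspect is an unpinned (or loosely pinned) dependency that workers resolve to a newer, incompatible release at startup. Pinning exact versions in requirements.txt makes worker startup reproducible; the version numbers below are illustrative assumptions, not taken from the question:

```text
# requirements.txt -- pin exact versions, including any transitive
# dependencies that have broken before, so workers install the same
# packages you tested against rather than "latest at startup".
apache-beam[gcp]==2.4.0
google-cloud-pubsub==0.35.4
```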

Delete data from BigQuery while streaming from Dataflow

Question: Is it possible to delete data from a BigQuery table while loading data into it from an Apache Beam pipeline? Our use case is that we need to delete data more than 3 days old from the table, based on a timestamp field (the time when Dataflow pulls the message from the Pub/Sub topic). Is it recommended to do something like this? If yes, is there any way to achieve it? Thank you. Answer 1: I think the best way of doing this is to set the table up as a partitioned (based on ingestion time) table https://cloud.google.com …
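
With an ingestion-time partitioned table you can let BigQuery expire old partitions automatically instead of issuing deletes that race with the streaming inserts. A sketch using the Python BigQuery client, with placeholder project, dataset, table, and schema names (3 days = 259,200,000 ms); the table must be created as partitioned up front:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Placeholder names and schema; adjust to your pipeline's output.
table = bigquery.Table(
    "my-project.mydataset.mytable",
    schema=[
        bigquery.SchemaField("payload", "STRING"),
        bigquery.SchemaField("event_ts", "TIMESTAMP"),
    ],
)
# Partition by ingestion time and drop partitions older than 3 days.
table.time_partitioning = bigquery.TimePartitioning(
    type_=bigquery.TimePartitioningType.DAY,
    expiration_ms=3 * 24 * 60 * 60 * 1000,
)
client.create_table(table)
```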

Google Pubsub Python Client library subscriber crashes randomly

Question: Could someone please help me with the Google Pub/Sub Python client library? I am following the tutorial at https://cloud.google.com/pubsub/docs/pull#pubsub-pull-messages-async-python closely and seem to get unprompted errors. I have a simple script called "sendmessage.py" that sends a text message with a random number appended so that I can tell messages apart. The subscriber code runs on a separate Compute Engine instance and looks like this: from google.cloud import pubsub_v1 def callback …
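
Streaming-pull subscribers run the pull on a background thread whose future can fail on transient network or service errors, which shows up as the subscriber "randomly" crashing. A common defensive pattern, sketched below with placeholder names, is to block on the future and restart the subscription when it raises:

```python
import time

from google.cloud import pubsub_v1

subscriber = pubsub_v1.SubscriberClient()
# Placeholder names; substitute your own.
subscription_path = subscriber.subscription_path("my-project", "my-sub")

def callback(message: pubsub_v1.subscriber.message.Message) -> None:
    print(f"Received: {message.data!r}")
    message.ack()

while True:
    streaming_pull = subscriber.subscribe(subscription_path, callback=callback)
    try:
        streaming_pull.result()  # blocks; raises if the stream dies
    except Exception as exc:
        streaming_pull.cancel()  # shut the broken stream down cleanly
        print(f"Stream crashed ({exc!r}); restarting in 5s")
        time.sleep(5)
```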

Google Cloud SDK ImportError: no module named google.cloud

Question: I am new to Linux and trying to run a Python script that needs the following: 'from google.cloud import pubsub'. I'm getting the following error: Traceback (most recent call last): File "file.py", line 2, in <module> from google.cloud import pubsub ImportError: No module named google.cloud. How can I give access to this module? I have installed Google's Cloud SDK. I assume it has something to do with providing the path to this SDK "module" in some file? Answer 1: If this only happened when you …
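
Note that the Cloud SDK (the gcloud command-line tool) does not provide the Python client libraries; those come from pip and must be installed into the same interpreter that runs the script. Assuming a stock pip setup, something like:

```sh
# Install the client library into the interpreter that will run the script.
python -m pip install google-cloud-pubsub

# Verify the import resolves before rerunning the script.
python -c "from google.cloud import pubsub"
```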

Programmatically terminating PubsubIO.readMessages from a subscription after a configured time?

Question: I am looking to schedule a Dataflow job that uses PubsubIO.readMessages from a Pub/Sub topic's subscription. How can I have the job terminate after a configured interval? My use case is not to keep the job running through the entire day, so I am looking to schedule the start and then stop it after a configured interval from within the job. Pipeline.create(options) .apply(PubsubIO.readMessages().fromSubscription("some-subscription")) Answer 1: From the docs: If you need to stop a running Cloud Dataflow job, you can do so by …
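
The snippet above is Beam's Java SDK; one way to get a bounded run is to do it from the launching program rather than inside the graph: run the pipeline, wait with a timeout, then cancel. A sketch of the same idea with the Beam Python SDK, using a placeholder subscription and assuming the Dataflow runner:

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.runners.runner import PipelineState

SUBSCRIPTION = "projects/my-project/subscriptions/some-subscription"  # placeholder

options = PipelineOptions(streaming=True)
pipeline = beam.Pipeline(options=options)
(pipeline
 | "Read" >> beam.io.ReadFromPubSub(subscription=SUBSCRIPTION)
 | "Process" >> beam.Map(print))

result = pipeline.run()
# Block for at most 30 minutes (duration is in milliseconds), then stop the job.
result.wait_until_finish(duration=30 * 60 * 1000)
if result.state != PipelineState.DONE:
    result.cancel()
```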

Public Google Cloud Pub/Sub Topics?

Question: I'd like to play around with Google Cloud Pub/Sub and processing messages in Dataflow. Are there any public data feeds in Pub/Sub that I can use to get started? In the Dataflow WordCount example, input is read from a file in Cloud Storage, gs://dataflow-samples/shakespeare/kinglear.txt. It seems that dataflow-samples is accessible to all projects, which is very convenient for getting started. Is there anything similar for Pub/Sub? Answer 1: Currently, Google maintains this public topic: projects …
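
The answer is cut off above; assuming it refers to Google's public NYC taxi-rides feed (a topic Google has long kept publicly subscribable for Dataflow demos), you can attach a subscription in your own project and sample it:

```sh
# Assumption: the public topic meant is the taxi-rides feed below.
gcloud pubsub subscriptions create taxi-sub \
    --topic=projects/pubsub-public-data/topics/taxirides-realtime

# Pull a few messages to confirm data is flowing.
gcloud pubsub subscriptions pull taxi-sub --auto-ack --limit=5
```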

Notifying Google PubSub when Dataflow job is complete

Question: Is there a way to publish a message onto Google Pub/Sub after a Google Dataflow job completes? We need to notify dependent systems that the processing of incoming data is complete. How could Dataflow publish after writing data to the sink? EDIT: We want to notify after the pipeline completes writing to GCS. Our pipeline looks like this: Pipeline.create(options) .apply(....) .apply(AvroIO.Write.named("Write to GCS") .withSchema(Extract.class) .to(options.getOutputPath()) .withSuffix(".avro …
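
For a batch pipeline, the simplest place to publish the notification is the launching program: block until the job finishes, then publish from there rather than from inside the graph. A sketch with the Beam Python SDK and the Pub/Sub client, using placeholder names and a trivial stand-in pipeline:

```python
import apache_beam as beam
from apache_beam.runners.runner import PipelineState
from google.cloud import pubsub_v1

pipeline = beam.Pipeline()
(pipeline
 | "Create" >> beam.Create(["record"])
 | "Write to GCS" >> beam.io.WriteToText("gs://my-bucket/out"))  # placeholder path

result = pipeline.run()
state = result.wait_until_finish()  # blocks until the batch job completes

if state == PipelineState.DONE:
    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path("my-project", "pipeline-done")  # placeholder
    # Dependent systems subscribe to this topic for the completion signal.
    publisher.publish(topic_path, b"GCS write complete").result()
```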

google pub-sub setMaxMessages

Question: I'm using Google Pub/Sub to fetch messages synchronously: com.google.pubsub.v1.PullRequest.Builder pullRequestBuilder = PullRequest.newBuilder().setSubscription(subscriptionName).setReturnImmediately(returnImmediately); if (maxMessages != 0) { pullRequestBuilder.setMaxMessages(maxMessages); } PullRequest pullRequest = pullRequestBuilder.build(); PullResponse pullResponse = (PullResponse) this.subscriber.pullCallable().call(pullRequest); return pullResponse.getReceivedMessagesList(); I saw in the …
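
One thing worth knowing about this API: maxMessages is an upper bound, not a guarantee; a pull may return fewer messages (or none) even when the backlog is larger, so synchronous consumers usually pull in a loop. A Python equivalent of the snippet above, with placeholder names:

```python
from google.cloud import pubsub_v1

subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path("my-project", "my-sub")  # placeholders

# max_messages caps the batch size; the server may return fewer
# than requested even when more messages are available.
response = subscriber.pull(
    request={"subscription": subscription_path, "max_messages": 10}
)

ack_ids = [msg.ack_id for msg in response.received_messages]
if ack_ids:
    subscriber.acknowledge(
        request={"subscription": subscription_path, "ack_ids": ack_ids}
    )
```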