google-cloud-pubsub

Dataflow (Python 2.x SDK) ReadFromPubSub :: id_label & timestamp_attribute behaving unexpectedly

旧时模样 submitted on 2019-12-08 12:30:28
Question: My Apache Beam pipeline (using the Python SDK + DirectRunner for testing purposes…) is reading from a Pub/Sub topic. The message and attributes published are as follows: message: [{"col1": "test column 1", "col2": "test column 1"}] attributes: { 'event_time_v1': str(time.time()), 'record_id': 'row-1', } I'm using the function beam.io.gcp.pubsub.ReadFromPubSub. The code/docs mention the id_label and timestamp_attribute arguments (I believe these are very new additions; updated only 13 days ago). When I use
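For reference, a minimal sketch (not the asker's code) of how these two arguments are usually wired up in the Python SDK, using the attribute names published above; the topic path is a placeholder. One thing worth ruling out: Beam documents a timestamp attribute as holding either milliseconds since the Unix epoch or an RFC 3339 string, while str(time.time()) produces fractional seconds, which could by itself lead to surprising element timestamps.

```python
# Minimal sketch, not the asker's code. The topic path is a placeholder;
# 'record_id' and 'event_time_v1' are the attribute names from the question.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions, StandardOptions

options = PipelineOptions()
options.view_as(StandardOptions).streaming = True

with beam.Pipeline(options=options) as p:
    messages = (
        p
        | "Read" >> beam.io.ReadFromPubSub(
            topic="projects/my-project/topics/my-topic",   # placeholder
            with_attributes=True,                          # yields PubsubMessage objects
            id_label="record_id",                          # attribute used for de-duplication
            timestamp_attribute="event_time_v1",           # attribute used as the element timestamp
        )
        | "Data only" >> beam.Map(lambda msg: msg.data)
    )
```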

How can I get a reliable alert via Stackdriver when there are no clients pulling from a Pub/Sub subscription?

你离开我真会死。 submitted on 2019-12-08 05:26:48
Question: I currently have some alerts set up to report when subscription/pull_request_count is 0. However, in a similar question about that metric, I found that metrics and alerting break once there is no data, which I believe happens when there are no subscriptions. My intent is to figure out whether my servers have stopped pulling messages. There are two scenarios I have in mind where the details are important. Even if there are no messages being published, I want to know if I'm no longer pulling from a
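Outside of an alerting policy, one way to detect the "nobody is pulling" case is to poll the same metric with the Cloud Monitoring client and treat an empty result over a recent window as the alert condition. A minimal sketch, assuming the pre-2.0 google-cloud-monitoring client that was current around the time of this question; project_id, subscription_id and the 10-minute lookback are placeholders.

```python
# Minimal sketch, assuming google-cloud-monitoring < 2.0; project and
# subscription names are placeholders.
import time
from google.cloud import monitoring_v3

project_id = "my-project"            # placeholder
subscription_id = "my-subscription"  # placeholder

client = monitoring_v3.MetricServiceClient()
interval = monitoring_v3.types.TimeInterval()
interval.end_time.seconds = int(time.time())
interval.start_time.seconds = interval.end_time.seconds - 600  # last 10 minutes

series = client.list_time_series(
    "projects/{}".format(project_id),
    'metric.type = "pubsub.googleapis.com/subscription/pull_request_count" '
    'AND resource.labels.subscription_id = "{}"'.format(subscription_id),
    interval,
    monitoring_v3.enums.ListTimeSeriesRequest.TimeSeriesView.FULL,
)

if not list(series):
    print("No pull requests recorded in the last 10 minutes; no client is pulling.")
```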

Google Cloud dataflow : Read from a file with dynamic filename

我怕爱的太早我们不能终老 submitted on 2019-12-08 04:18:05
Question: I am trying to build a pipeline on Google Cloud Dataflow that would do the following: listen to events on a Pub/Sub subscription, extract the filename from the event text, read the file (from a Google Cloud Storage bucket), and store the records in BigQuery. Following is the code: Pipeline pipeline = //create pipeline pipeline.apply("read events", PubsubIO.readStrings().fromSubscription("sub")) .apply("Deserialise events", //Code that produces ParDo.SingleOutput<String, KV<String, byte[]>>) .apply(TextIO.read

Securing PubSub push endpoints in node app engine?

大憨熊 submitted on 2019-12-08 02:15:30
Question: I'm using Pub/Sub to push messages into an App Engine app written in Node on the flexible environment. Is there a way I can limit my endpoints to traffic from Pub/Sub only? In the standard environment, App Engine has handlers that can define admin-only requests and secure endpoints. However, this functionality is not available in the flexible environment. Is it possible to set up firewall rules for only Google requests (the firewall appears to be application-wide, not per endpoint?), is there a
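One runtime-agnostic option is the shared verification token used in Google's push samples: register the push endpoint URL with a secret query parameter and reject any request that does not carry it. A minimal sketch of the check, shown here in Python/Flask purely to illustrate the shape (the same few lines translate directly to an Express handler); the route and token are placeholders.

```python
# Minimal sketch of a push endpoint guarded by a shared verification token.
# The route and PUBSUB_VERIFICATION_TOKEN are placeholders; register the
# subscription's push endpoint as https://<app>/pubsub/push?token=<token>.
import os
from flask import Flask, request, abort

app = Flask(__name__)
VERIFICATION_TOKEN = os.environ.get("PUBSUB_VERIFICATION_TOKEN", "change-me")

@app.route("/pubsub/push", methods=["POST"])
def pubsub_push():
    if request.args.get("token") != VERIFICATION_TOKEN:
        abort(403)  # not our configured push subscription
    envelope = request.get_json(silent=True) or {}
    message = envelope.get("message", {})
    # ... decode message.get("data") and process it ...
    return "", 204
```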

PubSub REST subscription pull not returning with all messages

自闭症网瘾萝莉.ら submitted on 2019-12-07 19:24:54
Question: We use the REST service API to pull messages from a Pub/Sub subscription. Messages ready to be serviced are acknowledged, leaving other messages unacknowledged to be serviced during a later execution cycle. During an execution cycle, we send a single request to the pull service REST API with returnImmediately=true and maxMessages=100. While testing we encountered a situation where only 3 "old" messages would be returned during each execution cycle. Newly published messages were never included
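For context, the pull described above looks roughly like the sketch below when issued against the REST endpoint directly; the subscription name is a placeholder. The API documentation notes that the server may return fewer messages than maxMessages (and, with returnImmediately=true, possibly none) even when a backlog exists, which is consistent with the behaviour observed.

```python
# Minimal sketch of the REST pull call; the subscription name is a placeholder.
import google.auth
from google.auth.transport.requests import AuthorizedSession

credentials, project_id = google.auth.default(
    scopes=["https://www.googleapis.com/auth/pubsub"]
)
session = AuthorizedSession(credentials)

url = "https://pubsub.googleapis.com/v1/projects/{}/subscriptions/{}:pull".format(
    project_id, "my-subscription"  # placeholder subscription
)
resp = session.post(url, json={"returnImmediately": True, "maxMessages": 100})
resp.raise_for_status()
for received in resp.json().get("receivedMessages", []):
    print(received["ackId"], received["message"].get("data"))
```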

Does Google cloud Pub/Sub Support Letsencrypt certificates?

坚强是说给别人听的谎言 submitted on 2019-12-07 19:03:57
Question: This question is a follow-up to Unable to configure Google Cloud Pub/Sub push subscriber. As per the answer there, self-signed certificates are not supported when using a push subscriber. Are certificates generated via the letsencrypt client supported? Is it recommended to use letsencrypt? Answer 1: Yes. Sorry it took a long time, but I have verified that letsencrypt certs are working fine for this purpose. Source: https://stackoverflow.com/questions/33790132/does-google-cloud-pub-sub-support-letsencrypt

Apache Beam PubSubIO with GroupByKey

╄→гoц情女王★ submitted on 2019-12-07 14:23:05
Question: I'm trying, with Apache Beam 2.1.0, to consume simple (key, value) data from Google Pub/Sub and group it by key so I can treat the data in batches. With the default trigger, my code after "GroupByKey" never fires (I waited 30 minutes). If I define a custom trigger, the code is executed, but I would like to understand why the default trigger never fires. I tried to define my own timestamp with "withTimestampLabel", but I hit the same issue. I tried changing the duration of the windows, but the same issue occurs (1 second, 10 seconds, 30 seconds
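The usual explanation for this behaviour is that on an unbounded source such as Pub/Sub the defaults are the global window plus a watermark trigger, and the global window never closes, so a downstream GroupByKey never emits until a non-global window (or an explicit trigger) is applied upstream. Below is a minimal sketch of that windowing step, written against the Python SDK only for brevity (the Java pipeline in the question would apply Window.into(FixedWindows.of(...)) at the same point); the topic, parsing and 10-second window are placeholders.

```python
# Minimal sketch: apply a non-global window before GroupByKey on a streaming
# source. Topic, parsing and window size are placeholders.
import apache_beam as beam
from apache_beam import window
from apache_beam.options.pipeline_options import PipelineOptions, StandardOptions

options = PipelineOptions()
options.view_as(StandardOptions).streaming = True

with beam.Pipeline(options=options) as p:
    (
        p
        | "Read" >> beam.io.ReadFromPubSub(topic="projects/my-project/topics/my-topic")
        | "To KV" >> beam.Map(lambda data: (data.decode("utf-8"), 1))   # toy (key, value)
        | "Window" >> beam.WindowInto(window.FixedWindows(10))          # close a window every 10s
        | "Group" >> beam.GroupByKey()                                  # now fires once per window
        | "Handle batch" >> beam.Map(lambda kv: kv)
    )
```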

Google Cloud dataflow : Read from a file with dynamic filename

我怕爱的太早我们不能终老 submitted on 2019-12-07 12:02:31
I am trying to build a pipeline on Google Cloud Dataflow that would do the following: listen to events on a Pub/Sub subscription, extract the filename from the event text, read the file (from a Google Cloud Storage bucket), and store the records in BigQuery. Following is the code: Pipeline pipeline = //create pipeline pipeline.apply("read events", PubsubIO.readStrings().fromSubscription("sub")) .apply("Deserialise events", //Code that produces ParDo.SingleOutput<String, KV<String, byte[]>>) .apply(TextIO.read().from(""))??? I am struggling with the third step; I am not quite sure how to access the output of the second step.
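TextIO.read().from(...) only accepts a file pattern that is known when the pipeline is constructed, so it cannot consume filenames produced by the second step. Later Beam releases provide a "read a PCollection of file patterns" path for exactly this (TextIO.readAll(), or FileIO.matchAll() + FileIO.readMatches() + TextIO.readFiles() in the Java SDK). A minimal sketch of that shape, written against the Python SDK only for brevity; the subscription, filename extraction, schema and table are placeholders.

```python
# Minimal sketch: turn Pub/Sub events into GCS paths, then read those files.
# Subscription, event format, schema and table are placeholders.
import json
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions, StandardOptions

options = PipelineOptions()
options.view_as(StandardOptions).streaming = True

def extract_gcs_path(event_bytes):
    # Placeholder: pull the bucket/object name out of the event payload.
    event = json.loads(event_bytes.decode("utf-8"))
    return "gs://{}/{}".format(event["bucket"], event["name"])

with beam.Pipeline(options=options) as p:
    (
        p
        | "Read events" >> beam.io.ReadFromPubSub(
            subscription="projects/my-project/subscriptions/sub")  # placeholder
        | "To file path" >> beam.Map(extract_gcs_path)
        | "Read files" >> beam.io.ReadAllFromText()                # one element per line
        | "To row" >> beam.Map(lambda line: {"line": line})        # placeholder schema
        | "Write" >> beam.io.WriteToBigQuery(
            "my-project:my_dataset.my_table", schema="line:STRING")  # placeholder table
    )
```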

Dataflow pipeline and pubsub emulator

无人久伴 submitted on 2019-12-07 08:02:37
Question: I'm trying to set up my development environment. Instead of using Google Cloud Pub/Sub in production, I've been using the Pub/Sub emulator for development and testing. To achieve this I set the following environment variable: export PUBSUB_EMULATOR_HOST=localhost:8586 This worked for the Python google-cloud-pubsub library, but when I switched to using Java Apache Beam for Google Dataflow, the pipeline still points to the production Pub/Sub service. Is there a setting, environment variable or method on the

Google Pubsub: UNAVAILABLE: The service was unable to fulfill your request

回眸只為那壹抹淺笑 submitted on 2019-12-07 07:20:42
Question: I am using the Java library to subscribe to a subscription from my code. Using sbt: "com.google.cloud" % "google-cloud-pubsub" % "0.24.0-beta" I followed this guide to write a subscriber: https://cloud.google.com/pubsub/docs/pull val projectId = "test-topic" val subscriptionId = "test-sub" def main(args: Array[String]): Unit = { val subscriptionName = SubscriptionName.create(projectId, subscriptionId) val subscriber = Subscriber.defaultBuilder(subscriptionName, new PastEventMessageReceiver())