Google Cloud Pub/Sub with different message types

Question

Within the same application I send different message types that have a completely different format and that are totally unrelated. What is the best practice to tackle this problem?

I see two different approaches here:

1. Filter at application level, which means I receive all messages on the same puller (same subscription)
2. Create a new subscription, which means the application will have two pullers running (one for each message type)

Answer 1:

You answered your own question with point 2. If the messages have completely different formats and are totally unrelated, they should be separated. There is no advantage to filtering them at the application layer; the topic/subscription model is made exactly for this purpose.

The difference between a topic and a subscription might be confusing, so let me describe that as well.

First, the Pub/Sub concepts:

Topic: A named resource to which messages are sent by publishers. In a pub/sub model, any message published to a topic is immediately received by all of the subscribers to the topic.
Subscription: A named resource representing the stream of messages from a single, specific topic, to be delivered to the subscribing application.
Message: The combination of data and (optional) attributes that a publisher sends to a topic and is eventually delivered to subscribers.
Message attribute: A key-value pair that a publisher can define for a message.

[Diagram: the Pub/Sub model]

The publish/subscribe model allows messages to be broadcast to different parts of a system asynchronously. A sibling to a message queue, a message topic provides a mechanism to broadcast asynchronous event notifications, along with endpoints that allow software components to connect to the topic in order to send and receive those messages. To broadcast a message, a component called a publisher simply pushes a message to the topic. The difference between a topic and a subscription is that a topic can have multiple subscriptions, but a given subscription belongs to a single topic.
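The one-topic-to-many-subscriptions relationship can be illustrated with a toy in-memory model. This is only a sketch of the concept, not the Pub/Sub client API; the class `InMemoryBroker` and the names `events`, `billing-sub`, and `audit-sub` are made up for illustration.

```python
from collections import defaultdict, deque


class InMemoryBroker:
    """Toy model of topics and subscriptions (not the Pub/Sub client API)."""

    def __init__(self):
        # topic name -> names of the subscriptions attached to it
        self._subs_by_topic = defaultdict(list)
        # subscription name -> queue of messages waiting to be pulled
        self._queues = {}

    def create_subscription(self, topic, subscription):
        # A subscription is bound to exactly one topic.
        self._subs_by_topic[topic].append(subscription)
        self._queues[subscription] = deque()

    def publish(self, topic, data):
        # Every subscription on the topic receives its own copy.
        for subscription in self._subs_by_topic[topic]:
            self._queues[subscription].append(data)

    def pull(self, subscription):
        # Drain and return the pending messages of one subscription.
        queue = self._queues[subscription]
        messages = list(queue)
        queue.clear()
        return messages


broker = InMemoryBroker()
broker.create_subscription("events", "billing-sub")
broker.create_subscription("events", "audit-sub")
broker.publish("events", b"order created")

billing_messages = broker.pull("billing-sub")
audit_messages = broker.pull("audit-sub")
```

Both subscriptions end up with their own copy of the single published message, which is why adding a subscription multiplies the number of deliveries.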

To sum up:

Use a Topic when you would like to publish messages.
Use a Subscription when you would like to consume messages.

Answer 2:

It depends, as always! Here it depends on how the messages are consumed.

If they are consumed by the same application, use the same subscription.
If the messages are consumed by different applications (because the messages are unrelated and have different structures), use two subscriptions.

Use a message attribute to differentiate the message type. Thanks to this attribute, you can create a subscription that accepts only messages of a given type. This way, you can keep the same topic and customize the dispatch afterward. I wrote an article on this.
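A minimal sketch of such a filtered subscription with the Python client (`google-cloud-pubsub`) might look like the following. The attribute key `type`, the value `order`, and the project, topic, and subscription IDs are assumptions chosen for illustration; the publisher would need to set the same attribute, for example with `publisher.publish(topic_path, data, type="order")`.

```python
def build_filtered_subscription_request(project_id, topic_id, subscription_id, message_type):
    """Build the request for a subscription that only receives one message type."""
    return {
        "name": f"projects/{project_id}/subscriptions/{subscription_id}",
        "topic": f"projects/{project_id}/topics/{topic_id}",
        # Pub/Sub filter syntax: match on the value of the "type" attribute.
        "filter": f'attributes.type = "{message_type}"',
    }


def create_filtered_subscription(request):
    """Create the subscription; needs google-cloud-pubsub and credentials."""
    # Imported lazily so the helper above can be used without the library.
    from google.cloud import pubsub_v1

    subscriber = pubsub_v1.SubscriberClient()
    with subscriber:
        return subscriber.create_subscription(request=request)


# Hypothetical IDs, for illustration only.
order_request = build_filtered_subscription_request(
    "my-project", "events", "orders-sub", "order"
)
```

Note that a subscription's filter is fixed at creation time, so changing the set of accepted types means creating a new subscription.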

Answer 3:

There are three ways you can approach this problem:

1. Publish the messages of different types to different topics, then create a subscription for each topic, and consume the messages from each subscription.
2. Publish the messages of different types to the same topic, create a single subscription, and consume all of the messages from the single subscription.
3. Publish the messages of different types to the same topic, create two subscriptions, and filter messages by type on a subscriber for each subscription.

There are tradeoffs with these three options. If you have control over the publisher and can create entirely separate topics for the different message types, this can be a good approach as it keeps different types of messages on completely independent channels. Think of it like having a data structure with a more specific type specified. For example, in Java one would generally prefer a List<String> and List<Integer> over a List<Object> that contains both.
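On the publisher side, the separate-topics option could be sketched roughly as follows. The mapping `TOPIC_BY_TYPE`, the topic IDs, and the helper names are hypothetical; `publisher` is assumed to be a `google.cloud.pubsub_v1.PublisherClient`, whose `publish()` takes a topic path and a bytes payload.

```python
# Hypothetical mapping from message type to topic ID.
TOPIC_BY_TYPE = {
    "order": "orders",
    "invoice": "invoices",
}


def topic_path_for(project_id, message_type):
    """Return the full topic path for a message type; reject unknown types."""
    try:
        topic_id = TOPIC_BY_TYPE[message_type]
    except KeyError:
        raise ValueError(f"no topic configured for message type {message_type!r}")
    return f"projects/{project_id}/topics/{topic_id}"


def publish_typed(publisher, project_id, message_type, payload):
    """Publish a bytes payload to the topic dedicated to its type.

    `publisher` is assumed to be a pubsub_v1.PublisherClient; its publish()
    returns a future that resolves to the message ID.
    """
    return publisher.publish(topic_path_for(project_id, message_type), payload)
```

Each topic then carries a single message format, much like the `List<String>` versus `List<Object>` analogy above, at the price of one more map entry, topic, and subscription whenever a new type appears.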

However, this approach may not be feasible if the publisher is owned by someone else. It may also not be feasible if the subscriber has no way of knowing all of the topics that it could be necessary to consume from. Imagine you add another type of message and create a new topic. Processing it would require creating another subscriber. If the number of types of messages could grow very large, you could find yourself with many subscriber clients in a single task.

If choosing between the second and third option, the decision depends on your consumption patterns. Is it the same application that needs to process messages of both types or would it make sense to split this into separate applications? If it could make sense to have separate applications, then separate subscriptions is a good way to go. If the published messages have a way to distinguish their type in the attributes, then you could potentially use Pub/Sub filtering to ensure that subscribers for each subscription only receive the relevant messages.

If all messages are always going to be consumed by the same application, then a single subscription probably makes the most sense. The biggest reason for this is cost: if you have two subscriptions and two subscribers, that means all messages are going to be delivered and paid for twice. With a single subscription and distinguishing between the messages done at the application level, messages are only delivered once (modulo Cloud Pub/Sub's at-least-once delivery guarantee). This last option is particularly useful if the set of message types is unknown to the subscriber and could grow over time.
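For the single-subscription case, distinguishing messages at the application level can be as small as a lookup table keyed on a message attribute. The sketch below assumes the publisher sets a `type` attribute and sends JSON payloads; the handler names and payload fields are invented for illustration. It relies only on the `data`, `attributes`, `ack()`, and `nack()` members that the Python client's received messages provide.

```python
import json


def handle_order(payload):
    return f"order {payload['id']}"


def handle_invoice(payload):
    return f"invoice {payload['number']}"


# Hypothetical handlers, keyed by the value of the "type" attribute.
HANDLERS = {"order": handle_order, "invoice": handle_invoice}


def dispatch(message):
    """Route one message by its "type" attribute, then ack or nack it."""
    handler = HANDLERS.get(message.attributes.get("type"))
    if handler is None:
        # Unknown type: nack so Pub/Sub redelivers it. Acking and logging,
        # or configuring a dead-letter topic, are alternative policies.
        message.nack()
        return None
    result = handler(json.loads(message.data))
    message.ack()
    return result
```

With the real client, `dispatch` could be passed as the callback, for example `subscriber.subscribe(subscription_path, callback=dispatch)`. Supporting a new message type is then a matter of adding one entry to `HANDLERS`.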

So if you have control over the publisher and the set of messages can be known in advance, separate topics for each message type is the best option. If that is not the case and processing of the messages could be done by different applications, then different subscriptions using filters is the best option. If processing of all message types will always be done by the same application or the number of types could grow, a single subscription is the best option.

Source: https://stackoverflow.com/questions/63842401/google-cloud-pub-sub-with-different-message-types

Tags

google-cloud-platform

publish-subscribe

google-cloud-pubsub