I\'m experiencing a problem with GCP pubsub where a small percentage of data was lost when publishing thousands of messages in couple seconds.
I\'m logging both
Google Cloud Pub/Sub message IDs are unique. It should not be possible for "some messages [to] taken the message_id
of another message." The fact that message ID 108562396466545 was seemingly received means that Pub/Sub did deliver the message to the subscriber and was not lost.
I recommend you check how your session_id
s are generated to ensure that they are indeed unique and that there is exactly one per message. Searching for the sessionId in your JSON via a regular expression search seems a little strange. You would be better off parsing this JSON into an actual object and accessing fields that way.
In general, duplicate messages in Cloud Pub/Sub are always possible; the system guarantees at-least-once delivery. Those messages can be delivered with the same message ID if the duplication happens on the subscribe side (e.g., the ack is not processed in time) or with a different message ID (e.g., if the publish of the message is retried after an error like a deadline exceeded).