
Duplicate Message Issue in Pub/Sub with Multiple Replicas

We are encountering an issue in our system where multiple replicas of our service are running on different IPs, and each replica is attempting to subscribe to and process messages concurrently.

  • We have two replicas (Replica-1 and Replica-2) running in parallel.
  • Both replicas receive the same message at nearly the same time (a few milliseconds apart), so both acknowledge it and process the file concurrently.
  • While the message IDs are unique for each replica, both replicas end up processing the same message independently and persisting the data into the database.

The problem arises because Google's system considers only the latest acknowledgment ID, whereas we require that exactly one replica process each message. The current setup processes the same message several times independently, which results in data inconsistencies.

Are there any options to achieve this?

2 REPLIES

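1. Have you already tried Pub/Sub exactly-once delivery? It is enabled on the subscription (not on the topic) and is supported only for pull subscriptions. While a message is outstanding within its acknowledgment deadline, Pub/Sub does not redeliver it to another subscriber, and a successfully acknowledged message is not redelivered.
2. Use Pub/Sub's message ordering feature by assigning ordering keys to messages and enabling message ordering on the subscription. Messages with the same ordering key are then delivered in order, which helps keep a given key on one replica at a time. Ordering alone does not rule out redelivery, so combine it with option 1 or 3 if duplicates must never be processed.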
3. Store a lock with the message ID or a unique identifier. Before processing, check if the lock exists. If it does not, acquire the lock, process the message, and release the lock after completion.
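For option 1, a minimal sketch of creating a pull subscription with exactly-once delivery from Python might look like the following. It assumes the google-cloud-pubsub client library is installed and that the topic already exists; the helper names and the project, topic, and subscription IDs are hypothetical placeholders, not values from this thread.

```python
def subscription_request(project_id, topic_id, subscription_id):
    """Build a create-subscription request with exactly-once delivery on."""
    return {
        "name": f"projects/{project_id}/subscriptions/{subscription_id}",
        "topic": f"projects/{project_id}/topics/{topic_id}",
        # Exactly-once delivery is a property of the subscription.
        "enable_exactly_once_delivery": True,
    }


def create_exactly_once_subscription(project_id, topic_id, subscription_id):
    """Create the subscription; needs google-cloud-pubsub and credentials."""
    # Imported lazily so the request builder above works without the library.
    from google.cloud import pubsub_v1

    with pubsub_v1.SubscriberClient() as subscriber:
        return subscriber.create_subscription(
            request=subscription_request(project_id, topic_id, subscription_id)
        )


# Hypothetical IDs, for illustration only:
# create_exactly_once_subscription("my-project", "file-events", "file-events-sub")
```

With exactly-once delivery an acknowledgment can fail (for example, if the deadline has already expired), so the subscriber should check the ack result before treating the message as done.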

I have used the third option in my own application to resolve this issue.
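The third option can be sketched roughly as below. This is an illustration under assumptions, not the exact code I used: SQLite stands in for whatever shared database the replicas already write to, and the table and function names (`processed`, `open_store`, `try_claim`, `handle`) are made up for the example. The key idea is to let a unique constraint decide the winner atomically instead of doing a separate "check, then insert". Since the original post says the message IDs differ between replicas, the dedup key could be something stable from the payload, such as the file name.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the dedup store; the primary key enforces one claim per key."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed (dedup_key TEXT PRIMARY KEY)"
    )
    conn.commit()
    return conn


def try_claim(conn, dedup_key):
    """Return True only for the first caller that claims dedup_key."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO processed (dedup_key) VALUES (?)", (dedup_key,)
            )
        return True
    except sqlite3.IntegrityError:
        return False  # another replica (or an earlier delivery) got there first


def handle(conn, dedup_key, payload, process):
    """Process payload at most once per dedup_key; return True if processed."""
    if not try_claim(conn, dedup_key):
        return False  # duplicate: skip the work, but still ack the message
    try:
        process(payload)
    except Exception:
        # Release the claim so a redelivery can retry the failed work.
        with conn:
            conn.execute(
                "DELETE FROM processed WHERE dedup_key = ?", (dedup_key,)
            )
        raise
    return True
```

In this sketch the claim is kept after successful processing rather than released, so a later redelivery of the same message is also skipped. A replica that loses the race should still acknowledge the message so Pub/Sub stops redelivering it.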

Are you using a push or pull subscription?
