Deduplication when the subscriber application is not idempotent and we can't change it

If the subscriber application is legacy and not idempotent, can we handle deduplication (in case of redelivery, an expired acknowledgement deadline, etc.) somehow without changing the subscriber application?
Can we still use Pub/Sub in such a scenario?

I went through https://cloud.google.com/pubsub/docs/exactly-once-delivery, but redelivery is also a problem for the subscriber application.

1 ACCEPTED SOLUTION

Correction 4/3/2024:

Google Cloud Pub/Sub provides a few features that can help with handling message deduplication and redelivery, even when dealing with a legacy, non-idempotent subscriber application.

  1. Message Deduplication: Pub/Sub does not deduplicate published messages. Ordering keys do not eliminate duplicates either; their purpose is to ensure that messages sharing the same key are delivered in order. Pub/Sub does, however, offer a feature known as "exactly-once delivery" (https://cloud.google.com/pubsub/docs/exactly-once-delivery). When it is enabled on a pull subscription, a message is not redelivered while it is outstanding (that is, until its acknowledgement deadline expires) or after it has been successfully acknowledged, which makes acknowledgements and delivery more reliable. It does not remove duplicates created on the publishing side: the same payload published twice receives two different message IDs and is delivered twice.

    For effectively removing duplicate messages, as you've correctly identified, using Dataflow is the recommended approach. I've also taken the step to request an update from the author on the Google Community answer to reflect this clarification.

  2. Adjusting the Acknowledgement Deadline: One way to reduce redelivery is to adjust the acknowledgement deadline. The subscription-level deadline can be set between 10 and 600 seconds, and if your subscriber application needs more time to process a particular message, the deadline for that message can be extended as well. As long as the message is acknowledged before the extended deadline expires, Pub/Sub has no reason to redeliver it.

  3. Dead Letter Topics: If delivery of a message has been attempted more times than the maximum you configure (between 5 and 100), Pub/Sub can forward that message to a dead letter topic. This keeps a message that your subscriber application cannot process from being redelivered indefinitely, and it helps you identify such problematic messages.
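
As a rough illustration of how the three settings above fit together on one subscription, here is a minimal sketch that builds a CreateSubscription request as a plain dictionary. The project, topic, and subscription IDs are placeholders, the values 600 and 5 are only example choices, and `build_subscription_request` and `create_subscription` are hypothetical helper names, not part of any library. Sending the request assumes the `google-cloud-pubsub` Python client is installed and credentials are configured.

```python
from typing import Any, Dict


def build_subscription_request(
    project_id: str,
    topic_id: str,
    subscription_id: str,
    dead_letter_topic_id: str,
    ack_deadline_seconds: int = 600,
    max_delivery_attempts: int = 5,
) -> Dict[str, Any]:
    """Build a CreateSubscription request as a plain dict (all IDs are placeholders)."""
    return {
        "name": f"projects/{project_id}/subscriptions/{subscription_id}",
        "topic": f"projects/{project_id}/topics/{topic_id}",
        # Subscription-level acknowledgement deadline, in seconds.
        "ack_deadline_seconds": ack_deadline_seconds,
        # Exactly-once delivery; applies to pull subscriptions.
        "enable_exactly_once_delivery": True,
        # After this many delivery attempts, forward to the dead letter topic.
        "dead_letter_policy": {
            "dead_letter_topic": f"projects/{project_id}/topics/{dead_letter_topic_id}",
            "max_delivery_attempts": max_delivery_attempts,
        },
    }


def create_subscription(request: Dict[str, Any]) -> None:
    """Send the request; needs google-cloud-pubsub installed and credentials set up."""
    # Imported lazily so the request builder above works without the library.
    from google.cloud import pubsub_v1

    with pubsub_v1.SubscriberClient() as subscriber:
        subscriber.create_subscription(request=request)
```

A call such as `create_subscription(build_subscription_request("my-project", "orders", "orders-legacy", "orders-dead-letter"))` would then create the subscription, assuming the topics already exist and the Pub/Sub service account has the permissions that dead lettering requires.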

However, these methods still require the subscriber application to acknowledge the message once it has been processed. If the subscriber application is not capable of acknowledging messages, or if it cannot handle redelivery of messages at all, then using Pub/Sub may be more challenging.

In such cases, you might consider adding a middleware layer between Pub/Sub and the subscriber application. This middleware could handle the acknowledgement and redelivery logic. This would allow the legacy subscriber application to operate as it currently does, while the middleware layer handles the complexities of dealing with message deduplication and redelivery.

Keep in mind that this approach would require development and maintenance of this middleware layer, which could increase complexity. It's also important to ensure that this layer is designed and implemented to handle potential failure scenarios, to avoid data loss or duplication.
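
To make the middleware idea more concrete, here is a minimal sketch of a deduplicating wrapper around the legacy handler. It is an illustration under several assumptions, not a production design: `DedupMiddleware` and `legacy_handler` are hypothetical names, the record of seen message IDs lives in memory (a real deployment would need a durable store shared by all middleware instances), and a handler that raises is assumed to have had no side effects. It only relies on the message object exposing `message_id`, `data`, `ack()` and `nack()`, as the messages handed to a `google-cloud-pubsub` subscriber callback do.

```python
import threading
from typing import Callable

IN_FLIGHT = "in_flight"
DONE = "done"


class DedupMiddleware:
    """Forwards each message ID to the legacy handler at most once."""

    def __init__(self, legacy_handler: Callable[[bytes], None]) -> None:
        self._legacy_handler = legacy_handler
        # Assumption: in-memory state; it is lost when the process restarts.
        self._state = {}  # message_id -> IN_FLIGHT or DONE
        self._lock = threading.Lock()

    def __call__(self, message) -> None:
        with self._lock:
            status = self._state.get(message.message_id)
            if status is None:
                # Record the ID before forwarding, so a redelivery that
                # arrives during processing is not forwarded a second time.
                self._state[message.message_id] = IN_FLIGHT
        if status == DONE:
            message.ack()   # already processed: acknowledge and drop
            return
        if status == IN_FLIGHT:
            message.nack()  # still being processed: let Pub/Sub retry later
            return
        try:
            self._legacy_handler(message.data)
        except Exception:
            # Assumption: a failed call left no side effects, so a retry is safe.
            with self._lock:
                del self._state[message.message_id]
            message.nack()
            return
        with self._lock:
            self._state[message.message_id] = DONE
        message.ack()
```

An instance can be passed as the `callback` argument of `SubscriberClient.subscribe(subscription_path, callback=...)`. Recording the ID before calling the legacy application leans towards at-most-once processing, which is usually the safer side for a non-idempotent consumer. Keying on `message_id` only catches redeliveries of the same Pub/Sub message; duplicates created by the publisher carry different message IDs and would need a business-level key instead.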

Finally, it's worth noting that while Pub/Sub provides many features for handling message delivery and redelivery, it may not be the best fit for every use case. Depending on the specific requirements of your legacy application, other messaging or event-driven architectures may be more appropriate. 


11 REPLIES