
How to identify duplicate OBJECT_FINALIZE events in Cloud Storage Pub/Sub notifications?

Is there an example or a recommended way to deduplicate messages that come from Cloud Storage Pub/Sub notifications? I'm specifically interested in identifying duplicates of the OBJECT_FINALIZE event type (https://cloud.google.com/storage/docs/pubsub-notifications#events), which is sent when a new object is created.

I expect the primary key to be a combination of fields such as (id, generation), but I'm not sure yet. I found some useful information in the blog post "Handling duplicate data in streaming pipelines using Dataflow and Pub/Sub", but since I'm not using Dataflow or BigQuery, I need to implement deduplication myself. The two images from that blog post, attached below, suggest the same: a custom implementation is needed.

pramodbiligiri_0-1660228359485.png

pramodbiligiri_1-1660229495012.png
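To make the question concrete, here is a minimal sketch of the kind of custom deduplication described above. It assumes the key can be built from the `bucketId`, `objectId`, `objectGeneration`, and `eventType` message attributes that Cloud Storage attaches to each notification. The names `dedup_key`, `SeenEvents`, and `should_process` are hypothetical helpers, not part of any Google library, and the in-memory dictionary only works for a single subscriber process; with several subscribers, a shared store with an atomic "insert if absent" operation would be needed instead.

```python
import time
from typing import Dict, Mapping, Optional, Tuple

# (bucket, object name, generation, event type); assumed to identify one event.
DedupKey = Tuple[str, str, str, str]


def dedup_key(attributes: Mapping[str, str]) -> DedupKey:
    """Build a candidate dedup key from the notification's message attributes."""
    return (
        attributes["bucketId"],
        attributes["objectId"],
        attributes["objectGeneration"],
        attributes["eventType"],
    )


class SeenEvents:
    """In-memory record of keys already handled, forgotten after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 3600.0) -> None:
        self._ttl = ttl_seconds
        self._seen: Dict[DedupKey, float] = {}

    def first_time(self, key: DedupKey, now: Optional[float] = None) -> bool:
        """Return True and remember the key if it has not been seen recently."""
        now = time.monotonic() if now is None else now
        # Drop entries older than the TTL so the dictionary does not grow forever.
        expired = [k for k, t in self._seen.items() if now - t > self._ttl]
        for k in expired:
            del self._seen[k]
        if key in self._seen:
            return False
        self._seen[key] = now
        return True


def should_process(
    attributes: Mapping[str, str],
    seen: SeenEvents,
    now: Optional[float] = None,
) -> bool:
    """True only for the first OBJECT_FINALIZE notification with a given key."""
    if attributes.get("eventType") != "OBJECT_FINALIZE":
        return False
    return seen.first_time(dedup_key(attributes), now)
```

In a subscriber callback, `should_process(message.attributes, seen)` would be called before doing any work, and the message would be acknowledged either way so that a detected duplicate is not redelivered. The TTL is a guess and would have to cover the longest gap over which redelivery can happen.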

 

2 REPLIES