
GCP Datastream Postgres to BigQuery CDC behaviour

I am interested in using Datastream to set up CDC from Postgres to BigQuery. I assumed that any update to data in Postgres would create a new row in BigQuery rather than update the existing record.
When the preview for this feature went live, I set up a stream and am almost certain the behavior matched what I described above. I recently set up another stream, and records in BigQuery now appear to be overwritten rather than a new record being added.

Is this the expected behavior?
Was the previous behavior a bug?
Is there a way to preserve historical records in BigQuery with Datastream?

@etaim 

2 REPLIES

Can you provide the DDL you used when the data was overwritten? Did you use the same DDL in both your previous and your current Datastream setup?

The current intended behavior is *replication*, meaning that the destination matches the source (and changes overwrite the existing rows).

In the future we plan to also support a *change stream* mode which will write each change as a new row in BigQuery, but I can't share an ETA for this feature at this time.
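
To make the difference between the two modes concrete, here is a minimal sketch in plain Python. It does not use Datastream or BigQuery at all; `ChangeEvent`, `apply_replication`, and `apply_change_stream` are hypothetical names invented for illustration, and the `seq` column is an assumed stand-in for whatever ordering metadata a real change stream would carry.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical CDC event: an operation on one source row, identified by primary key."""
    op: str                               # "INSERT", "UPDATE", or "DELETE"
    key: int                              # primary key of the source row
    row: Optional[Dict[str, Any]] = None  # new row image; None for DELETE


def apply_replication(events: List[ChangeEvent]) -> Dict[int, Dict[str, Any]]:
    """Replication semantics: the destination mirrors the source, so each
    change overwrites (or removes) the existing row for that key."""
    table: Dict[int, Dict[str, Any]] = {}
    for event in events:
        if event.op == "DELETE":
            table.pop(event.key, None)
        else:
            table[event.key] = event.row
    return table


def apply_change_stream(events: List[ChangeEvent]) -> List[Dict[str, Any]]:
    """Change-stream semantics: every change is appended as a new row,
    so the full history is preserved."""
    history: List[Dict[str, Any]] = []
    for seq, event in enumerate(events, start=1):
        history.append({"seq": seq, "op": event.op, "key": event.key, "row": event.row})
    return history


events = [
    ChangeEvent("INSERT", 1, {"name": "alice", "plan": "free"}),
    ChangeEvent("UPDATE", 1, {"name": "alice", "plan": "pro"}),
    ChangeEvent("INSERT", 2, {"name": "bob", "plan": "free"}),
    ChangeEvent("DELETE", 2),
]

replicated = apply_replication(events)   # only the latest state per key survives
history = apply_change_stream(events)    # one row per change, nothing overwritten
```

With the same four events, `replicated` holds a single row (key 1 with its latest values), while `history` holds four rows, one per change. The first is the overwrite behavior described in the question; the second is the history-preserving behavior the question asks for.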