Get hands-on experience with 20+ free Google Cloud products and $300 in free credit for new customers.

Datastream for Bigquery

I am reaching out for help with two issues we are facing with replicating a source postgresql schema to google bigquery as the destination. 
  1. datatype differences are seen between postgres and bigquery but unfortunately we do not want to modify the source table columns datatypes to match or map into biquery as described here https://cloud.google.com/datastream/docs/destination-bigquery#map_data_types. see also the error message screenshots below. how can we replicate these tables to bigquery without modifying the datatype as this is not an option for us.
     
  2. Primary key issues for tables. Datastream appears to throw invalid errors indicating the tables have more primary keys than they actually exist and for that reason it can't replicate those tables.  image (5).pngimage (4).png
0 1 476
1 REPLY 1

Hi @charlesfru,

Welcome back to Google Cloud Community.

You may try to perform this check to see the issue:

For the first issue. One choice would be to construct a staging schema in BigQuery with the same schema structure as the source schema but with data types that match those in BigQuery if you are unable to change the data types in the source schema to match those in BigQuery. The data from the source schema might then be ingested into the staging schema using a program like Apache NiFi or Apache Beam, changing the data types as necessary. You could utilize Datastream to replicate the data into the BigQuery final destination schema once it is in the staging schema.

For the second issue. It's possible that Datastream is identifying your source schema's primary keys erroneously. You may verify that the primary keys that Datastream expects are present in the source schema's tables. 


Here are some references that might help you:
https://cloud.google.com/datastream/docs/sources

https://cloud.google.com/datastream/docs/sources-postgresql