Error starting DataStream

Hi DataStream Team,

We are trying to use Datastream to replicate data from a PostgreSQL RDS instance on AWS to BigQuery on GCP. I've set up the stream with a PostgreSQL database as the source profile. All the regional IPs that the stream creation wizard showed were added to the inbound rules on the bastion SSH tunnel server. The stream and the source and destination connections were all created in the same region, northamerica-northeast2 (Toronto). I validated that:

1) Connectivity tests pass.
2) The validation tests at the end of the stream creation wizard also pass (100% green).

When I try to start the stream, I get the error "An unknown error occurred. Please try again. If the error persists, contact Google support."

Questions:

  1. Where can I see detailed error logs when a Datastream stream starts?
  2. Do I need to configure Error Reporting for Datastream? If so, where can I find a document outlining how to set it up?
  3. Since we are early adopters of GCP and Datastream, I have yet to convince my boss to pay for a support level above Basic, so I am unable to submit a ticket. Can someone help me troubleshoot the issue?

Thank you.

Depending on the error, information is provided in the Streams or Stream details pages of the Datastream UI. You can also use Datastream's APIs to retrieve information about the error. Here's the documentation for that.
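As a rough sketch of the API route mentioned above: to my understanding, the Datastream v1 REST API returns a stream resource with a `state` field and an `errors` list whose entries carry `reason` and `message`. The helper names, the project and stream IDs, and the sample response below are hypothetical, and the field names are assumptions that should be checked against the API reference.

```python
import json

# Assumption: Datastream v1 REST endpoint and resource layout.
API_ROOT = "https://datastream.googleapis.com/v1"


def stream_url(project, location, stream_id):
    """Build the URL for a GET request on a single stream."""
    return (
        f"{API_ROOT}/projects/{project}/locations/{location}"
        f"/streams/{stream_id}"
    )


def summarize_stream_errors(stream):
    """Return (state, messages) from a stream resource parsed into a dict."""
    state = stream.get("state", "STATE_UNSPECIFIED")
    messages = []
    for err in stream.get("errors", []):
        # Assumed error fields: "reason" and "message".
        reason = err.get("reason", "UNKNOWN")
        message = err.get("message", "")
        messages.append(f"{reason}: {message}")
    return state, messages


# Hypothetical response body, for illustration only (not real output).
sample = json.loads("""
{
  "name": "projects/my-project/locations/northamerica-northeast2/streams/my-stream",
  "state": "FAILED",
  "errors": [
    {"reason": "EXAMPLE_REASON", "message": "Example message text"}
  ]
}
""")

state, messages = summarize_stream_errors(sample)
print(state)
for line in messages:
    print(line)
```

To fetch the real resource you would send an authenticated GET request to the URL from `stream_url(...)` with an OAuth 2.0 bearer token (for example, one printed by `gcloud auth print-access-token`) and pass the parsed JSON to `summarize_stream_errors`.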

Hi @Joevanie ,

Thank you for the reply. I do not see any further detail about the error on the web page. All it says is: "An unknown error occurred. Please try again. If the error persists, contact Google support." I tried again. Same issue. I contacted Google support ...

The document you provided helps to diagnose connection issues, but it does not explain how to troubleshoot a stream that fails to start or a backfill that fails.

Hence, since I am likely not the first (and not the last 🙄) person who will have to deal with the same issue, I will post below the sequence of steps that I added to our internal company documentation on how to deal with this issue when setting up Datastream and initiating a backfill operation.

Adding new objects manually and resolving backfill failures during the bulk object selection process

If you add multiple tables for replication at once, the Datastream backfill process may fail. You can see this by examining the list of streams currently running. If it happens, you may need to add objects to replication manually, one by one. To do that, follow the steps below:

  1. Click on the DataStream name to go to Stream details.
  2. Press EDIT and select Edit source configuration option. 
  3. Scroll down and expand the “Select objects to exclude” section. 
  4. Press Ctrl + F to open the browser search and enter the name of the table you want to include. You should see two matches on the page.
  5. Go to the search result found under the "Select objects to include" section and select the table to be included in replication.
  6. Go to the “Select objects to exclude” section, find the table in the excluded list and unselect it from there. 
  7. Press the SAVE button.
  8. Observe any error messages and troubleshoot the backfill operation failure if any.

Note: the “Select objects to exclude” and "Select objects to include" sections of the Datastream configuration interface do not seem to be connected to each other. In other words, selecting a table in the "include" section does not automatically remove it from the "exclude" section. It took me a good 4 hours to figure that out, so I hope it helps you avoid wasting the same time 😉
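The same "include it and also un-exclude it" rule can be sketched for anyone editing the source configuration as JSON rather than in the UI. This assumes the PostgreSQL source config has `includeObjects` and `excludeObjects` trees made of `postgresqlSchemas` entries with `schema` and `postgresqlTables[].table` fields, which is my reading of the Datastream v1 API and should be verified. The function name and the table names are hypothetical.

```python
import copy


def include_table(source_config, schema, table):
    """Return a copy of the config with schema.table included and not excluded."""
    cfg = copy.deepcopy(source_config)

    # Step 1: add the table to the include list if it is not there yet.
    include = cfg.setdefault("includeObjects", {}).setdefault(
        "postgresqlSchemas", []
    )
    entry = next((s for s in include if s.get("schema") == schema), None)
    if entry is None:
        include.append({"schema": schema, "postgresqlTables": [{"table": table}]})
    elif entry.get("postgresqlTables"):
        if not any(t.get("table") == table for t in entry["postgresqlTables"]):
            entry["postgresqlTables"].append({"table": table})
    # Assumption: a schema entry without tables already covers the whole schema.

    # Step 2: remove the same table from the exclude list.
    if "excludeObjects" in cfg:
        kept = []
        for s in cfg["excludeObjects"].get("postgresqlSchemas", []):
            if s.get("schema") == schema and s.get("postgresqlTables"):
                s["postgresqlTables"] = [
                    t for t in s["postgresqlTables"] if t.get("table") != table
                ]
                if not s["postgresqlTables"]:
                    # Drop the now-empty entry so it cannot be read as
                    # "exclude the whole schema" (assumed semantics).
                    continue
            kept.append(s)
        cfg["excludeObjects"]["postgresqlSchemas"] = kept
    return cfg


# Hypothetical configuration, for illustration only.
config = {
    "includeObjects": {"postgresqlSchemas": [
        {"schema": "public", "postgresqlTables": [{"table": "orders"}]}
    ]},
    "excludeObjects": {"postgresqlSchemas": [
        {"schema": "public",
         "postgresqlTables": [{"table": "customers"}, {"table": "audit_log"}]}
    ]},
}

updated = include_table(config, "public", "customers")
print(updated)
```

The input dictionary is left unchanged, so the result can be compared with the original before anything is sent to the API.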

Initiating a failed backfill manually

To resolve a failed backfill, you can try to initiate a manual backfill operation for all tables with a failed backfill status by following the steps below:

  1. Click on the failed stream name to go to stream details.
  2. Note the number of objects with a failed backfill operation and click the OBJECTS tab.
  3. Select objects with Failed backfill status and click the INITIATE BACKFILL button.
  4. Confirm backfill initiation for selected objects in the popup dialog. 
  5. If the backfill succeeds, the value in the Backfill status column changes to Completed.
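The steps above are the UI route. A minimal sketch of the equivalent API route follows. It assumes that listing a stream's objects returns a `streamObjects` array whose entries have a full resource `name` and a `backfillJob.state`, and that a backfill is started with a POST to `<object name>:startBackfillJob`. These names reflect my understanding of the Datastream v1 REST API and should be verified; the helper names and sample data are hypothetical.

```python
# Assumption: Datastream v1 REST endpoint.
API_ROOT = "https://datastream.googleapis.com/v1"


def failed_backfill_objects(list_response):
    """Pick the stream objects whose backfill job is in the FAILED state."""
    failed = []
    for obj in list_response.get("streamObjects", []):
        job = obj.get("backfillJob", {})
        if job.get("state") == "FAILED":
            failed.append(obj)
    return failed


def start_backfill_url(object_name):
    """Build the POST URL that would start a backfill for one stream object."""
    return f"{API_ROOT}/{object_name}:startBackfillJob"


# Hypothetical list response, for illustration only (not real output).
BASE = "projects/my-project/locations/northamerica-northeast2/streams/my-stream"
sample = {
    "streamObjects": [
        {"name": f"{BASE}/objects/obj-1", "displayName": "public.orders",
         "backfillJob": {"state": "COMPLETED"}},
        {"name": f"{BASE}/objects/obj-2", "displayName": "public.customers",
         "backfillJob": {"state": "FAILED"}},
    ]
}

urls = [start_backfill_url(o["name"]) for o in failed_backfill_objects(sample)]
for url in urls:
    print(url)
```

Each printed URL would then be called with an authenticated POST request. The sketch only builds the requests and does not send them.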

LNK if you want to connect, so that I can explain in greater detail and share screenshots (I am unable to do that in this chat because I am not allowed to upload images).

Hope it helps future Datastream adopters 😉