Hello, I've been using Datastream to stream our backend Cloud SQL PostgreSQL database to BigQuery.
In two weeks of usage, the Datastream stream has failed permanently twice with the error:
failed to read from the PostgreSQL replication slot because it is already being used by a different process
Looking at the Postgres instance logs, I can correlate the following log entry with the permanent Datastream failure:
{
  "textPayload": "2022-10-17 14:43:12.896 UTC [219890]: [1-1] db=xxx,user=datatstream_test ERROR: replication slot \"datastream_replication_slot_test\" is active for PID 219872",
  "insertId": "...",
  "resource": {
    "type": "cloudsql_database",
    "labels": {
      "database_id": "xxx-yyy-zzz:xxx-zzz-instance",
      "project_id": "xxx-yyy-zzz",
      "region": "us-central"
    }
  },
  "timestamp": "2022-10-17T14:43:12.897056Z",
  "severity": "ERROR",
  "labels": {
    "INSTANCE_UID": "...",
    "LOG_BUCKET_NUM": "33"
  },
  "logName": "projects/xxx-yyy-zzz/logs/cloudsql.googleapis.com%2Fpostgres.log",
  "receiveTimestamp": "2022-10-17T14:43:14.520407419Z"
}
The preceding logs show the replication slot being used by PID 219872 less than a minute earlier. Looking further back in the logs, this appears to be normal behaviour that causes no error, as long as the replication slot is acquired twice with at least a minute and a half of delay between the two connections.
But twice that wasn't the case, and it made the stream fail permanently.
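For reference, when the failure occurs, the slot state and the backend holding it can be inspected directly on the instance with standard PostgreSQL catalog views (the slot name below is the one from my logs):

```sql
-- Is the Datastream slot currently held, and by which backend PID?
SELECT slot_name, active, active_pid
FROM pg_replication_slots
WHERE slot_name = 'datastream_replication_slot_test';

-- Cross-reference that PID against the current WAL sender sessions
SELECT pid, usename, application_name, state, backend_start
FROM pg_stat_replication;
```

Both times the failure happened, the error suggests the previous Datastream connection (the `active_pid` above) had not yet released the slot when the new connection tried to acquire it.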
Is there anything I can do to prevent this from happening, and thus make Datastream somewhat suitable for production even though it's in beta?
Or do you have any other technical solution to share that does the same thing effortlessly?
Thanks in advance.
@equals215 This could be a bug that needs to be reported.
Please visit
https://issuetracker.google.com/
and create a defect report in this component: [ Public Trackers > Cloud Platform > Storage and Databases > Datastream ]
cc: @etaim
The issue will be tracked there.