Get hands-on experience with 20+ free Google Cloud products and $300 in free credit for new customers.

Datastream has error: the replication slot replication_slot status is 'lost' (wal_status)

Hello, I have created a stream to migrate data from Cloud SQL Postgresql to BigQuery, and it has been working pretty well for a few months.

But today I got the error which says that "The replication slot replication_slot status is 'lost' (wal_status). Make sure that the slot is active and try again."

I checked replication status by using the below query

 

SELECT * FROM pg_replication_slots;

 

Screenshot 2023-09-18 at 12.07.09.pngI wonder how could I fix the error properly (e.g increase max_wal_size, and re-create a new

replication slot?) Thanks in advanced!
Solved Solved
0 2 4,934
1 ACCEPTED SOLUTION

To fix the error "The replication slot replication_slot status is 'lost' (wal_status). Make sure that the slot is active and try again." when migrating data from Cloud SQL Postgresql to BigQuery, you can try the following:

  1. Increase the max_wal_sizeparameter. This parameter controls the amount of disk space that the WAL can consume before old WAL files are removed. If the max_wal_size parameter is too low, PostgreSQL may delete WAL data before it has been streamed to BigQuery, which can cause the replication slot to become lost. To increase the max_wal_sizeparameter, you can edit the configuration file for your Cloud SQL instance or you can use the following SQL command:
ALTER SYSTEM SET max_wal_size = '10GB';

Note: The ALTER SYSTEM SET max_wal_size command will only take effect after a server restart.

  1. Drop the old replication slot and recreate a new one. To do this,you can use the following SQL commands:
DROP REPLICATION SLOT replication_slot;
CREATE REPLICATION SLOT replication_slot LOGICAL;

Note: Make sure to replace replication_slot with the actual name of the replication slot you want to create.

  1. Check the Cloud SQL logs for any errors. You can access the Cloud SQL logs in the Google Cloud Console. To do this, go to the Cloud SQL page and select your instance. Then, click on the Logs tab.

  2. Monitor the system closely to identify any patterns or underlying problems. If the replication slot is frequently getting lost or becoming inactive, it may indicate a deeper issue with the setup.

Additional tips:

  • Make sure that the wal_levelparameter is set to logical. This is required for logical replication and replication slots to work.
  • If you are using a replication slot to migrate data to BigQuery, make sure that the BigQuery stream is configured to use the correct replication slot.

View solution in original post

2 REPLIES 2

To fix the error "The replication slot replication_slot status is 'lost' (wal_status). Make sure that the slot is active and try again." when migrating data from Cloud SQL Postgresql to BigQuery, you can try the following:

  1. Increase the max_wal_sizeparameter. This parameter controls the amount of disk space that the WAL can consume before old WAL files are removed. If the max_wal_size parameter is too low, PostgreSQL may delete WAL data before it has been streamed to BigQuery, which can cause the replication slot to become lost. To increase the max_wal_sizeparameter, you can edit the configuration file for your Cloud SQL instance or you can use the following SQL command:
ALTER SYSTEM SET max_wal_size = '10GB';

Note: The ALTER SYSTEM SET max_wal_size command will only take effect after a server restart.

  1. Drop the old replication slot and recreate a new one. To do this,you can use the following SQL commands:
DROP REPLICATION SLOT replication_slot;
CREATE REPLICATION SLOT replication_slot LOGICAL;

Note: Make sure to replace replication_slot with the actual name of the replication slot you want to create.

  1. Check the Cloud SQL logs for any errors. You can access the Cloud SQL logs in the Google Cloud Console. To do this, go to the Cloud SQL page and select your instance. Then, click on the Logs tab.

  2. Monitor the system closely to identify any patterns or underlying problems. If the replication slot is frequently getting lost or becoming inactive, it may indicate a deeper issue with the setup.

Additional tips:

  • Make sure that the wal_levelparameter is set to logical. This is required for logical replication and replication slots to work.
  • If you are using a replication slot to migrate data to BigQuery, make sure that the BigQuery stream is configured to use the correct replication slot.

Hi @ms4446 Thank you for your suggestion!!, I think it works