Get hands-on experience with 20+ free Google Cloud products and $300 in free credit for new customers.

WAL archiving events and Archived_wal_log size are not adding up

We recently saw a huge spike in our `Archived_wal_log` size and to debug the reason we created a clone of our Cloud SQL instance and replicated the recent production transaction. (mostly adding and deleting new rows)

PS: we have point in recovery enabled and set for 7 days so huge `Archived_wal_log` is expected. we are just trying to narrow down the reason for spike. 

In the dashboard we can see that the WAL archiving is triggering as per our expectation:

shekharinaccord_0-1712178308855.png

But the `Archived_wal_log` size is not changing in the `Disk storage by Type` chart

shekharinaccord_1-1712178484035.png

Our expectation was that every successful WAL archiving will add 16 MB of data towards `Archived_wal_log` size and we should see a huge bump in this graph.

Questions:

  • Is that expectation incorrect?
  • what is the relationship between WAL archiving and `Archived_wal_log` ?
  • Is there a way to double check the `Archived_wal_log` size using gcloud CLI or some other place in GCP console ?

Thanks !

 

Solved Solved
0 2 1,381
1 ACCEPTED SOLUTION

Yes, Cloud SQL Postgres instances created after January 9, 2023, indeed use Cloud Storage for WAL

The decision to move Cloud SQL instance's WAL storage to Cloud Storage was made several reasons:

  • Cost Efficiency: As you noted, using Cloud Storage for WAL does not incur additional costs for the storage duration limits set by your Cloud SQL plan (up to 35 days for Enterprise Plus and 7 days for Enterprise editions).
  • Scalability and Performance: Cloud Storage offers better scalability and potentially improved I/O performance over traditional disk-based storage, which can be crucial for high transaction environments.
  • Reliability: Storing WAL in Cloud Storage can enhance data durability and availability, reducing risks associated with local disk failures.

Before you proceed with the migration, here are some steps to consider:

  1. Backup and Testing: Ensure that comprehensive backups are in place before starting the migration. Testing the migration in a staging environment would also be advisable to preempt any potential issues.

  2. Monitoring Post-Migration: After migrating to Cloud Storage, closely monitor the performance and storage metrics to ensure everything is functioning as expected without unforeseen costs or performance degradations.

View solution in original post

2 REPLIES 2

We have found one clue that may explain this discrepancy.

As per this document: https://cloud.google.com/sql/docs/postgres/backup-recovery/pitr
All instances created after January 9, 2023 store WAL in Cloud Storage and not disk.

This explains why our production instance show the increase in WAL size on disk and the clones we created this week doesn't show any change although WAL archiving is happening.

To confirm we ran:

`gcloud sql instances describe INSTANCE_NAME` 

and for production it shows

transactionalLogStorageState: DISK

and for the clone

transactionalLogStorageState: CLOUD_STORAGE

Also, it is good to know that cloud storage doesn't cost anything so we are planning move our production log storage to cloud storage as well.

```

For instances having write-ahead logs stored in Cloud Storage, the logs are stored in the same region as the primary instance. This log storage (up to 35 days for Cloud SQL Enterprise Plus edition and seven days for Cloud SQL Enterprise edition, the maximum length for PITR) generates no additional cost per instance.

```

would love a confirmation from any expert here though 🙂 thanks.

we'll update this thread once we move the production logs to cloud storage

Yes, Cloud SQL Postgres instances created after January 9, 2023, indeed use Cloud Storage for WAL

The decision to move Cloud SQL instance's WAL storage to Cloud Storage was made several reasons:

  • Cost Efficiency: As you noted, using Cloud Storage for WAL does not incur additional costs for the storage duration limits set by your Cloud SQL plan (up to 35 days for Enterprise Plus and 7 days for Enterprise editions).
  • Scalability and Performance: Cloud Storage offers better scalability and potentially improved I/O performance over traditional disk-based storage, which can be crucial for high transaction environments.
  • Reliability: Storing WAL in Cloud Storage can enhance data durability and availability, reducing risks associated with local disk failures.

Before you proceed with the migration, here are some steps to consider:

  1. Backup and Testing: Ensure that comprehensive backups are in place before starting the migration. Testing the migration in a staging environment would also be advisable to preempt any potential issues.

  2. Monitoring Post-Migration: After migrating to Cloud Storage, closely monitor the performance and storage metrics to ensure everything is functioning as expected without unforeseen costs or performance degradations.