Get hands-on experience with 20+ free Google Cloud products and $300 in free credit for new customers.

Improve speed of CloudSQL for postgres to AlloyDB migration

I'm migrating a 3.5 TB CloudSQL for Postgres server to AlloyDB.

Everything looks ok, but it's copying at about ~2 GB/minute. That means the whole migration takes ~29 hours. Both servers are on the same region and I'm using maximum parallelism.

I'm migrating a clone, so there's no other activity on the source other than the migration.

Do I have any levers to increase the speed of the migration?

Thanks!

2 REPLIES 2

Migrating a 3.5 TB Cloud SQL Postgres instance to AlloyDB can be complex, but by employing the following strategies, you can enhance the speed and efficiency of the migration process:


1. Optimize Network Throughput and Instance Resources

  • Increase Network Bandwidth:

    • Ensure high-bandwidth network connections between Cloud SQL, DMS, and AlloyDB. Use VPC Flow Logs to monitor and optimize network throughput.

  • Scale Up Source Instance:

    • Temporarily increase the vCPU and memory resources of your Cloud SQL Postgres instance to improve performance during the migration. Adjust instance parameters such as max_connections, maintenance_work_mem, and work_mem. If feasible, switch to SSD storage for faster data extraction.

  • Scale Up AlloyDB Instance:

    • Make sure your AlloyDB instance has sufficient resources (vCPUs, memory) to handle incoming data. Adjust settings like max_parallel_workers and shared_buffers for optimal performance.


2. Fine-tune DMS Settings

  • Parallelism:

    • Verify that DMS is utilizing the maximum parallelism supported by your instances and network capabilities.

  • Batch Size:

    • Experiment with max_batch_size to determine the optimal setting for data transfer without overloading the network or resources.

  • Connection Settings:

    • Optimize connection pool size and network timeouts. Adjust settings such as max_connections and idle_in_transaction_session_timeout to ensure a stable, low-latency connection between Cloud SQL and DMS.


3. Schema and Data Optimization

  • Index Optimization:

    • Optimize indexes on your Cloud SQL Postgres instance before migration. Consider using tools like pg_repack to reorganize tables and reduce fragmentation.

  • Data Pruning:

    • Remove unnecessary data from the Cloud SQL instance to reduce the amount of data to be transferred.

  • Table Partitioning:

    • Partition large tables if applicable, to facilitate parallel data loading in AlloyDB and enhance migration performance.


4. Alternative Migration Strategies

  • pg_dump/pg_restore:

    • Use pg_dump with parallelization (-j option) and compression (-Z option) to create a logical backup. Use pg_restore to efficiently load data into AlloyDB.

  • Custom Scripts:

    • Develop custom scripts using tools like COPY for bulk data loading. This can provide more control over the migration process and potentially speed it up.


5. Additional Tips

  • Monitoring:

    • Use Cloud Monitoring to track resource utilization (CPU, memory, network) on both the source and destination instances. Adjust instance sizes or configurations as needed based on monitoring insights.

  • DMS Logs:

    • Regularly analyze DMS logs for insights into potential issues or bottlenecks. This can help identify and address any migration slowdowns.

  • Google Cloud Support:

    • Reach out to Google Cloud Support for expert guidance and troubleshooting assistance. They can provide specific recommendations based on your environment.

Thank you for the detailed response! you mentioned touching the `max_batch_size` parameter on DMS. Where can that be done? I've been looking at the CLI documentation for the job creation and for the connection profile creation but none of them has that setting.

I've created the job with the Maximum level of parallelization. I've also increased the number of max replication slots and number of parallel workers on the source, but there are only 2 replication slots on the source. The CPU and memory are pretty much free both at the source and the destination.

Thanks!