
Speed up a large CSV data import from a bucket to MySQL on Cloud SQL

Hi,

I'm loading a large CSV file from a Cloud Storage bucket into Cloud SQL (a MySQL 8 instance). The file contains 40 million records and is 10 GB in size. I used the `gcloud sql import` command to load the data, but the import took more than 3 hours. Are there any recommendations for speeding up or otherwise optimizing the import process?

Thank You

1 REPLY

Optimizing the `gcloud sql import` Command:

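1. Turn off Binary Logging (if enabled):
- Binary logging adds write overhead to every imported row. First Generation instances no longer exist, so this applies to current instances as well: on Cloud SQL for MySQL, binary logging is tied to point-in-time recovery and is required for read replicas. If you can do without both during the import, consider temporarily turning it off and re-enabling it afterwards. Refer to the [Cloud SQL documentation](https://cloud.google.com/sql/docs/mysql/import-export/import-export-csv) for instructions.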

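2. Switch to Asynchronous Writes (all instances):
- By default, InnoDB flushes its redo log to disk on every commit. Relaxing this behaviour (for example through the `innodb_flush_log_at_trx_commit` database flag, provided your instance supports it) can improve import performance, but transactions committed shortly before a crash may be lost. Remember to switch back after the import.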
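The sequence described above (relax durability, import, restore) can be sketched as a small Python helper that only builds and prints the `gcloud` commands rather than running them. This is a sketch under assumptions: the instance, bucket, database, and table names are hypothetical, and the use of `innodb_flush_log_at_trx_commit` is an assumption that should be checked against the list of flags your instance supports.

```python
import shlex


def import_csv_cmd(instance, uri, database, table):
    """Build the argument list for a `gcloud sql import csv` call."""
    return [
        "gcloud", "sql", "import", "csv", instance, uri,
        f"--database={database}", f"--table={table}",
        "--quiet",  # global gcloud flag: skip interactive confirmation
    ]


def set_flag_cmd(instance, flag, value):
    """Build a `gcloud sql instances patch` call that sets one database flag.

    Caution: --database-flags replaces the full set of flags on the
    instance, so include any flags you already rely on.
    """
    return [
        "gcloud", "sql", "instances", "patch", instance,
        f"--database-flags={flag}={value}",
    ]


def import_plan(instance, uri, database, table):
    """Return the commands in order: relax durability, import, restore."""
    flag = "innodb_flush_log_at_trx_commit"  # assumption: supported on your instance
    return [
        set_flag_cmd(instance, flag, 2),   # flush the redo log about once per second
        import_csv_cmd(instance, uri, database, table),
        set_flag_cmd(instance, flag, 1),   # back to flushing on every commit
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for cmd in import_plan("my-instance", "gs://my-bucket/data.csv", "mydb", "mytable"):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to review them, and to confirm that patching flags is acceptable for the instance, before anything is executed.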

Optimizing the Cloud SQL Instance:

1. Increase Instance Tier:
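- Temporarily move your Cloud SQL instance to a larger machine type with more vCPUs and memory, then scale back down once the import has finished. Changing the machine type typically restarts the instance, so plan for a short interruption. Disk throughput also tends to scale with provisioned disk size, so a larger disk can help as well.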

2. Ensure Sufficient Disk Space:
- Maintain at least 25% free disk space on your Cloud SQL instance to avoid autogrowth issues affecting performance.
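As a rough worked example of the 25% guideline, the sketch below estimates the smallest disk that still leaves a quarter of its capacity free after the import. The `overhead_factor` of 1.5 (table plus index size relative to the raw CSV) is purely an illustrative assumption; measure it on a sample of your own data.

```python
import math


def min_disk_gb(data_gb, free_fraction=0.25, overhead_factor=1.5):
    """Smallest whole-GB disk that keeps `free_fraction` free after the import.

    overhead_factor is an assumed ratio of on-disk size (rows + indexes)
    to raw CSV size; it is not a Cloud SQL constant.
    """
    if not 0 <= free_fraction < 1:
        raise ValueError("free_fraction must be in [0, 1)")
    used_gb = data_gb * overhead_factor
    # used / total <= 1 - free_fraction  =>  total >= used / (1 - free_fraction)
    return math.ceil(used_gb / (1.0 - free_fraction))


if __name__ == "__main__":
    print(min_disk_gb(10))  # 10 GB CSV from the question -> 20
```

Any data already stored on the instance has to be added on top of this figure.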

Alternative Approaches:

1. Cloud Dataflow:
- For very large datasets, consider using Cloud Dataflow, a managed service for data processing. It can handle large-scale data pipelines efficiently.

2. Database Migration Service:
- Explore the Database Migration Service if migrating from another database system. It's designed for efficient data movement between various databases.

Additional Tips:

1. Pre-process the CSV File:
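- Ensure the CSV file is in the correct format and sorted by the primary key (if applicable) before importing. InnoDB stores rows in primary-key order, so sorted input is appended sequentially instead of being inserted at random positions in the index, which can improve import efficiency.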

2. Monitor the Import:
- Use Cloud SQL logs or monitoring tools to track import progress and identify any bottlenecks.
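For the pre-processing tip, a quick way to confirm that a file is already in primary-key order is to stream through it once without loading it into memory. The sketch below assumes a headerless CSV whose first column is an integer primary key; the column position, the key type, and the file name in the usage note are assumptions for illustration.

```python
import csv
import io


def first_unsorted_row(lines, key_column=0, key_type=int):
    """Return the 1-based number of the first row whose key is lower than
    the previous row's key, or None if keys never decrease.

    `lines` is any iterable of text lines, such as an open file.
    Duplicate keys are not reported; only ordering is checked.
    """
    previous = None
    for row_number, row in enumerate(csv.reader(lines), start=1):
        if not row:  # skip blank lines
            continue
        key = key_type(row[key_column])
        if previous is not None and key < previous:
            return row_number
        previous = key
    return None


if __name__ == "__main__":
    sample = io.StringIO("1,alice\n3,bob\n2,carol\n")
    print(first_unsorted_row(sample))  # 3
```

For a real file, open it with `open("data.csv", newline="")` and pass the file object directly; the check reads one row at a time, so a 10 GB file does not need to fit in memory.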

By implementing these recommendations and tips, you should be able to optimize the import process and potentially reduce the time required to import your large CSV file into Cloud SQL.
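To judge whether a change actually helped, it is useful to have a baseline. The sketch below derives one from the figures in the question (40 million rows, 10 GB, 3 hours). Since the import took "more than 3hrs", the result is an upper bound on the original throughput, and 1 GB is assumed to be 1024 MB.

```python
def throughput(rows, size_gb, hours):
    """Return (rows per second, MB per second) for a completed import."""
    seconds = hours * 3600
    return rows / seconds, size_gb * 1024 / seconds


if __name__ == "__main__":
    rows_per_s, mb_per_s = throughput(40_000_000, 10, 3)
    print(f"{rows_per_s:.0f} rows/s, {mb_per_s:.2f} MB/s")  # 3704 rows/s, 0.95 MB/s
```

Running the same calculation after each tuning step gives a like-for-like comparison.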