Data Fusion doesn't have a direct mechanism to back up the entire instance itself. However, you can back up and restore the essential components of your Data Fusion pipelines using the following methods:
Backing Up Pipeline Metadata:
Export Pipelines: You can export your pipelines as JSON files from the Data Fusion UI or using the CLI. This will save the pipeline structure, configuration, and dependencies.
gcloud data-fusion instances export-instance-pipeline \
--location=<LOCATION> \
--instance=<INSTANCE_ID> \
--pipeline-name=<PIPELINE_NAME> \
--output=<OUTPUT_FILE_PATH>
Version Control: Store the exported JSON files in a version control system (like Git) for safekeeping and to track changes.
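If the gcloud subcommands above aren't available in your SDK version, another route is the CDAP REST endpoint that every Data Fusion instance exposes. The sketch below is an assumption-laden example, not an official procedure: the instance name, location, namespace, and pipeline name are placeholders, and it assumes the standard CDAP application API paths.

```shell
# Look up the instance's CDAP API endpoint (instance name and location are placeholders).
ENDPOINT=$(gcloud beta data-fusion instances describe my-instance \
  --location=us-central1 \
  --format="value(apiEndpoint)")
TOKEN=$(gcloud auth print-access-token)

# List the deployed pipelines (CDAP applications) in the default namespace.
curl -s -H "Authorization: Bearer ${TOKEN}" \
  "${ENDPOINT}/v3/namespaces/default/apps"

# Save one pipeline's JSON definition; "my-pipeline" is a placeholder name.
curl -s -H "Authorization: Bearer ${TOKEN}" \
  "${ENDPOINT}/v3/namespaces/default/apps/my-pipeline" \
  > my-pipeline.json
```

The saved JSON files can then go straight into version control as described above.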
Backing Up Pipeline Artifacts (Optional):
Plugins and Libraries: Keep copies of any custom plugin JAR files and library dependencies outside the instance (for example, in a Cloud Storage bucket or alongside your pipeline JSON in version control), so they can be redeployed during a restore.
Restoring from Backups:
Pipelines: Import the JSON files back into Data Fusion to recreate the pipelines.
gcloud data-fusion instances import-instance-pipeline \
--location=<LOCATION> \
--instance=<INSTANCE_ID> \
--input=<INPUT_FILE_PATH>
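As with export, the CDAP REST endpoint offers an alternative restore path if the gcloud subcommand is unavailable. This is a hedged sketch with placeholder names, assuming the saved file has the artifact-plus-config shape that the CDAP app-deployment API expects:

```shell
# Placeholder instance name and location.
ENDPOINT=$(gcloud beta data-fusion instances describe my-instance \
  --location=us-central1 \
  --format="value(apiEndpoint)")

# Redeploy (recreate) the pipeline from the saved JSON definition.
curl -s -X PUT \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  -d @my-pipeline.json \
  "${ENDPOINT}/v3/namespaces/default/apps/my-pipeline"
```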
Plugins: Reinstall any custom plugins from the backed-up JAR files.
Libraries: Ensure the necessary libraries are accessible to your Data Fusion instance.
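For reinstalling custom plugins, one possible route is the CDAP artifact API on the same endpoint, which accepts a plugin JAR upload. Again a sketch under assumptions: the artifact name, version, and the version range in Artifact-Extends are placeholders and must match your plugin and your instance's pipeline artifact version.

```shell
# Placeholder instance name and location.
ENDPOINT=$(gcloud beta data-fusion instances describe my-instance \
  --location=us-central1 \
  --format="value(apiEndpoint)")

# Upload a backed-up plugin JAR as a user artifact (placeholder names and versions).
curl -s -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Artifact-Version: 1.0.0" \
  -H "Artifact-Extends: system:cdap-data-pipeline[6.0.0,7.0.0)" \
  --data-binary @my-plugin-1.0.0.jar \
  "${ENDPOINT}/v3/namespaces/default/artifacts/my-plugin"
```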
Official Documentation:
Refer to the official documentation for more details on backing up and restoring instance data in Cloud Data Fusion:
Cloud Data Fusion Backup and Restore: https://cloud.google.com/data-fusion/docs/concepts/restore-instance-data
Hello,
What version of gcloud are you using? I've tried with the current latest (482.0.0), but data-fusion is a beta-only command group, and the "export-instance-pipeline" and "import-instance-pipeline" commands don't exist.
Is there a way to back up plugins and custom libraries? Or should I keep the JAR files on my PC?
Regards