Hi everyone,
I’m new to Dataform, and I noticed that the official documentation mainly focuses on data transformation using BigQuery. However, I’ve read that it’s possible to connect Dataform with other cloud data warehouses, but I haven’t found any relevant examples. I’d like to ask if Dataform can connect to other data warehouses or databases, such as Snowflake, MySQL, etc. If so, could you please provide some examples or references?
Thank you so much for your help!
Hello,
Thank you for your engagement on this issue. We haven’t heard back from you for some time now, so I'm going to close this issue, and it will no longer be monitored. However, if you have any new issues, please don’t hesitate to create a new one. We will be happy to assist you.
Regards,
Jai Ade
Dataform allows direct integration with supported warehouses such as Snowflake. To connect, you configure the dataform.json file with the necessary Snowflake account details, including the warehouse, database, and schema. Once connected, you can perform transformations within Snowflake using SQLX scripts.
However, Dataform does not offer direct support for MySQL. Instead, you can use external scripts or tools to bridge the gap. For example, you might pull data from MySQL into BigQuery or another supported warehouse, where Dataform can then perform the necessary transformations. Alternatively, if the MySQL instance runs on Cloud SQL, BigQuery federated queries (the EXTERNAL_QUERY function) can read the MySQL data in place, enabling indirect interaction.
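As a rough illustration of the "pull data out of MySQL first" workaround, here is a minimal sketch that dumps the rows of a query to newline-delimited JSON, a format that BigQuery load jobs accept. It is not an official Dataform or BigQuery recipe. The function name rows_to_ndjson, the orders table, and its columns are all made up for the example, and sqlite3 stands in for MySQL only so the snippet runs without a database server. With a real MySQL driver that follows the Python DB-API (PEP 249), the cursor would be used the same way.

```python
import io
import json
import sqlite3


def rows_to_ndjson(cursor, out, batch_size=1000):
    """Write all rows of an already-executed DB-API cursor to `out`
    as newline-delimited JSON. Returns the number of rows written."""
    # cursor.description holds one tuple per column; item 0 is the column name.
    columns = [col[0] for col in cursor.description]
    written = 0
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        for row in batch:
            # default=str is a simple fallback for values json cannot
            # encode directly (dates, decimals, and so on).
            out.write(json.dumps(dict(zip(columns, row)), default=str) + "\n")
            written += 1
    return written


def demo():
    # Assumption: sqlite3 is only a stand-in for the MySQL source.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "alice", 9.5), (2, "bob", 20.0)],
    )
    cursor = conn.execute("SELECT id, customer, amount FROM orders ORDER BY id")
    buffer = io.StringIO()
    count = rows_to_ndjson(cursor, buffer)
    conn.close()
    return count, buffer.getvalue()


if __name__ == "__main__":
    count, text = demo()
    print(f"{count} rows exported")
    print(text, end="")
```

In practice you would write to a file instead of an in-memory buffer, load that file into a BigQuery table, and then declare the table as a source in your Dataform project.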
While Dataform excels in transforming data within cloud data warehouses like Snowflake, its direct integration with traditional databases like MySQL is limited, requiring additional steps to achieve similar functionality.
Thanks for your answer @ms4446. Your response has been incredibly helpful to me. From what I understand, as of Dataform core 3.0.0, dataform.json was replaced by workflow_settings.yaml. Can workflow_settings.yaml also be used to connect to Snowflake in the same way?
Additionally, besides Snowflake, which other databases can Dataform connect to? Is it possible to connect to all cloud-based databases?
You're correct that in Dataform core 3.0.0, the dataform.json configuration file was replaced by workflow_settings.yaml. This new configuration file continues to serve the purpose of defining your project's settings, including connections to various databases.
Yes, you can connect to Snowflake using the workflow_settings.yaml file in much the same way as you would have done with dataform.json. You would specify the necessary Snowflake connection details, such as the account, warehouse, database, and schema, within the YAML file.
Besides Snowflake, Dataform supports connections to several other cloud data warehouses, including:
While Dataform excels in integrating with these popular cloud-based data warehouses, it does not support direct connections to all cloud-based databases or traditional relational databases like MySQL, PostgreSQL, or Oracle.
For databases that Dataform does not natively support, you would need to use workarounds, such as: