I am trying to manage production and development environments from the same code base using the GCP cloud GUI. I have created separate release configurations and workflow configurations based on the same DAG; the only difference is that the production environment data resides in a schema with the suffix _prod.
I have accordingly included the correct suffix in the release configuration for production:
However, I have discovered that it does not apply to my source data, which uses `schema: dataform.projectConfig.defaultSchema` in the declaration. Am I configuring my project correctly?
Yes, you are on the right track with your project configuration. The dataform.projectConfig.defaultSchema setting specifies the default schema that will be used for tables and views created by Dataform, unless otherwise specified. If you want to use a different schema for your source data in the production environment, you'll need to handle it in your Dataform scripts.
To ensure that your source data uses the correct schema for your production environment, you can conditionally set the source schema based on the environment. Here's an example using JavaScript:
const isProduction = dataform.projectConfig.defaultSchema.endsWith("_prod");
const sourceSchema = isProduction ? "source_schema_prod" : "source_schema_dev";
Then, use the sourceSchema variable in your source data declarations.
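As a minimal sketch of that last step, the check can be wrapped in a small helper whose result is passed to a declaration. The helper name resolveSourceSchema, the schema names, and the table name are placeholders, not part of the original project. The declare() call appears only as a comment because it is available only inside a Dataform definitions file; the stub configs at the bottom let the helper be tried outside Dataform:

```javascript
// Hypothetical helper: choose the source schema from the project's default schema.
// "source_schema_prod" and "source_schema_dev" are placeholder schema names.
function resolveSourceSchema(projectConfig) {
  const isProduction = projectConfig.defaultSchema.endsWith("_prod");
  return isProduction ? "source_schema_prod" : "source_schema_dev";
}

// Inside a Dataform definitions/*.js file this would be used roughly as:
//   declare({
//     schema: resolveSourceSchema(dataform.projectConfig),
//     name: "my_source_table", // placeholder table name
//   });

// Stand-alone check with stub configs (outside Dataform):
const devSchema = resolveSourceSchema({ defaultSchema: "analytics" });
const prodSchema = resolveSourceSchema({ defaultSchema: "analytics_prod" });
console.log(devSchema, prodSchema);
```

Note that this approach only works if defaultSchema itself ends in _prod in the production compilation, so it is worth checking the compiled output for each release configuration.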
For your production environment configuration, ensure that the defaultSchema is set correctly:
{
"defaultSchema": "your_schema_name_prod",
...
}
Then deploy with the dataform deploy command. Dataform will use the specified schema for all tables and views in your production environment.
Thank you for helping me think this through - I actually hit on an easier solution.
I updated the declaration for the source to be:
schema: dataform.projectConfig.defaultSchema + dataform.projectConfig.vars.source_suffix
The dataform.json has the variable:
"vars" : {
"source_suffix" : ""
}
For production releases, override the source_suffix variable in the release configuration and set it to _prod.
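To see how the variable resolves in each environment, here is a small sketch that mimics the concatenation outside Dataform. The function name sourceSchemaFor and the schema name "analytics" are made up for illustration; inside Dataform the values come from dataform.projectConfig. The empty-string fallback is an added safety net that the one-line declaration above does not have:

```javascript
// Hypothetical stand-in for the expression used in the declaration:
//   dataform.projectConfig.defaultSchema + dataform.projectConfig.vars.source_suffix
function sourceSchemaFor(projectConfig) {
  // Assumption: fall back to an empty suffix if the variable is not defined,
  // so a missing variable does not append "undefined" to the schema name.
  const suffix = (projectConfig.vars && projectConfig.vars.source_suffix) || "";
  return projectConfig.defaultSchema + suffix;
}

// Development: dataform.json leaves source_suffix empty.
const dev = { defaultSchema: "analytics", vars: { source_suffix: "" } };

// Production: the release configuration overrides source_suffix with "_prod".
const prod = { defaultSchema: "analytics", vars: { source_suffix: "_prod" } };

console.log(sourceSchemaFor(dev));
console.log(sourceSchemaFor(prod));
```

Keeping the default empty in dataform.json means development compilations need no override at all, and only the production release configuration has to know about the suffix.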