Dynamically assign a "defaultDatabase" in the dataform.json file

Hello Community,

I have the following situation in our project.
We are using Dataform to transform data that we ingest into our BQ datasets and to expose that data as views and event payloads to our customers.

We have three GCP environments: dev (development), test (testing with production data), and prod (production). All three are associated with the same GitHub repository.

 

Our issue is handling the "defaultDatabase" entry in the dataform.json file during the CI/CD process.

Please see our dataform.json file:

{
  "defaultSchema": "raw_data",
  "assertionSchema": "dataform_assertions",
  "warehouse": "bigquery",
  "defaultDatabase": "gcp-dev",
  "defaultLocation": "EU"
}

This works fine in dev, but it breaks when we merge the dev branch into test and the test branch into prod, because the merged file still points at "gcp-dev".

We need to manually change "defaultDatabase" to "gcp-test" in the test branch and to "gcp-prod" in the prod branch, which does not adhere to CI/CD best practices.

Is there an elegant way to dynamically assign a project ID to the "defaultDatabase" entry?
For instance, by using an environment variable or something similar for the "defaultDatabase" entry,
like "defaultDatabase": "${gcpProjectId}".

If you have done this before, an example would be much appreciated.

Thank you and Best regards,
Valentin

 

1 REPLY

You can set "defaultDatabase" in your `dataform.json` dynamically from an environment variable such as `$GCP_PROJECT_ID` by running a `jq` command in your CI/CD pipeline to update the JSON file, like this:

```bash
jq --arg db "$GCP_PROJECT_ID" '.defaultDatabase = $db' dataform.json > temp.json && mv temp.json dataform.json
```
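
If your pipeline exposes the branch name rather than the project ID, the same `jq` call can be wrapped in a small helper. The following is only a sketch under assumptions: the function names `project_for_branch` and `set_default_database` are hypothetical, and the branch-to-project mapping (dev/test/prod to gcp-dev/gcp-test/gcp-prod) is taken from the question, so adjust it to your setup.

```bash
#!/usr/bin/env bash

# Hypothetical helper: map a Git branch name to a GCP project ID.
# Branch names and project IDs are assumed from the question above.
project_for_branch() {
  case "$1" in
    dev)  echo "gcp-dev" ;;
    test) echo "gcp-test" ;;
    prod) echo "gcp-prod" ;;
    *)    echo "unknown branch: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: rewrite defaultDatabase in a dataform.json file
# (second argument, defaults to ./dataform.json) for the given branch.
set_default_database() {
  local branch="$1" file="${2:-dataform.json}" project tmp
  project="$(project_for_branch "$branch")" || return 1
  tmp="$(mktemp)"
  # Write to a temp file first so a failed jq run never truncates the original.
  if jq --arg db "$project" '.defaultDatabase = $db' "$file" > "$tmp"; then
    mv "$tmp" "$file"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

In a pipeline step you would source this file and call something like `set_default_database "$BRANCH_NAME"` before running Dataform, where `BRANCH_NAME` stands for whatever variable your CI system uses for the current branch. Because the file is patched at build time, the committed `dataform.json` can keep the dev value and no branch needs a manual edit.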