override defaultLocation in dataform.json

Hi Team,

Is there any way to override the "defaultLocation" value in dataform.json through the Dataform UI release/workflow configuration?

Thank you,

Amit

According to the Google Cloud Dataform documentation, defaultLocation cannot be overridden directly through the Dataform UI or workflow configuration. However, you can use the Dataform API to create a compilation result with custom compilation overrides, which can include a different Google Cloud project ID.

Here's how you can do it:

  1. Create a compilationResults.create request. In this request, you need to specify a source for the compilation result. This could be a Dataform workspace or a Git branch, Git tag, or Git commit SHA.

  2. In the CodeCompilationConfig object of the compilationResults.create request, you can configure compilation overrides. For example, to override the default project, set the defaultDatabase property to the desired Google Cloud project ID.

Here's an example of overriding the default project:

{
  "codeCompilationConfig": {
    "defaultDatabase": "your-desired-project-id"
  }
}

  3. Execute the compilationResults.create request. This will create a compilation result with the specified overrides.

  4. To execute the created compilation result, pass the compilation result ID returned by the compilationResults.create request in a workflowInvocations.create request.

Please note that this method overrides the settings for a single compilation result and does not permanently change the settings in the dataform.json file. For permanent changes, you would need to manually edit the dataform.json file.
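The two API calls described above could be prepared roughly as follows. This is only a sketch: it builds the request URLs and JSON bodies with the standard library and sends nothing. The helper names (`repository_path`, `compilation_request`, `invocation_request`) and the sample project, region, and repository values are made up for illustration, and the `defaultLocation` override reflects my reading of the CodeCompilationConfig reference, so please check it against the API version you are using.

```python
import json

API_ROOT = "https://dataform.googleapis.com/v1beta1"


def repository_path(project, location, repository):
    """Resource name of a Dataform repository."""
    return f"projects/{project}/locations/{location}/repositories/{repository}"


def compilation_request(parent, git_commitish, overrides):
    """URL and JSON body for compilationResults.create with compilation overrides."""
    url = f"{API_ROOT}/{parent}/compilationResults"
    body = {"gitCommitish": git_commitish, "codeCompilationConfig": dict(overrides)}
    return url, body


def invocation_request(parent, compilation_result_name):
    """URL and JSON body for workflowInvocations.create."""
    url = f"{API_ROOT}/{parent}/workflowInvocations"
    body = {"compilationResult": compilation_result_name}
    return url, body


# Placeholder identifiers; replace with your own.
parent = repository_path("my-project", "us-central1", "my-repo")
url, body = compilation_request(
    parent,
    "main",
    # defaultLocation is an assumption to verify against the API reference.
    {"defaultDatabase": "your-desired-project-id", "defaultLocation": "EU"},
)
print(url)
print(json.dumps(body, indent=2))
```

The response to the first call contains the compilation result's resource name, which is what you would pass to `invocation_request` for the second call.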

 

I'm getting this error: ValueError: Unknown field for CodeCompilationConfig: defaultDatabase.

client = dataform_v1beta1.DataformClient()
parent = f"projects/{project_id}/locations/{location}/repositories/{repository_id}"
release_config_name = f"{parent}/releaseConfigs/{release_config_id}"

# Create or update invocation config
invocation_config = dataform_v1beta1.InvocationConfig()
if tag_list[0].lower() != "all":
    invocation_config.included_tags = tag_list
invocation_config.transitive_dependencies_included = upstream_dependencies
invocation_config.transitive_dependents_included = downstream_dependents
invocation_config.fully_refresh_incremental_tables_enabled = full_refresh
logging.info(f"Invocation Config: {invocation_config}")

release_config = dataform_v1beta1.ReleaseConfig(
    name=release_config_name,
    git_commitish=release_config_id,
    code_compilation_config=dataform_v1beta1.CodeCompilationConfig(
        default_schema=dtset,
        defaultDatabase=project_id,
        # defaultDataset=dtset,
        # defaultProject=project_id,
        vars={"mod_cod": str(mod_cod), "mod_name": mod_name, "environment": environment, "prefix": prefix, "suffix": suffix, "event_type": event_type}
    )
)

logging.info(f"Release Config: {release_config}")

try:
    # Try to create the release configuration
    create_request = dataform_v1beta1.CreateReleaseConfigRequest(
        parent=parent,
        release_config=release_config,
        release_config_id=release_config_id
    )
    response = client.create_release_config(request=create_request)
    logging.info(f"Release Config {release_config_id} created successfully")
except AlreadyExists:
    logging.info(f"Release Config {release_config_id} already exists, updating instead")
    # If it already exists, update the existing release configuration
    update_request = dataform_v1beta1.UpdateReleaseConfigRequest(
        release_config=release_config,
        update_mask={"paths": ["code_compilation_config.vars"]}
    )
    response = client.update_release_config(request=update_request)

# Executing Workflow
compilation_result = dataform_v1beta1.CompilationResult(
    release_config=release_config_name,
)

create_compilation_request = dataform_v1beta1.CreateCompilationResultRequest(
    parent=parent,
    compilation_result=compilation_result,
)

The error you're encountering, "ValueError: Unknown field for CodeCompilationConfig: defaultDatabase," occurs because defaultDatabase is not a recognized field name in the Python client's CodeCompilationConfig object. defaultDatabase is the camelCase name used by the REST API and dataform.json; the Python client expects the snake_case form, default_database.
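As a rough illustration of the naming difference: the field name in the error is the camelCase spelling, while, as far as I can tell, the Python client takes snake_case keyword arguments. The sketch below converts one form to the other with the standard library; `to_snake_case` is a hypothetical helper and the values are placeholders.

```python
import re


def to_snake_case(name):
    """Convert a camelCase field name (REST / dataform.json style) to snake_case."""
    # Insert an underscore before every uppercase letter except at the start.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


rest_overrides = {
    "defaultDatabase": "my-project",  # placeholder values
    "defaultSchema": "my_dataset",
    "assertionSchema": "my_assertions",
}

# Keyword arguments in the form the Python client is assumed to accept.
python_kwargs = {to_snake_case(key): value for key, value in rest_overrides.items()}
print(python_kwargs)
```

The resulting dictionary could then be unpacked into the config constructor, for example `CodeCompilationConfig(**python_kwargs)`, assuming every key is a field the installed client version supports.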

To achieve the functionality you're aiming for (overriding the default project, dataset, etc.), use the snake_case fields that the Python client supports:

  • default_database: Overrides the default Google Cloud project.
  • default_schema: Overrides the default dataset.
  • assertion_schema: Overrides the default dataset for assertions.
  • vars: A dictionary of compilation variables.

To override the project ID, set default_database to the project ID and keep default_schema as the dataset name only. Here's an updated example based on your provided code:

client = dataform_v1beta1.DataformClient()
parent = f"projects/{project_id}/locations/{location}/repositories/{repository_id}"
release_config_name = f"{parent}/releaseConfigs/{release_config_id}"

# Create or update invocation config
invocation_config = dataform_v1beta1.InvocationConfig()
if tag_list[0].lower() != "all":
    invocation_config.included_tags = tag_list
invocation_config.transitive_dependencies_included = upstream_dependencies
invocation_config.transitive_dependents_included = downstream_dependents
invocation_config.fully_refresh_incremental_tables_enabled = full_refresh
logging.info(f"Invocation Config: {invocation_config}")

release_config = dataform_v1beta1.ReleaseConfig(
    name=release_config_name,
    git_commitish=release_config_id,
    code_compilation_config=dataform_v1beta1.CodeCompilationConfig(
        # default_database overrides the project; default_schema takes the dataset name only
        default_database=project_id,
        default_schema=dtset,
        vars={"mod_cod": str(mod_cod), "mod_name": mod_name, "environment": environment, "prefix": prefix, "suffix": suffix, "event_type": event_type}
    )
)

logging.info(f"Release Config: {release_config}")

try:
    # Try to create the release configuration
    create_request = dataform_v1beta1.CreateReleaseConfigRequest(
        parent=parent,
        release_config=release_config,
        release_config_id=release_config_id
    )
    response = client.create_release_config(request=create_request)
    logging.info(f"Release Config {release_config_id} created successfully")
except AlreadyExists:
    logging.info(f"Release Config {release_config_id} already exists, updating instead")
    # If it already exists, update the existing release configuration
    update_request = dataform_v1beta1.UpdateReleaseConfigRequest(
        release_config=release_config,
        update_mask={"paths": ["code_compilation_config.vars"]}
    )
    response = client.update_release_config(request=update_request)

# Executing Workflow
compilation_result = dataform_v1beta1.CompilationResult(
    release_config=release_config_name,
)

create_compilation_request = dataform_v1beta1.CreateCompilationResultRequest(
    parent=parent,
    compilation_result=compilation_result,
)
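One thing to watch in the update branch: the update_mask lists only code_compilation_config.vars, so when the release config already exists, changes to the other compilation fields would presumably not be applied. A small sketch of deriving the mask from whichever fields you override follows; `build_update_mask` is a hypothetical helper, and the field names are assumed to match the Python client's snake_case names.

```python
def build_update_mask(overrides, prefix="code_compilation_config"):
    """Return FieldMask-style paths covering every overridden field."""
    return {"paths": [f"{prefix}.{field}" for field in sorted(overrides)]}


overrides = {
    "default_database": "my-project",  # placeholder values
    "default_schema": "my_dataset",
    "vars": {"environment": "edw_test"},
}

update_mask = build_update_mask(overrides)
print(update_mask)
```

The same `overrides` dictionary can feed both the config constructor and the mask, which keeps the two from drifting apart.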

If you need more detailed control over project or dataset configurations, consider using the Dataform CLI or managing the dataform.json configurations in your CI/CD pipeline to apply environment-specific settings.

We tried implementing your suggestion, but it still does not override defaultProject, defaultDataset, or defaultAssertionDataset.

Below is the workflow_settings.yaml:

defaultProject: staging
defaultLocation: US
defaultDataset: test
defaultAssertionDataset: test
dataformCoreVersion: 3.0.0-beta.4
vars:
  environment: "edw_test"
  mod_cod: "83"
  mod_name: "POOL"
  event_type: "Class"
  prefix: "SBMOD_2"
  suffix: "V1_2"

I want to override the above values at runtime from the shared code. Please help.