Hi, I'm currently working on the multi-cloud integration in my company.
To achieve this, we started connecting data from AWS into GCP through BigQuery Omni. For data governance reasons, and to keep tight control over the process, we started using Dataform.
I'm trying to configure an external table using operations, but when we try to execute it, I get the following error.
We granted sufficient roles to the service agent, but it continues to fail.
The AWS region we are configuring is "aws-us-east-1". Also, when we copy the query generated by Dataform and execute it in the BigQuery console, it works well.
Best regards,
Hi @jaime_parra,
Welcome to Google Cloud Community!
The error you're encountering with Dataform and BigQuery Omni accessing data in AWS probably comes down to insufficient permissions. BigQuery Omni requires correctly configured IAM roles in both GCP and AWS to allow the Dataform service account to access the data.
Please review the following consideration, as it may explain why you are getting a failed status.
Ensure your Dataform configuration correctly points to your BigQuery Omni connection and that the connection itself is properly established.
Here are some helpful links:
For a deeper understanding, I suggest contacting Google Cloud support for assistance.
Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.
Hi @NorieRam
All the permission validations were done, and we found the following:
1. If we run a process where the region is US or us-east1, the materialization of the views or tables works without incident.
2. If we want to create an external table or view in AWS, it fails with the error shared in the image, even if we define the aws-us-east-1 region in the configuration file.
3. The BigQuery Omni permissions are correct, because the users (including me) can read the sources and create them from the console without problems.
Our conclusion is that the Dataform service itself fails when the execution runs inside the service.
Finally, at the permissions level, we even gave the Dataform service agent Owner permissions on the project, and it still failed. To reiterate, this only happens when creating an external table in the Omni AWS dataset; in all other cases it works normally.
Thank you and I look forward to any feedback you can give me.
@jaime_parra I suggest filing an issue for this so that our Engineering Team can investigate further. Before filing, please take note of what to expect when opening an issue. For future updates, I recommend monitoring the issue tracker.
Hi @NorieRam
Thank you for your response and recommendations. We have conducted several tests and identified an interesting behavior that led us to pinpoint an issue with scheduled executions in Dataform.
We have configured Dataform to work with BigQuery Omni on AWS (aws-us-east-1), setting the following parameters in workflow_settings.yaml:
Additionally, when executing queries from Workspace, external tables in BigQuery Omni are successfully created, and we were able to deploy some views by adjusting the YAML configuration with location=aws-us-east-1.
Here’s an example of a view that we tested and confirmed to be working correctly:
config {
  type: "operations"
}

CREATE OR REPLACE VIEW
  `aws_omni_view.prueba` AS
SELECT
  REGEXP_EXTRACT(_FILE_NAME, r'/([^/]+)/[^/]+\.parquet$') AS subfolder,
  PARSE_DATE('%Y%m%d', REGEXP_EXTRACT(_FILE_NAME, r'\d{8}')) AS partition_date
FROM
  `aws_omni_campanas.prueba`
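To make it easier to see what the two columns of this view contain, here is a minimal Python sketch that applies the same two regular expressions to a single file path. The sample path and the helper name `parse_file_name` are hypothetical and only serve as illustration; the patterns are copied from the view, and this assumes Python's `re` module treats them the same way BigQuery does for such simple cases.

```python
import re
from datetime import date, datetime
from typing import Optional, Tuple

# Same patterns as in the view above.
SUBFOLDER_RE = re.compile(r'/([^/]+)/[^/]+\.parquet$')
DATE_RE = re.compile(r'\d{8}')


def parse_file_name(file_name: str) -> Tuple[Optional[str], Optional[date]]:
    """Mimic the view's two columns for a single _FILE_NAME value."""
    sub = SUBFOLDER_RE.search(file_name)
    day = DATE_RE.search(file_name)
    # With one capturing group, REGEXP_EXTRACT returns that group.
    subfolder = sub.group(1) if sub else None
    # Without a capturing group, REGEXP_EXTRACT returns the whole match.
    partition_date = (
        datetime.strptime(day.group(0), "%Y%m%d").date() if day else None
    )
    return subfolder, partition_date


# Hypothetical object path, not taken from the thread.
sample = "s3://example-bucket/campanas/daily/data_20250310.parquet"
print(parse_file_name(sample))
```

Note that the date pattern takes the first run of eight digits anywhere in the path, so a bucket or folder name containing eight consecutive digits would be picked up before the date in the file name.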
We also confirmed that executions from the BigQuery Console run correctly and that data is created in the expected region (aws-us-east-1).
Furthermore, when reviewing BigQuery Job History, we observed that manual executions from Workspace are indeed running in the configured region (aws-us-east-1), indicating that the YAML settings are being applied correctly within the Workspace environment.
Even though executions from Workspace behave as expected, when we schedule a Dataform execution, we receive an error.
1️⃣ The workflow_settings.yaml file only affects the Workspace but does not control the actual execution of scheduled workflows.
2️⃣ The release_config was set to us-east1 with target: bigquery.
3️⃣ There was no properly configured release_config for aws-us-east-1.
📄 Evidence from Logs:
By analyzing the execution logs, we found that the releaseConfigId is "omni" and it is executing in "location": "us-east1", confirming that Dataform jobs are being executed in the wrong region.
{
  "insertId": "geq081ch6v",
  "jsonPayload": {
    "@type": "type.googleapis.com/google.cloud.dataform.logging.v1.WorkflowInvocationCompletionLogEntry",
    "releaseConfigId": "omni",
    "workflowInvocationId": "1741637475-a4f89d69-7cf4-476b-82cd-b5c7a2ebd314",
    "terminalState": "FAILED",
    "workflowConfigId": "naruto"
  },
  "resource": {
    "type": "dataform.googleapis.com/Repository",
    "labels": {
      "location": "us-east1", <--- ⚠️ EXECUTING ON GCP INSTEAD OF AWS Omni
      "resource_container": "626537586202",
      "repository_id": "augusta-bavv-bigquery-omni-aws"
    }
  },
  "timestamp": "2025-03-10T20:11:15.529842342Z",
  "severity": "ERROR",
  "logName": "projects/augusta-bavv-dev-activo/logs/dataform.googleapis.com%2Fworkflow_invocation_completion",
  "receiveTimestamp": "2025-03-10T20:11:16.071909743Z"
}
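For anyone who wants to run the same check over several log entries, the following is a rough Python sketch that pulls the relevant fields out of an entry shaped like the one above. It uses a trimmed copy of that entry (the inline "<---" note is removed because it is not valid JSON). The helper names `summarize` and `location_mismatch` are made up for this example.

```python
import json

# Trimmed copy of the log entry above, without the inline annotation.
LOG_ENTRY = """
{
  "jsonPayload": {
    "releaseConfigId": "omni",
    "terminalState": "FAILED",
    "workflowConfigId": "naruto"
  },
  "resource": {
    "type": "dataform.googleapis.com/Repository",
    "labels": {"location": "us-east1"}
  }
}
"""


def summarize(entry: dict) -> dict:
    """Collect the fields discussed in the thread from one log entry."""
    payload = entry.get("jsonPayload", {})
    labels = entry.get("resource", {}).get("labels", {})
    return {
        "release_config": payload.get("releaseConfigId"),
        "workflow_config": payload.get("workflowConfigId"),
        "state": payload.get("terminalState"),
        "location": labels.get("location"),
    }


def location_mismatch(entry: dict, expected_location: str) -> bool:
    """True when the logged location differs from the one we expect."""
    return summarize(entry)["location"] != expected_location


entry = json.loads(LOG_ENTRY)
print(summarize(entry))
print(location_mismatch(entry, "aws-us-east-1"))
```

The `location` label read here belongs to the `dataform.googleapis.com/Repository` resource in the log entry, so the sketch only reports what that label says.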
To resolve this issue, we believe that:
✅ A release_config must be created and properly configured for aws-us-east-1 so that scheduled executions in Dataform use the correct region instead of defaulting to GCP.
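A minimal sketch of what such a release_config could look like is shown below, written as a Python script that only builds the request body and resource name and does not call any API. The field names `gitCommitish` and `codeCompilationConfig.defaultLocation` are based on our reading of the Dataform REST `ReleaseConfig` resource and should be verified against the API reference. The branch name `main` is an assumption, and whether Dataform accepts `aws-us-east-1` as a compilation override is exactly the open question here. The project, repository, and release config IDs are the ones that appear in the log above.

```python
import json


def build_release_config(git_commitish: str, default_location: str) -> dict:
    """Build a ReleaseConfig-style body with a location override.

    Field names are assumed from the Dataform REST API and not verified here.
    """
    return {
        "gitCommitish": git_commitish,
        "codeCompilationConfig": {
            # Assumption: overrides the default location at compile time.
            "defaultLocation": default_location,
        },
    }


def release_config_name(project: str, repo_location: str,
                        repository: str, release_config_id: str) -> str:
    """Resource name of the release config inside its repository."""
    return (
        f"projects/{project}/locations/{repo_location}"
        f"/repositories/{repository}/releaseConfigs/{release_config_id}"
    )


# "main" is an assumed branch name; the other values come from the log above.
body = build_release_config("main", "aws-us-east-1")
name = release_config_name(
    "augusta-bavv-dev-activo", "us-east1",
    "augusta-bavv-bigquery-omni-aws", "omni",
)
print(name)
print(json.dumps(body, indent=2))
```

In this sketch the repository itself stays in us-east1 and only the compilation override points at aws-us-east-1, which mirrors what we set in workflow_settings.yaml for the Workspace.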
Could you provide guidance on the best approach to configure the release_config in this scenario to ensure that scheduled executions respect the AWS Omni configuration?
We appreciate any additional recommendations.