Scenario Description:
I created a new development workspace named "DEV_Env" to serve as a development environment. The configuration overrides for the suffix have already been pre-configured to reflect the workspace name ("${workspaceName}").
However, an issue arises when I open the "DEV_Env" workspace and try to create a table. I expect any tables sourced from the master branch to be regenerated in the dataset named "projectid.dataform_DEV_Env". Instead, the system returns an error stating that the specified dataset cannot be found in the EU region. This issue does not occur in legacy Dataform.
The error message "EU Dataset Not Found Error during Table Creation in DEV_Env Workspace" indicates that Dataform is unable to find the dataset in the EU region. This could be because the dataset hasn't been created yet, or it's located in a different region.
To resolve this issue, you can create the dataset manually using the BigQuery web UI.
After creating the dataset, verify its location and ensure you have the necessary permissions to access it. Once verified, you should be able to create tables in it using Dataform without any problems.
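As a scripted alternative to the web UI, the dataset can also be created with a DDL statement that pins its location. The following is only a minimal sketch: it assumes the project ID `projectid` and dataset name `dataform_DEV_Env` from the question, the helper name `create_schema_ddl` is hypothetical, and the name check is a conservative assumption rather than BigQuery's full naming rule. It only builds the statement text; running it in BigQuery is left to you.

```python
import re

# Conservative assumption: accept only letters, digits and underscores
# in the dataset name (BigQuery itself may allow more).
_DATASET_RE = re.compile(r"^[A-Za-z0-9_]{1,1024}$")


def create_schema_ddl(project_id: str, dataset: str, location: str = "EU") -> str:
    """Build a CREATE SCHEMA statement that pins the dataset to a location."""
    if not _DATASET_RE.match(dataset):
        raise ValueError(f"invalid dataset name: {dataset!r}")
    return (
        f"CREATE SCHEMA IF NOT EXISTS `{project_id}.{dataset}` "
        f'OPTIONS(location="{location}");'
    )


if __name__ == "__main__":
    # Prints the statement for the dataset named in the question.
    print(create_schema_ddl("projectid", "dataform_DEV_Env"))
```

The resulting statement can be pasted into the BigQuery query editor. Because of `IF NOT EXISTS`, it does nothing when a dataset with that name already exists, so it will not move a dataset that was created in another region.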
Thank you for your guidance on manually creating the dataset in the EU region.
While this approach does address the immediate problem, it raises a concern about scalability and efficiency. Every time a new development workspace is instantiated, it would necessitate manual intervention to create the associated dataset. This is not a sustainable or efficient practice, especially when dealing with frequent development cycles or multiple developers.
Furthermore, another challenge that has emerged relates to table structures. In our standard setup, the "Staging" dataset contains tables like 's_table1', 's_table2', etc. When creating a new development workspace and its corresponding dataset, say dev_env, I would expect the new dataset staging_dev_env to automatically replicate the table structure from the original "Staging" dataset. However, this doesn't seem to be happening.
Given these challenges, is there a way to streamline the process? Ideally, I'd like a solution where both datasets and their respective tables are automatically created and structured correctly when a new development workspace is initialized. This would greatly enhance our development efficiency and reduce manual overhead.
I appreciate your continued assistance and look forward to a more integrated solution.
This depends on where you are seeing it. If you have just created the workspace and the error appears in the validation panel (to the right of the file), the dataset probably doesn't exist yet, and it won't be created until you execute an actual workflow via the "start execution" button.
If you are seeing this in workflow execution logs - then Dataform attempted to create the dataset automatically and failed. Either Dataform doesn't have permission to create new datasets, or possibly the dataset exists already but has been manually created in a different location.
Thank you for the clarification, Lewish!
Upon further investigation, I encountered the error not only in the validation panel after creating the workspace but also after executing the workflow.
When I checked the details of the failed table creation, I noticed the following error:
'reason:"invalidQuery" location:"query" message:"Invalid value: Table \"s_table1\" must be qualified with a dataset (e.g. dataset.table). at [28:13]": invalid argument' .
The executed script:
"CREATE SCHEMA IF NOT EXISTS `projectid.staging_dev_env__s_table1` OPTIONS(location="eu");
It's worth noting that all our data objects - be it tables, views, models, etc. - are strictly set to be created in the EU Region.
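The schema name in the executed script ends in `__s_table1`, which looks like a table name joined onto the dataset name, while the error complains about a table reference that has no dataset at all. To make the naming I expected explicit, here is a minimal sketch. It assumes the suffix is appended to the schema with an underscore (as the `staging_dev_env` name above suggests); the helper names are hypothetical and are not part of any Dataform API.

```python
def suffixed_schema(schema: str, suffix: str = "") -> str:
    """Append the workspace suffix to a schema name (assumed '_' separator)."""
    return f"{schema}_{suffix}" if suffix else schema


def qualified_table(project: str, schema: str, table: str, suffix: str = "") -> str:
    """Compose the project.dataset.table reference expected for a workspace."""
    return f"{project}.{suffixed_schema(schema, suffix)}.{table}"


def is_dataset_qualified(ref: str) -> bool:
    """True if a table reference has at least dataset.table, with no empty parts."""
    parts = ref.strip("`").split(".")
    return len(parts) in (2, 3) and all(parts)


if __name__ == "__main__":
    print(qualified_table("projectid", "staging", "s_table1", "dev_env"))
    print(is_dataset_qualified("s_table1"))  # bare name, as in the error
```

Under these assumptions the expected reference is `projectid.staging_dev_env.s_table1`, whereas a bare `s_table1` is the kind of reference the "must be qualified with a dataset" error rejects.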