Say you are trying to declare some data sources, but you have an ugly table name. E.g:
config {
  type: "declaration",
  database: "myDatabase",
  schema: "dbo",
  name: "main_partitioned_on_date_clustered_on_last_name",
}
Is there a way to alias this for when I call it as a ref?
For example, being able to call it as '${ref("dbo_main")}' or something. I have tried changing the file name but that did not work.
In Dataform, you can indeed create aliases for tables to make them easier to reference in your scripts. While the config block is used to declare the table, you can assign an alias to this table that can be used throughout your Dataform scripts.
Here's how you can do it:
Declare the Table with a Config Block: You've already done this part. You declare your table with its full name in the config block.
Assign an Alias: After declaring the table, you can assign an alias to it. This is done using the publish function. The alias is the name you give to the publish function.
Here's an example:
publish("short_name") {
  description: "An alias for the long table name"
  config {
    type: "view"
    database: "myDatabase"
    schema: "dbo"
    name: "main_partitioned_on_date_clustered_on_last_name"
  }
  query: `SELECT * FROM ${ref("main_partitioned_on_date_clustered_on_last_name")}`
}
In this example, "short_name" is the alias for your long table name. Whenever you want to reference this table in other scripts, you can use ref("short_name") instead of the full table name.
This approach makes your scripts cleaner and easier to read, especially when dealing with tables that have long or complex names.
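For comparison, here is a minimal sketch of the same idea written against Dataform's JavaScript API, assuming a .js file under definitions/ where declare and publish are available as globals. The file name and the function name defineShortNameView are hypothetical, and the two functions are passed in as arguments only so the snippet can also run outside Dataform. Note that this creates an actual view called short_name that selects from the long-named table, rather than a pure alias.

```javascript
// Hypothetical file: definitions/sources.js
// Assumption: Dataform exposes `declare` and `publish` as globals to
// JavaScript files under definitions/.
const LONG_NAME = "main_partitioned_on_date_clustered_on_last_name";

function defineShortNameView(declare, publish) {
  // Register the existing table so it can be ref()'d by its real name.
  declare({ database: "myDatabase", schema: "dbo", name: LONG_NAME });

  // Create a view named "short_name" that selects from the long-named table.
  return publish("short_name", {
    type: "view",
    description: "An alias for the long table name",
  }).query((ctx) => `SELECT * FROM ${ctx.ref(LONG_NAME)}`);
}

// Inside Dataform the globals exist, so the definitions are registered on load.
if (typeof declare === "function" && typeof publish === "function") {
  defineShortNameView(declare, publish);
}
```

Other files could then use ${ref("short_name")}, under the assumption that no other action in the project shares that name.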
Hey @ms4446 , thanks for the prompt reply! Does this change how we are supposed to name the file? Should it be the long name or the alias? I have updated my ref to the alias (it is the prefix of the original table name) and tried the file name both ways, and still cannot compile.
Does this set the "short_name" globally? Is it callable from anywhere in the repository?
@ms4446 wrote: In this example, "short_name" is the alias for your long table name. Whenever you want to reference this table in other scripts, you can use ref("short_name") instead of the full table name.
In Dataform, when you define an alias using the publish function, the alias is indeed available globally within your Dataform project. This means you can reference the alias from any other script in your repository.
So, if you define short_name as an alias for your table with a long name, you can use ref("short_name") in any other SQLX file within the same Dataform project to reference that table. This is particularly useful for maintaining readability and manageability in projects with multiple scripts or complex data models.
However, it's important to note a few things:
Scope: The alias is scoped to your Dataform project. It's not recognized outside of this specific Dataform environment.
Consistency: Ensure that the alias is unique within your project to avoid conflicts or confusion.
Refactoring: If you change the alias or the underlying table structure, you'll need to update all references to it across your project.
I declared a data source in a separate SQL file and wrapped the config block in a publish block. When I reference this data source using the publish name, it is not resolved, and I also get a "Missing dependency detected" error.
publish("latest_user"){
  config {
    type: "declaration",
    database: "database-name",
    schema: "prefix_"+dataform.projectConfig.vars.environment+"_bq",
    name: "v3_latest_user",
  }
  query: `SELECT * FROM ${ref("v3_latest_user")}`
}
Errors on user.sqlx file, in which I referenced the above source as
Could not resolve "latest_user_attribute"
Missing dependency detected: Action "xxxx.yyyy.User" depends on "{"name":"latest_user"}" which does not exist
The issue you're facing is related to how Dataform resolves dependencies and references within the project. When you declare a data source in a publish block and then try to reference it, Dataform expects the reference to match the exact name of the published table or view.
Here are some steps to resolve this issue:
Ensure Correct Naming:
Check the Resolution Order:
Use declaration Type Correctly:
Dependency Issue:
Here’s a revised example that might work better:
publish("latest_user") {
  type: "view", // Change this from "declaration" to "view" or "table" if you want to create a new table or view
  schema: "prefix_"+dataform.projectConfig.vars.environment+"_bq",
  name: "v3_latest_user",
  query: `SELECT * FROM some_source_table`
}
Then, when referencing it:
SELECT * FROM ${ref("latest_user")}
Ensure that the publish block is in the correct order and that the name matches exactly.
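If the goal is only to declare an existing table (no new view or table), a minimal sketch with the JavaScript API could look like the following. It assumes a .js file under definitions/ where declare and dataform.projectConfig are available; the helper names sourceSchema and latestUserTarget are hypothetical, while the database, schema pieces, and table name are taken from the post above.

```javascript
// Hypothetical file: definitions/sources/latest_user.js

// Build the environment-specific schema name used in the post above.
function sourceSchema(environment) {
  return "prefix_" + environment + "_bq";
}

// Describe the existing table; nothing is created in the warehouse.
function latestUserTarget(environment) {
  return {
    database: "database-name",
    schema: sourceSchema(environment),
    name: "v3_latest_user",
  };
}

// Assumption: inside Dataform, `declare` and `dataform` are globals.
// The guard lets the helpers be exercised outside Dataform as well.
if (typeof declare === "function" && typeof dataform !== "undefined") {
  declare(latestUserTarget(dataform.projectConfig.vars.environment));
}
```

With a declaration like this, the referencing file would use the declared name, for example ${ref("v3_latest_user")}, since a declaration is resolved by its name rather than by a separate alias.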
Thank you.
I'm declaring an existing table in the database. Not creating a new one for latest user. So, I'm good there. I've checked for typos and name matching as well.
I didn't understand about the order of publish block. I'm declaring these tables in their own sql files in a different folder from where I'm referencing them. Is that the problem here?
Declarations under sources folder. Referenced in sqlx under curated_tables folder
I'm trying to declare dependencies in the calling file. But having the error:
Could not resolve "latest_user"]
config {
type: "table",
dependencies: ["latest_user"]
}
Hi, I have one question: isn't the declaration block less expensive? I would like to use aliases for my inputs, but the solution involves changing them to views with a "SELECT * ..." inside. Won't this make my pipeline more expensive?
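One possible way to get short names without adding a view is to keep the declarations as they are and put an alias lookup in a file under includes/. The sketch below assumes that ref() accepts a target object with database, schema, and name, and that a file at includes/sources.js is exposed to SQLX files as sources; the file name, the SOURCES map, and the source function are all hypothetical.

```javascript
// Hypothetical file: includes/sources.js
// Maps short aliases to the full targets of already-declared tables.
const SOURCES = {
  dbo_main: {
    database: "myDatabase",
    schema: "dbo",
    name: "main_partitioned_on_date_clustered_on_last_name",
  },
};

// Return the full target for an alias, failing loudly on unknown aliases.
function source(alias) {
  if (!Object.prototype.hasOwnProperty.call(SOURCES, alias)) {
    throw new Error(`Unknown source alias: ${alias}`);
  }
  return SOURCES[alias];
}

module.exports = { source };
```

A SQLX file could then write ${ref(sources.source("dbo_main"))}. The table itself still needs its declaration, and no extra view is created by this approach.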