Get hands-on experience with 20+ free Google Cloud products and $300 in free credit for new customers.

Dataform: incremental table: stored procedure name

HI,

Dataform generates code with procedure name like:

 

EXECUTE IMMEDIATE
"""
CREATE OR REPLACE PROCEDURE `??????.df_d1b3851d423cca3c78ade6504a94655edc9a9d9e4b127897c0a14462705e8e9a`() OPTIONS(strict_mode=false)
BEGIN
 
Is it a way to change a name ? 
 
Thanks,
Cezary

 

Solved Solved
0 7 1,352
2 ACCEPTED SOLUTIONS

You're correct! That's an important nuance, and I apologize for not addressing that earlier. The procedure names in Dataform are generated based on a hash that includes factors like the execution start time, ensuring uniqueness and tracking changes to your data transformations. However, this means that procedures executed simultaneously can have identical names. This scenario is particularly problematic when these processes aim to merge data into the same destination table, increasing the risk of data loss or duplication.

Solutions and Best Practices

  1. Sequential Execution: To circumvent this issue, the most reliable method is to schedule your Dataform processes to run sequentially. This ensures that no two processes targeting the same table are executed at the same time.

  2. Dependency Management: In cases where sequential execution is impractical, meticulously structure your Dataform definitions to establish explicit dependencies between processes. This means:

    • Clearly define processes that consume data from the target table as dependents of any processes that concurrently write to it.
    • Rely on Dataform's built-in dependency resolution logic to enforce the correct execution order.
  3. Advanced Strategy: Temporary Tables and Merging: For complex scenarios, consider the following approach:

    • Designate a Primary Process: Identify one of the concurrent processes as the primary one.
    • Use Temporary Tables: Within the primary process, direct the output to a temporary table.
    • Merge Data: Perform a MERGE operation to consolidate the data from the temporary table with the target table in a controlled manner. Refer to the official BigQuery MERGE documentation for details: [invalid URL removed]

Important Considerations

  • Complexity: Implementing the advanced strategy with temporary tables increases the complexity of your Dataform project.
  • Trade-offs: Each solution has implications for execution time, efficiency, and maintainability. Weigh the trade-offs carefully based on your specific requirements and data volumes.

View solution in original post

7 REPLIES 7