Hi,
Dataform generates code with procedure name like:
You're correct! That's an important nuance, and I apologize for not addressing that earlier. The procedure names in Dataform are generated from a hash that includes factors such as the execution start time, which is meant to keep names unique and to track changes to your data transformations. However, it also means that procedures started at the same time can end up with identical names. This is particularly problematic when those processes merge data into the same destination table, because it increases the risk of data loss or duplication.
Solutions and Best Practices
Sequential Execution: To circumvent this issue, the most reliable method is to schedule your Dataform processes to run sequentially. This ensures that no two processes targeting the same table are executed at the same time.
Dependency Management: In cases where sequential execution is impractical, structure your Dataform definitions so that processes writing to the same table declare explicit dependencies on one another.
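As a minimal sketch of an explicit dependency, assume two hypothetical operations files, merge_orders_from_web.sqlx and merge_orders_from_store.sqlx, that both write to a hypothetical orders table. The second file could list the first in its config block. All file, table, and column names here are made up for illustration:

```sqlx
config {
  type: "operations",
  dependencies: ["merge_orders_from_web"]
}

-- definitions/merge_orders_from_store.sqlx (hypothetical file name).
-- Listing merge_orders_from_web under `dependencies` tells Dataform to start
-- this action only after that action has finished in the same execution,
-- so the two writers never touch `orders` at the same time.
-- `orders` and `store_orders_staging` are assumed to be defined or declared
-- elsewhere in the project.
INSERT INTO ${ref("orders")} (order_id, amount)
SELECT order_id, amount
FROM ${ref("store_orders_staging")}
```

Note that dependencies only order actions within a single workflow execution. Two separate executions started at the same moment would still need to be scheduled so that they don't overlap.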
Advanced Strategy: Temporary Tables and Merging: For complex scenarios, consider the following approach:
Use a MERGE operation to consolidate the data from the temporary table with the target table in a controlled manner. Refer to the official BigQuery MERGE documentation for details.
Important Considerations
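One thing to keep in mind with the temporary table approach: MERGE matches rows using the ON condition, so the target needs a stable key, and BigQuery raises an error if several source rows match the same target row in a WHEN MATCHED clause. A rough sketch as a Dataform operations file, assuming a hypothetical orders target, an orders_staging table loaded earlier in the workflow, and an order_id key (none of these names come from the original question):

```sqlx
config {
  type: "operations"
}

-- Hypothetical example: consolidate staged rows into the target table.
-- ref("orders_staging") also makes this action wait for the staging load.
-- orders_staging is assumed to hold at most one row per order_id.
MERGE ${ref("orders")} AS target
USING ${ref("orders_staging")} AS source
ON target.order_id = source.order_id
WHEN MATCHED THEN
  -- Row already exists in the target: overwrite it with the staged values.
  UPDATE SET amount = source.amount, updated_at = source.updated_at
WHEN NOT MATCHED THEN
  -- Row is new: insert it.
  INSERT (order_id, amount, updated_at)
  VALUES (source.order_id, source.amount, source.updated_at)
```

Because the whole consolidation happens in one MERGE statement, each staged row either updates its existing counterpart or is inserted once, which is what helps avoid the duplication described above.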
ok thank you!