
Executing a specific/single Dataform pipeline from Cloud Workflows

Hi Team,

Right now I am trying to orchestrate a single Dataform pipeline from Cloud Workflows, but I couldn't find specific code to do so.

Below is the generic code I found in the Dataform documentation.

main:
    steps:
    - init:
        assign:
        - repository: projects/PROJECT_ID/locations/REPOSITORY_LOCATION/repositories/REPOSITORY_ID
    - createCompilationResult:
        call: http.post
        args:
            url: ${"https://dataform.googleapis.com/v1beta1/" + repository + "/compilationResults"}
            auth:
                type: OAuth2
            body:
                gitCommitish: GIT_COMMITISH
        result: compilationResult
    - createWorkflowInvocation:
        call: http.post
        args:
            url: ${"https://dataform.googleapis.com/v1beta1/" + repository + "/workflowInvocations"}
            auth:
                type: OAuth2
            body:
                compilationResult: ${compilationResult.body.name}
        result: workflowInvocation
    - complete:
        return: ${workflowInvocation.body.name}

Can someone please explain what the createWorkflowInvocation step is doing, and what changes I need to make to the code to invoke a specific Dataform pipeline?

1 REPLY

The createWorkflowInvocation step in your code is responsible for triggering the execution of the Dataform pipeline. It does so by sending a POST request to the Dataform API at the /workflowInvocations endpoint.

In the code you provided, the body of the POST request contains a single property, compilationResult, which is set to the name of the compilation result created in the previous step, createCompilationResult. That name identifies the compiled state of your Dataform project, which includes all the SQL code, dependencies, and metadata for the commit you compiled.

Here's a breakdown of what the createWorkflowInvocation step does:

  1. It calls the http.post function to send a POST request. The URL for this request is constructed from the repository variable (the resource path of your Dataform repository on Google Cloud) and the /workflowInvocations endpoint.

  2. It uses OAuth2 for authentication.

  3. It includes a body in the POST request, which contains the compilationResult from the previous createCompilationResult step.

  4. It stores the result of the POST request in the workflowInvocation variable.
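
To make the request body described above concrete: besides compilationResult, the v1beta1 workflowInvocations resource also accepts an optional invocationConfig that limits the run to selected actions. The sketch below is a variant of the step under that assumption. TAG is a hypothetical placeholder for a tag set on the actions of the pipeline you want to run, and the field names are worth checking against the API reference for the version you call.

```yaml
- createWorkflowInvocation:
    call: http.post
    args:
        url: ${"https://dataform.googleapis.com/v1beta1/" + repository + "/workflowInvocations"}
        auth:
            type: OAuth2
        body:
            compilationResult: ${compilationResult.body.name}
            # Assumption: restrict the run to actions tagged TAG (hypothetical placeholder).
            invocationConfig:
                includedTags:
                - TAG
                # Also run the upstream dependencies of the tagged actions.
                transitiveDependenciesIncluded: true
    result: workflowInvocation
```

With no invocationConfig, as in the original step, the invocation runs every action in the compilation result.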

If you want to invoke a specific Dataform pipeline, you can do so by executing selected files conditionally using compilation variables. To do this, you'll need to:

  1. Create a compilation variable and add it to the selected tables or pipelines in your Dataform project.

  2. Set the variable and its value in the codeCompilationConfig block of your Dataform API compilation request:

    codeCompilationConfig:
        vars:
            YOUR_VARIABLE: "VALUE"

    Replace YOUR_VARIABLE with the name of your variable, for example, executionSetting, and VALUE with the value of the variable for this compilation result.

  3. This variable can then be used within your Dataform project to conditionally execute specific tables or pipelines based on its value.

Then, in the createWorkflowInvocation step, you pass the name of the compilation result returned by the compilationResults.create request, as you're already doing. Because it was compiled with your variable set, that compilation result now represents the state of your project in which only the pipeline you want to invoke is enabled.
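
Put together, the change to the original workflow is confined to the createCompilationResult step. This is a minimal sketch, assuming a variable named executionSetting (the example name used above) and a hypothetical value "staging". Both need to match what your Dataform project actually checks.

```yaml
- createCompilationResult:
    call: http.post
    args:
        url: ${"https://dataform.googleapis.com/v1beta1/" + repository + "/compilationResults"}
        auth:
            type: OAuth2
        body:
            gitCommitish: GIT_COMMITISH
            # Compilation override: variable values for this compilation result only.
            codeCompilationConfig:
                vars:
                    # Hypothetical name and value; replace with your own.
                    executionSetting: "staging"
    result: compilationResult
```

The createWorkflowInvocation step stays unchanged, since it only references ${compilationResult.body.name}.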

For more details refer to https://cloud.google.com/dataform/docs/compilation-overrides