Hi Team,
Right now I am trying to orchestrate a single Dataform pipeline from Cloud Workflows, but I couldn't find specific code to do so.
Below is the generic code I found in the Dataform documentation.
main:
    steps:
    - init:
        assign:
        - repository: projects/PROJECT_ID/locations/REPOSITORY_LOCATION/repositories/REPOSITORY_ID
    - createCompilationResult:
        call: http.post
        args:
            url: ${"https://dataform.googleapis.com/v1beta1/" + repository + "/compilationResults"}
            auth:
                type: OAuth2
            body:
                gitCommitish: GIT_COMMITISH
        result: compilationResult
    - createWorkflowInvocation:
        call: http.post
        args:
            url: ${"https://dataform.googleapis.com/v1beta1/" + repository + "/workflowInvocations"}
            auth:
                type: OAuth2
            body:
                compilationResult: ${compilationResult.body.name}
        result: workflowInvocation
    - complete:
        return: ${workflowInvocation.body.name}
Can someone please explain what the createWorkflowInvocation step is doing, and what changes I need to make to the code to invoke a specific Dataform pipeline?
The createWorkflowInvocation step in your code is responsible for triggering the execution of the Dataform pipeline. It does so by sending a POST request to the Dataform API at the /workflowInvocations endpoint.
In your code, the body of the POST request contains a single property, compilationResult, which is set to the name of the compilation result created in the previous step, createCompilationResult. That name identifies the compiled state of your Dataform project, which includes all the SQL code, dependencies, and metadata that represent the current state of your project.
Here's a breakdown of what the createWorkflowInvocation step does:
It calls the http.post function to send a POST request. The URL for this request is constructed from the repository variable (which represents the path to your Dataform repository on Google Cloud) and the /workflowInvocations endpoint.
It uses OAuth2 for authentication.
It includes a body in the POST request, which contains the compilationResult name from the previous createCompilationResult step.
It stores the result of the POST request in the workflowInvocation variable.
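Note that this step only starts the run; the workflow then returns the invocation name without waiting for the run to finish. If you also want the workflow to wait, a rough sketch could replace the final complete step with something like the following. The step names (waitForInvocation, getInvocation, checkState), the invocation variable, and the 30-second interval are my own assumptions and are not part of the sample from the documentation; it relies on the workflow invocation resource exposing a state field such as RUNNING or SUCCEEDED.

```yaml
    # Hypothetical steps, replacing the original "complete" step.
    - waitForInvocation:
        call: sys.sleep
        args:
            seconds: 30   # assumed polling interval
    - getInvocation:
        call: http.get
        args:
            # workflowInvocation.body.name is the full resource name returned by the POST
            url: ${"https://dataform.googleapis.com/v1beta1/" + workflowInvocation.body.name}
            auth:
                type: OAuth2
        result: invocation
    - checkState:
        switch:
            # Keep polling while the run is still in progress
            - condition: ${invocation.body.state == "RUNNING"}
              next: waitForInvocation
    - complete:
        return: ${invocation.body.state}
```

You may want to add a retry limit or handle the other states explicitly before relying on this in production.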
If you want to invoke a specific Dataform pipeline, you can do so by executing selected files conditionally using compilation variables. To do this, you'll need to:
Create a compilation variable and add it to the selected tables or pipelines in your Dataform project.
Set the variable and its value in the codeCompilationConfig block of your Dataform API compilation request:
codeCompilationConfig:
    vars:
        YOUR_VARIABLE: "VALUE"
Replace YOUR_VARIABLE with the name of your variable, for example, executionSetting, and VALUE with the value of the variable for this compilation result.
This variable can then be read within your Dataform project (as dataform.projectConfig.vars.YOUR_VARIABLE) to conditionally execute specific tables or pipelines based on its value.
Then, in the createWorkflowInvocation step, you would pass the name of the compilationResult returned by the compilationResults.create request, as you're already doing. This compilationResult now represents the compiled state of your project with the variable applied, so it covers the specific pipeline you want to invoke.
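Putting this together with your workflow, the only step that should need to change is createCompilationResult. A minimal sketch, assuming a variable named executionSetting with the value "staging" (both are placeholders for whatever you define in your project):

```yaml
    - createCompilationResult:
        call: http.post
        args:
            url: ${"https://dataform.googleapis.com/v1beta1/" + repository + "/compilationResults"}
            auth:
                type: OAuth2
            body:
                gitCommitish: GIT_COMMITISH
                # Compilation variables for this run; name and value are placeholders
                codeCompilationConfig:
                    vars:
                        executionSetting: "staging"
        result: compilationResult
```

The createWorkflowInvocation step stays as it is, since it already references ${compilationResult.body.name}.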
For more details refer to https://cloud.google.com/dataform/docs/compilation-overrides