Hi there,
I have a Jobs table (JOBS_BY_PROJECT from INFORMATION_SCHEMA) in my BigQuery. I want to differentiate and categorize these jobs based on the Dataform routine from which they were executed. I tried identifying it using the labels and job execution ID, but it doesn’t indicate the Dataform workflow.
How could I retrieve the workflow or version of a Dataform job?
Thanks,
Guilherme
Hi @GuilhermeKaneda,
Welcome to Google Cloud Community!
You're right, the BigQuery INFORMATION_SCHEMA.JOBS_BY_PROJECT
table doesn't directly expose the Dataform workflow or version name. You need to use Dataform's audit logs for that specific information. However, you can use the labels on the BigQuery jobs to connect them back to Dataform runs and actions:
INFORMATION_SCHEMA.JOBS_BY_PROJECT
by the dataform_project_id
label to isolate jobs created by your Dataform project.dataform_workflow_invocation_id
and dataform_workflow_execution_action_id
labels to group related BigQuery jobs. All jobs from the same Dataform run will share the same dataform_workflow_invocation_id.dataform_workflow_invocation_id
and, optionally, the dataform_workflow_execution_action_id
and look them up in your Dataform logs or UI. This is where you'll find the specific workflow name, version, and other relevant details like the run start time.I hope the above information is helpful.