Scheduling runs in Dataform / BigQuery

Hello,

Is there a way to schedule runs directly from Dataform with an environments.json file, as described in the doc here? Or do I need to use other Google Cloud services like Scheduler/Workflows or Composer?

Thanks in advance for your reply.

ACCEPTED SOLUTION

schedules.json and environments.json are only read by "legacy" Dataform - Dataform on GCP does not understand them.

You are correct in thinking that currently the only way to schedule Dataform workflows on GCP is with Workflows/Composer or similar - https://cloud.google.com/dataform/docs/schedule-executions-workflows

Native scheduling as part of Dataform on GCP is in the pipeline and will be available before GA.
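
As a rough sketch of what a Workflows or Composer job has to do: the approach on the linked page boils down to two POST requests against the Dataform REST API. The first creates a compilation result, the second creates a workflow invocation that references it. The Python below only builds those two requests and does not send them. The `v1beta1` API version, the helper names, and the sample project, location, and repository values are assumptions for illustration.

```python
import json

# Assumption: v1beta1 was the API version during the preview; adjust if needed.
DATAFORM_API = "https://dataform.googleapis.com/v1beta1"


def repository_path(project_id: str, location: str, repository_id: str) -> str:
    """Build the repository resource name used by both requests."""
    return f"projects/{project_id}/locations/{location}/repositories/{repository_id}"


def compilation_request(repository: str, git_commitish: str = "main") -> dict:
    """Step 1: compile the repository at a branch, tag, or commit SHA."""
    return {
        "method": "POST",
        "url": f"{DATAFORM_API}/{repository}/compilationResults",
        "body": {"gitCommitish": git_commitish},
    }


def invocation_request(repository: str, compilation_result_name: str) -> dict:
    """Step 2: run the compiled graph named in the step 1 response."""
    return {
        "method": "POST",
        "url": f"{DATAFORM_API}/{repository}/workflowInvocations",
        "body": {"compilationResult": compilation_result_name},
    }


if __name__ == "__main__":
    # Hypothetical sample values, not taken from the thread.
    repo = repository_path("my-project", "europe-west1", "my-repo")
    print(json.dumps(compilation_request(repo), indent=2))
    # The "name" field of the step 1 response is what step 2 expects.
    print(json.dumps(invocation_request(repo, f"{repo}/compilationResults/abc123"), indent=2))
```

Both requests need an OAuth2 token for a service account that is allowed to call the Dataform API; the scheduler (Cloud Scheduler, Composer, or similar) only decides when the two steps run.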

RC1

Thank you for your reply.

I added a schedules.json file to my Dataform workspace, as you can see in the screenshot below, but nothing happens: there are no changes to tables/views in BigQuery's SQL Workspace and no logging in Dataform. I got no results either with or without pushing the changes to the remote.

[Screenshot: Capture.PNG]

Before that, I also tried adding an environments.json file (one that includes a schedules section), again with no results.

Any ideas on why it doesn't work?

Thank you. 

Thank you for your answer!
Dataform's a great service by the way, keep up the good work!

Hello, can you please guide me on how to do that?

I'm not at all sure how to obtain the project ID, repository location, and repository name, and I'm receiving a 403 on my service account when executing.
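
For reference, the three values combine into a single repository resource name of the form `projects/PROJECT_ID/locations/REPOSITORY_LOCATION/repositories/REPOSITORY_ID`, which is what the API URLs are built from. A small sketch that checks such a string and splits it back into its parts (the function name and the sample values are hypothetical):

```python
import re

# Shape of a Dataform repository resource name.
_REPOSITORY_RE = re.compile(
    r"^projects/(?P<project_id>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/repositories/(?P<repository_id>[^/]+)$"
)


def parse_repository_name(name: str) -> dict:
    """Split a repository resource name into its three parts."""
    match = _REPOSITORY_RE.match(name)
    if match is None:
        raise ValueError(f"not a repository resource name: {name!r}")
    return match.groupdict()


if __name__ == "__main__":
    # Hypothetical sample values.
    print(parse_repository_name(
        "projects/my-project/locations/europe-west1/repositories/my-repo"
    ))
```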

Hello,
Did you get a result or not? I have the same problem.

Hey @Houssemabd and @robertoctorresf 

I don't know whether you managed to resolve the issue or not. Anyhow, this is for anyone else who might come here in the future.

Adding a role to the workflow service account solved the problem for me.
The needed permissions are

  • dataform.compilationResults.create
  • dataform.workflowInvocations.create

Or simply grant the Dataform Service Agent role.
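
To make the requirement concrete, here is a minimal sketch that compares the permissions a service account is known to have against the two listed above. How the granted permissions are looked up is left out; the function name and the sample input are hypothetical.

```python
# The two permissions named in this post.
REQUIRED_PERMISSIONS = frozenset({
    "dataform.compilationResults.create",
    "dataform.workflowInvocations.create",
})


def missing_permissions(granted):
    """Return the required permissions absent from `granted`, sorted."""
    return sorted(REQUIRED_PERMISSIONS - set(granted))


if __name__ == "__main__":
    # Hypothetical input: a service account that can compile but not invoke.
    print(missing_permissions(["dataform.compilationResults.create"]))
```

An empty result means both permissions are present, so a remaining 403 would have to come from somewhere else.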

See the docs for the compilationResults and workflowInvocations endpoints.
It would be nice if these requirements were mentioned in the Dataform docs.

Hope this helps.

Hi lewish,

I am trying to unpack the pricing structure for Composer & Workflows tied to Dataform scheduling. Could you help me understand what an "internal step" and an "external step" are in this context? Is every executed SQLX file a separate step? Is it counted every time it executes (or fails)?

Thanks!

B.

Hey Lewis, is native scheduling available now? Also, how do I set it up?

Never mind, I found it under workflow configurations. Dataform is really great.