
Dataform Random Action Failure

Issue: My scheduled Dataform workflow intermittently fails near the end of its execution, and I can't find any information about the error message I'm getting.

Error: "Request '/JobService.GetJob' from role 'cloud-dataform-execution-runner' throttled: Service is overloaded (exponential-moving-average-task-load) go/tr-o."

Background Info: 

  • My workflow has around 70 actions in it and typically takes about 1 minute to run. It is scheduled to run every hour. 
  • The execution is typically successful and the failures seem to happen on random days at random times. Last month, it was failing about once a day for about two weeks. Thankfully, the failures have become less frequent but are still happening. 
  • The specific action(s) it fails on seem random but are normally among the last 10 or so actions of the workflow.
  • When I click the job id on the action that failed and look at the details in BigQuery, the actual job seems to have succeeded, so I'm not sure why the action is failing in Dataform.
  • When I notice it fails, I'm able to retry the execution immediately and it succeeds with no issues. 

I'd really love to understand the error message so I can prevent it from happening. In the meantime, I plan to implement some type of retry logic and set up failure notifications.
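
The retry idea mentioned above could be sketched roughly as follows in Python. This is a minimal sketch under stated assumptions, not a Dataform feature: `run_workflow` is a hypothetical callable standing in for however the execution gets triggered, and treating messages containing "throttled" or "overloaded" as retryable is an assumption based only on the error text quoted above.

```python
import random
import time


def is_transient(error: Exception) -> bool:
    """Guess whether an error is worth retrying.

    Assumption: throttling/overload failures can be recognized by their
    message text, as in the error quoted in this post.
    """
    message = str(error).lower()
    return "throttled" in message or "overloaded" in message


def run_with_retry(run_workflow, max_attempts=4, base_delay=5.0, sleep=time.sleep):
    """Call run_workflow(), retrying transient failures with exponential backoff.

    run_workflow is a hypothetical zero-argument callable that raises on failure.
    The delay doubles after each failed attempt, plus random jitter so that
    several retrying clients do not all hit the service at the same moment.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return run_workflow()
        except Exception as error:
            # Give up on the last attempt or on errors that do not look transient.
            if attempt == max_attempts or not is_transient(error):
                raise
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
```

Because the final failure is re-raised rather than swallowed, whatever wraps this call can still send a failure notification once the retries are exhausted.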


JVL
Bronze 3

@mary_advocado I just started receiving this error as well - did you ever get any insight into the issue?

Unfortunately, no. But on the bright side, it did eventually stop happening. I haven't seen the error since shortly after I posted my original question. Sorry I don't have more helpful info for you!