We have a fairly large ETL with about 200 jobs, orchestrated through Dataform and running on BigQuery.
We are using EU-only locations in BQ.
On Monday 17 Oct 2022 our ETL was running as normal.
Since yesterday (Tuesday) morning at 7am CET, and still today, our ETL can't finish. A job that normally takes 5 min 30 sec took 40 minutes.
On top of that, many of our biggest final jobs don't manage to finish at all.
We've checked that there have been no code updates, no table property updates, and no changes in the underlying data.
This looks like a severe performance issue in BQ EU?! Or some kind of update to BQ that breaks our current jobs.
Smaller jobs seem to be doing OK (we haven't measured the change in time, but there are no extreme changes at least).
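For anyone who wants to put numbers on a slowdown like this, here is a minimal Python sketch that compares observed runtimes against a baseline and flags jobs above a threshold. The job names, the `JobRun` class, the 2x threshold and the small-job timings are all illustrative assumptions; only the 5 min 30 sec and 40 min figures come from this post.

```python
from dataclasses import dataclass


@dataclass
class JobRun:
    """One job with its usual runtime and the runtime seen during the incident."""
    name: str          # hypothetical job name, for illustration only
    baseline_s: float  # normal runtime in seconds
    observed_s: float  # runtime seen on the slow day, in seconds

    @property
    def slowdown(self) -> float:
        """How many times slower the observed run is than the baseline."""
        return self.observed_s / self.baseline_s


def flag_regressions(runs, threshold=2.0):
    """Return the runs that are at least `threshold` times slower than baseline."""
    return [r for r in runs if r.slowdown >= threshold]


runs = [
    # Figures from the post: 5 min 30 sec normally, 40 minutes during the incident.
    JobRun("big_final_job", baseline_s=5 * 60 + 30, observed_s=40 * 60),
    # Made-up numbers standing in for a small job with no extreme change.
    JobRun("small_job", baseline_s=60, observed_s=70),
]

for r in flag_regressions(runs):
    print(f"{r.name}: {r.slowdown:.1f}x slower than baseline")
# prints "big_final_job: 7.3x slower than baseline"
```

The threshold is a judgment call; a ratio is used rather than an absolute difference so that short and long jobs can be compared on the same scale.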
Google please help!
@samueltroilius Can you please check the Dataform logs and let us know what the exact error is?
I have seen performance issues where a job of mine that usually takes 1.5-2 mins took 15 mins that day [Ref]
You can go for BigQuery slots [ https://cloud.google.com/bigquery/docs/slots ], which are virtual CPUs that can be reserved for your workloads.
I actually believe this incident https://status.cloud.google.com/incidents/4dy8veZsrCWE6SNhbabD#ZSn1o5G1NLU94mcPpJH3 was the cause.
It turned out to be purely an issue on the BigQuery side of things.
Smaller jobs seemed to work a lot better.
Now everything is performing as normal again.
There were no error logs in Dataform; the runs just timed out because the jobs didn't finish.
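Since Dataform only showed timeouts, the BigQuery job metadata is another place to look for what happened. Below is a minimal Python sketch, assuming the `google-cloud-bigquery` client library is installed, credentials are configured, and the jobs ran in the EU multi-region. The function names, the date window (17 to 20 Oct 2022, taken from the dates in this thread) and the row limit are illustrative choices, not something we actually ran.

```python
def build_slow_jobs_query(region: str, start: str, end: str, limit: int = 20) -> str:
    """Build a query listing the longest-running query jobs in a time window.

    Reads the INFORMATION_SCHEMA.JOBS_BY_PROJECT view for the given region.
    Jobs that are still running have no end_time, so the current time is
    used instead when computing their duration.
    """
    return f"""
SELECT
  job_id,
  state,
  TIMESTAMP_DIFF(COALESCE(end_time, CURRENT_TIMESTAMP()), start_time, SECOND) AS duration_s,
  total_slot_ms,
  error_result.reason AS error_reason
FROM `region-{region}`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time >= TIMESTAMP('{start}')
  AND creation_time < TIMESTAMP('{end}')
  AND job_type = 'QUERY'
ORDER BY duration_s DESC
LIMIT {limit}
"""


def print_slow_jobs(sql: str) -> None:
    """Run the query and print one line per job (needs network access and credentials)."""
    # Imported here so that building the SQL does not require the library.
    from google.cloud import bigquery

    client = bigquery.Client()
    for row in client.query(sql).result():
        print(row.job_id, row.state, row.duration_s, row.total_slot_ms, row.error_reason)


sql = build_slow_jobs_query("eu", "2022-10-17", "2022-10-20")
# print_slow_jobs(sql)  # uncomment to run against your own project
```

Comparing `duration_s` and `total_slot_ms` for the same job across a normal day and a slow day gives a rough indication of whether the job did more work or simply waited longer for capacity.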