Hello everyone! I have a question about Dataform.
My question is: how does this environment handle errors in the pipeline?
Data Fusion has an error collector, which is a very useful way to manage and log errors in the pipeline.
So, does Dataform have something similar to an error collector?
thanks!!!!
Good afternoon.
Workflow errors are captured automatically and can be monitored within GCP Logging, or within the Workflow Execution Logs module in the Dataform platform.
Ok, but is this only for monitoring? Because in Data Fusion, the error collector lets you save errors along with customizable fields that describe the error in the pipeline. Is this the same?
The Workflow Execution Logs view within Dataform is for monitoring and getting a quick overview.
GCP Logging, however, is highly customizable: you can search, filter, analyze, and store log entries, and integrate it with Pub/Sub, Cloud Monitoring, alerting, and more.
Alerting, for example: I have it set up to notify stakeholders when a pipeline they're invested in fails or flags QA issues. The alerts include information pulled from the logs along with actionable items and resolution plans that I customize.
There is a learning curve, so I would not expect someone to jump into it without putting in a lot of time and research.
Hi @theos_E,
Dataform currently does not have a built-in “error collector” like Cloud Data Fusion.
Instead, Dataform relies on:
– The underlying BigQuery job errors (you can view them in BigQuery or in the Dataform execution logs).
– Assertions you write in your Dataform code (SQLX files with `type: "assertion"` in the config block, or the built-in `assertions` options like `nonNull` and `uniqueKey`) to explicitly catch and flag data issues; see the sketch after this list.
– Execution status and logs visible in the Dataform UI or via integrated Cloud Logging (if you set it up).
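Here is a minimal sketch of a manual assertion in SQLX. The `orders` table is a hypothetical source assumed to be declared elsewhere in the project; the idea is that the query returns zero rows when the data is healthy, and any rows it does return cause that action (and the workflow run) to be flagged as failed:

```sqlx
-- definitions/assert_orders_valid.sqlx
-- Manual assertion: the execution fails if this query returns any rows.
config {
  type: "assertion",
  description: "Flags orders with a missing ID or a negative amount"
}

SELECT
  order_id,
  amount,
  "invalid order row" AS failure_reason  -- a descriptive field you control
FROM
  ${ref("orders")}  -- hypothetical source declared elsewhere in the project
WHERE
  order_id IS NULL
  OR amount < 0
```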
There’s no automatic “side stream” or error table like in Data Fusion pipelines; you’d need to handle error patterns manually using custom SQL logic or assertions, for example by routing failing rows into a dedicated error table (a rough sketch follows).
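If you want something closer to Data Fusion’s error collector, one common pattern is a separate Dataform table that captures the failing rows together with descriptive fields. This is only a sketch under the same assumption of a hypothetical `orders` source:

```sqlx
-- definitions/orders_errors.sqlx
-- A manual "error side stream": bad rows are copied here with a reason,
-- while the main model keeps only the clean rows.
config {
  type: "table",
  description: "Rows from orders that failed validation"
}

SELECT
  *,
  CASE
    WHEN order_id IS NULL THEN "missing order_id"
    WHEN amount < 0 THEN "negative amount"
  END AS error_reason,
  CURRENT_TIMESTAMP() AS captured_at
FROM
  ${ref("orders")}
WHERE
  order_id IS NULL
  OR amount < 0
```

Your main model would then filter on the inverse condition, so the two queries together behave roughly like Data Fusion’s main output plus its error port.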