Hi all,
I am new here and hope you can help me.
I just wanted to start using AutoML and thought I'd follow along with the basic Google tutorial for tabular classification: https://www.youtube.com/watch?v=aNWCzyCK4Us
I created a dataset (tabular, from CSV), and it seems to be correct, since all 16 columns and their unique values are recognized.
Then I clicked on training a model and left everything at the default values. Training started but failed after ~10 minutes:
The DAG failed because some tasks failed. The failed tasks are: [exit-handler-1].; Job (project_id = x, job_id = y) is failed due to the above error.; Failed to handle the job: {project_number = x, job_id = y}
In more detail, the node tabular-stats-and-example-gen failed with this message: The replica workerpool0-0 exited with a non-zero status of 255. Termination reason: Error.
I also looked at the log, which states that there are too few resources, but I cannot see anything near 100% on any metric in the Quotas section: "Error: "Dataflow pipeline failed. State: FAILED, Error:\nWorkflow failed. Causes: Error:\n Message: Exceeded limit 'QUOTA_FOR_INSTANCES' on resource 'dataflow-tabular-stats-and-example-04061301-6fr0-harness'. Limit: 24.0\n HTTP Code: 403""
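In case it helps anyone reproduce what I'm reading out of that log: the quota metric and its limit are embedded in the error string itself. A small sketch of how I'm parsing it (the regex is my own assumption about the message format, not anything official from Dataflow):

```python
import re

# Error text copied from the failed Dataflow job (shortened to one line).
error = (
    "Workflow failed. Causes: Error: Message: Exceeded limit "
    "'QUOTA_FOR_INSTANCES' on resource "
    "'dataflow-tabular-stats-and-example-04061301-6fr0-harness'. "
    "Limit: 24.0 HTTP Code: 403"
)

# Pull out which quota was exceeded and at what limit.
m = re.search(r"Exceeded limit '(?P<metric>[^']+)'.*?Limit: (?P<limit>[\d.]+)",
              error, re.S)
if m:
    print(m.group("metric"), m.group("limit"))  # QUOTA_FOR_INSTANCES 24.0
```

So the pipeline apparently tried to spin up more than 24 Dataflow worker instances in the region, even though the quota usage shown in the console never looked close to the limit.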
Looking at the Dataflow section, I see that the node failed with this note:
SDK version: Apache Beam Python 3.8 SDK 2.50.0
The recommendation is to upgrade to 2.55.0 (https://cloud.google.com/dataflow/docs/support/sdk-version-support-status?authuser=1&hl=de&_gl=1*19d...), which I did via Cloud Shell, and I can see that it worked there.
Now, if I start a new run, I get the same behavior. So it seems the AutoML pipeline on Vertex doesn't use the Beam SDK wheel I installed for Dataflow...
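For reference, this is roughly how I checked which Beam wheel is installed locally after the upgrade (purely illustrative; it only reports the version in *this* environment, e.g. Cloud Shell, which the managed pipeline workers apparently don't pick up):

```python
from importlib.metadata import PackageNotFoundError, version

def local_beam_version():
    """Return the apache-beam version installed in this environment, or None."""
    try:
        return version("apache-beam")
    except PackageNotFoundError:
        return None

print(local_beam_version())  # e.g. "2.55.0" after the upgrade, None if absent
```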
Does anybody have an idea? I would be very grateful, since I thought this would be a very easy start, and nothing works...
THANKS, Lennart