DATAFLOW: Job failed with the Shuffle feature enabled

davidregalado25 · 05-04-2023 03:47 PM

Hello everyone!

I have this DAG in Dataflow. The step that fails is "group main_title", which is just a beam.GroupByKey().

As I understand it, when a pipeline performs a GroupByKey, the runner uses the shuffle service, which means the data is moved off the worker VMs and sent to the Dataflow backend, where the shuffle is applied. So why is it failing?
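For readers less familiar with the transform, here is a minimal sketch of what a GroupByKey produces logically. It is plain Python, not Beam code, and the keys and values are hypothetical stand-ins, since the post does not show the pipeline's actual elements.

```python
from collections import defaultdict


def group_by_key(pairs):
    """Collect every value that shares a key into one list.

    This mirrors the logical result of a GroupByKey, not how any
    runner actually executes it.
    """
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


# Hypothetical (main_title, record) pairs; the real elements are not in the post.
pairs = [("title_a", 1), ("title_b", 2), ("title_a", 3)]
grouped = group_by_key(pairs)
```

On a distributed runner, values for the same key may start out on different workers, so bringing them together requires moving data between machines. That regrouping is the shuffle step the question refers to.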

Job info

Job type Batch

Job status Failed

SDK version Apache Beam Python 3.8 SDK 2.37.0

Job region us-central1

Worker location us-central1

Current workers 0

Latest worker status Worker pool stopped.

Start time May 4, 2023 at 12:40:06 PM GMT-5

Elapsed time 1 hr 7 min

Encryption type Google-managed key

Dataflow Prime Disabled

Runner v2 Enabled

Dataflow Shuffle Enabled

Resource Metrics

Current vCPUs 1

Total vCPU time 12.38 vCPU hr

Current memory 3.75 GB

Total memory time 46.425 GB hr

Current HDD PD 25 GB

Total HDD PD time 309.5 GB hr

Current SSD PD 0 B

Total SSD PD time 0 GB hr

Total Shuffle data processed 312.22 GB

Billable Shuffle data processed 93.61 GB

Step info

Elements added 4,798,230

Estimated size 11.37 GB
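To put the figures above in perspective, here is a rough back-of-the-envelope sketch. It assumes 1 GB = 10^9 bytes (the post does not say which unit the console uses) and treats the step's estimated size as the grouped input. The job may contain other shuffle steps, so the ratio is only an orientation value, not a diagnosis.

```python
# Figures copied from the job and step info in this post.
elements_added = 4_798_230
estimated_size_gb = 11.37
total_shuffle_gb = 312.22

# Assumption: decimal gigabytes.
BYTES_PER_GB = 10**9

# Average payload per element entering the failing step.
avg_bytes_per_element = estimated_size_gb * BYTES_PER_GB / elements_added

# How much larger the job-wide shuffle volume is than this step's estimated size.
shuffle_to_step_ratio = total_shuffle_gb / estimated_size_gb

print(f"average element size: {avg_bytes_per_element:.0f} bytes")
print(f"shuffle volume / step size: {shuffle_to_step_ratio:.1f}x")
```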

--
Best regards
David Regalado
