DATAFLOW: teardown method doesn't execute. Please help.

Hello!

I'm working with Google Dataflow using the Apache Beam Python SDK 2.37.0 on Python 3.8. The issue I'm facing is that one of my DoFns seems to execute only partially. As I understand the DoFn lifecycle, an instance goes through the following methods, in this order:

  1. __init__
  2. setup
  3. start_bundle
  4. process
  5. finish_bundle
  6. teardown

Out of these methods, only __init__, setup, and process are being executed. Any idea why?

More context: I'm processing only 1 document from MongoDB, and each of the previous steps executes successfully. I can even see my log.info output for the process method in the logs, but not for the teardown method, where my insert to Elasticsearch takes place.
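For reference, the Beam documentation describes teardown as a best-effort hook: a runner tries to call it but does not guarantee it, so it is meant for releasing resources rather than for writing data. Below is a minimal sketch of the alternative, assuming the write can be moved to finish_bundle. The class name BufferedWriteFn and the write_batch callable are hypothetical stand-ins for my real DoFn and the Elasticsearch insert, and the import fallback only exists so the sketch can be exercised without Beam installed.

```python
import logging

try:
    import apache_beam as beam
    _DoFnBase = beam.DoFn
except ImportError:
    # Fallback so the sketch can be exercised without Beam installed.
    _DoFnBase = object


class BufferedWriteFn(_DoFnBase):
    """Buffers elements per bundle and writes them in finish_bundle."""

    def __init__(self, write_batch):
        super().__init__()
        # write_batch is a hypothetical callable standing in for the
        # Elasticsearch insert; on Dataflow it must be picklable.
        self._write_batch = write_batch
        self._buffer = None

    def setup(self):
        # Called once per DoFn instance; a good place to open clients.
        logging.info("setup")

    def start_bundle(self):
        logging.info("start_bundle")
        self._buffer = []

    def process(self, element):
        logging.info("process: %s", element)
        self._buffer.append(element)

    def finish_bundle(self):
        # Called after the last element of each bundle, so the write
        # happens here instead of relying on teardown.
        logging.info("finish_bundle: writing %d element(s)", len(self._buffer))
        if self._buffer:
            self._write_batch(self._buffer)
        self._buffer = []

    def teardown(self):
        # Best-effort hook: release resources here, do not write data.
        logging.info("teardown")
```

In a pipeline this would be applied as beam.ParDo(BufferedWriteFn(write_batch)). Logging in every hook also makes it easy to see in the worker logs which lifecycle methods the runner actually called.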

This is my DAG:

Screen Shot 2023-05-29 at 21.19.13.png

I've tested with the following machine types:

n1-standard-1: I thought this machine might be so slow that I was checking the results before the teardown method had finished.
n1-highcpu-96: could this machine be too fast for a single document?

UPDATE: I tried the same on Colab and it worked!

Screen Shot 2023-05-29 at 22.22.23.png

--
Best regards
David Regalado
