GCP Batch together with GCP workflow

Hi, my use case is the following. I have three Batch jobs A, B, C to run, and each of them has 1000 tasks. I want to schedule a sequential execution of A, B, C: the tasks within each job still run in parallel, but B only starts running after A fully completes (and likewise C after B).

I am currently using GCP Workflows to do this; however, Workflows has a maximum timeout of 1800 seconds, which is too restrictive. Can the staff suggest an alternative solution? Thanks very much!

@Wen_gcp  @wenyhu @bolianyin @Shamel   @Marramirez 

An alternative could be to use Cloud Composer, which is a managed Apache Airflow service for workflow orchestration. You can find details on the Batch Airflow operator here.

Another alternative is to trigger Workflows from Pub/Sub; a Batch job can post Pub/Sub messages on job state changes. The downside is that this has more moving parts.
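
For illustration, here is a rough sketch of the Batch side of that approach, assuming the job notifications field in the Batch v1 API (the project and topic names are placeholders). The job publishes a message to the topic whenever its state changes, and a Pub/Sub or Eventarc trigger on that topic can then start the workflow for the next job:

# Fragment of the job body passed to jobs.create (topic name is a placeholder)
body:
  taskGroups:
    taskSpec:
      runnables:
        - script:
            text: "echo hello"
  notifications:
    - pubsubTopic: projects/my-project/topics/batch-job-state
      message:
        type: JOB_STATE_CHANGED   # publish a message whenever the job changes state
  logsPolicy:
    destination: CLOUD_LOGGING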

As a daily user of Workflows, I'd like to have what @gradientopt suggested: increasing the timeout to a much longer range. We manage all our pipelines through GCP Workflows. Several computation-intensive steps are executed through Batch, and they usually run for hours. (Batch doesn't have an async API that would let us submit the job and then poll the status.)

It would be an important feature for Workflows to allow long-running operations, so that on the user side we don't have to add more moving parts just to implement a "watcher" job for long-running tasks.

Same issue. Workflows + Batch is practically unusable for us with the 1800-second timeout.
On a related note, this limit is not highlighted at https://cloud.google.com/workflows/docs/tutorials/batch-and-workflows, so we did not discover it until we had implemented a prototype version.

Sharing this thread in which it is advised to file a feature request to the Workflows engineering team.

For those interested in increased Cloud Workflows timeouts, please DM me with the following details to help the Workflows team advise on the best approach.

  • Overall objectives and use case
  • Any system constraints
  • Expected performance/latency requirements
  • Traffic volumes and regions
  • Concurrent operations (if any)
  • Upstream and downstream APIs being used

In the Workflows call to googleapis.batch.v1.projects.locations.jobs.create, add the connector timeout argument:

main:
  params: [event]
  steps:
    - init:
        assign:
          - projectId: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
          - region: "us-central1"        # placeholder region
          - jobId: "my-batch-job"        # placeholder; must be unique per execution
    - my_step:
        call: googleapis.batch.v1.projects.locations.jobs.create
        args:
          parent: ${"projects/" + projectId + "/locations/" + region}
          jobId: ${jobId}
          body:
            taskGroups:
              taskCount: 1
              parallelism: 1
              taskSpec:
                computeResource:
                  cpuMilli: 2000
                  memoryMib: 16384
                runnables:
                  - script:
                      text: "echo hello"
            allocationPolicy:
              instances:
                - policy:
                    provisioningModel: STANDARD
                    machineType: c2-standard-8
            labels:
              key_value: 'label1'
            logsPolicy:
              destination: CLOUD_LOGGING
          connector_params:
            timeout: 14400  # wait up to 4 hours before the connector call to the Batch service times out
        result: my_step_response

For me this does not work; it just says that the maximum limit is 1800 seconds ... Could the staff confirm that this works? Thanks! @robertcarlos @Shamel @bolianyin

Yes, it did work for me after I included

        connector_params:
          timeout: 1209600
          polling_policy:
            initial_delay: 60
            max_delay: 3600
            multiplier: 2

in each job definition.
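
For completeness, here is a rough sketch of how this could apply to the original question of running jobs A, B, and C sequentially. The connector call blocks until each Batch job finishes, so each create step only starts after the previous job has completed; the region, job IDs, and job bodies below are placeholders:

main:
  steps:
    - init:
        assign:
          - projectId: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
          - region: "us-central1"              # placeholder region
    - run_job_a:
        call: googleapis.batch.v1.projects.locations.jobs.create
        args:
          parent: ${"projects/" + projectId + "/locations/" + region}
          jobId: "job-a"                       # placeholder; must be unique per execution
          body:
            taskGroups:
              taskCount: 1000
              parallelism: 1000
              taskSpec:
                runnables:
                  - script:
                      text: "echo running one task of job A"
            logsPolicy:
              destination: CLOUD_LOGGING
          connector_params:
            timeout: 1209600                   # 14 days, as in the reply above
            polling_policy:
              initial_delay: 60
              max_delay: 3600
              multiplier: 2
        result: job_a_result
    # This step only starts after the previous create call returns, i.e. after job A completes.
    - run_job_b:
        call: googleapis.batch.v1.projects.locations.jobs.create
        args:
          parent: ${"projects/" + projectId + "/locations/" + region}
          jobId: "job-b"                       # placeholder
          body:
            taskGroups:
              taskCount: 1000
              parallelism: 1000
              taskSpec:
                runnables:
                  - script:
                      text: "echo running one task of job B"
            logsPolicy:
              destination: CLOUD_LOGGING
          connector_params:
            timeout: 1209600
            polling_policy:
              initial_delay: 60
              max_delay: 3600
              multiplier: 2
        result: job_b_result
    # - run_job_c: repeat the same pattern for job C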

We have a similar issue. Our system receives multiple files from a source and stores them in Cloud Storage. The Cloud Storage state change is published via Cloud Pub/Sub, which triggers a Cloud Workflows execution. If multiple files arrive before the current workflow execution completes, we would like the workflow to not accept the new trigger event. How can we achieve this?