Hi, my use case is the following. I have three Batch jobs A, B, C to run, each with 1000 tasks. I want to schedule a sequential execution of A, B, C: the tasks within each of A, B, C still run in parallel, but B only starts after A fully completes (and C after B).
I am currently using GCP Workflows to do this; however, Workflows has a max timeout of 1800 seconds, which is too restrictive. Can the staff suggest an alternative solution? Thanks very much!
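To make the intent concrete, this is roughly the shape of workflow I have in mind (a minimal sketch with placeholder job IDs, region, and task scripts; the Batch connector only returns once a job reaches a terminal state, which is what gives the A-then-B-then-C ordering and also where the 1800-second limit bites):

main:
  params: [event]
  steps:
    - init:
        assign:
          - projectId: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
          - region: "us-central1"
    - run_job_a:
        call: googleapis.batch.v1.projects.locations.jobs.create
        args:
          parent: ${"projects/" + projectId + "/locations/" + region}
          jobId: "job-a"            # placeholder; job IDs must be unique
          body:
            taskGroups:
              - taskCount: 1000
                parallelism: 100    # illustrative; raise toward 1000 as quota allows
                taskSpec:
                  runnables:
                    - script:
                        text: "echo task of job A"
        result: job_a               # the connector blocks here until job A finishes
    - run_job_b:
        call: googleapis.batch.v1.projects.locations.jobs.create
        args:
          parent: ${"projects/" + projectId + "/locations/" + region}
          jobId: "job-b"
          body:
            taskGroups:
              - taskCount: 1000
                parallelism: 100
                taskSpec:
                  runnables:
                    - script:
                        text: "echo task of job B"
        result: job_b               # same for job B
    - run_job_c:
        call: googleapis.batch.v1.projects.locations.jobs.create
        args:
          parent: ${"projects/" + projectId + "/locations/" + region}
          jobId: "job-c"
          body:
            taskGroups:
              - taskCount: 1000
                parallelism: 100
                taskSpec:
                  runnables:
                    - script:
                        text: "echo task of job C"
        result: job_c
    - done:
        return: "A, B and C completed"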
An alternative could be to use Cloud Composer, which is a managed Apache Airflow for workflow management. You can find details on the Batch Airflow operator here.
Another alternative is to trigger Workflows from Pub/Sub; a Batch job can publish Pub/Sub messages on job state changes. The downside is that it has more moving parts.
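For example, the job body passed to jobs.create can declare Pub/Sub notifications on state changes, roughly like this (a sketch; the topic name is a placeholder, and the exact fields should be checked against the Batch API reference):

body:
  taskGroups:
    - taskCount: 1000
      taskSpec:
        runnables:
          - script:
              text: "echo hello"
  notifications:
    - pubsubTopic: "projects/my-project/topics/batch-job-state"   # hypothetical topic
      message:
        type: JOB_STATE_CHANGED   # publish whenever the job changes state

An Eventarc or Pub/Sub trigger on that topic can then start the workflow execution that submits the next job.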
As a daily user of Workflows, I'd like to have what @gradientopt suggested: increasing the timeout to a much longer range. We manage all our pipelines through GCP Workflows. Several computation-intensive steps are executed through Batch, and they usually run for hours. (Batch doesn't have an async API that lets us submit the job and then poll its status.)
It would be an important feature for Workflows to allow long-running operations, so that on the user side we don't have to add more moving parts just to implement a "watcher" job for the long-running tasks.
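For context, the kind of watcher we'd rather not maintain looks roughly like this: submit the job without waiting (for example with the connector's skip_polling option, or via the API directly), then poll its status in a loop. A minimal sketch as a subworkflow, assuming jobName holds the job's full resource name:

wait_for_batch_job:
  params: [jobName]   # e.g. projects/PROJECT/locations/REGION/jobs/JOB_ID
  steps:
    - poll:
        call: googleapis.batch.v1.projects.locations.jobs.get
        args:
          name: ${jobName}
        result: job
    - check:
        switch:
          - condition: ${job.status.state == "SUCCEEDED" or job.status.state == "FAILED"}
            next: done
    - wait:
        call: sys.sleep
        args:
          seconds: 60   # poll once a minute
        next: poll
    - done:
        return: ${job.status.state}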
Same issue. Workflows + Batch is practically unusable for us with the 1800-second timeout.
On a related note, this is not highlighted on https://cloud.google.com/workflows/docs/tutorials/batch-and-workflows, so the limit was not discovered until we implemented a prototype version.
Sharing this thread in which it is advised to file a feature request to the Workflows engineering team.
For those interested in increased Cloud Workflow timeouts, please DM me with the following details to help inform the Workflow team how to best advise.
In the Workflows call to googleapis.batch.v1.projects.locations.jobs.create, add a connector_params timeout:

main:
  params: [event]
  steps:
    - init:
        assign:
          - projectId: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
          - region: "us-central1"       # placeholder region
          - jobId: "my-batch-job"       # must be unique within the project and region
    - my_step:
        call: googleapis.batch.v1.projects.locations.jobs.create
        args:
          parent: ${"projects/" + projectId + "/locations/" + region}
          jobId: ${jobId}
          body:
            taskGroups:
              - taskCount: 1
                parallelism: 1
                taskSpec:
                  computeResource:
                    cpuMilli: 2000
                    memoryMib: 16384
                  runnables:
                    - script:
                        text: "echo hello"
            allocationPolicy:
              instances:
                - policy:
                    provisioningModel: STANDARD
                    machineType: c2-standard-8
            labels:
              key_value: 'label1'
            logsPolicy:
              destination: CLOUD_LOGGING
          connector_params:
            timeout: 14400   # 4 hours, so the connector call does not time out before the Batch job completes
        result: my_step_response
For me this does not work; it just says that the maximum limit is 1800 seconds... Could the staff confirm that this would work? Thanks! @robertcarlos @Shamel @bolianyin
Yes, it did work for me after I added

connector_params:
  timeout: 1209600
  polling_policy:
    initial_delay: 60
    max_delay: 3600
    multiplier: 2

to each job definition.
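For anyone wondering where that goes: connector_params sits inside the call's args, next to body. A minimal sketch with a placeholder job spec:

- my_step:
    call: googleapis.batch.v1.projects.locations.jobs.create
    args:
      parent: ${"projects/" + projectId + "/locations/" + region}
      jobId: ${jobId}
      body:
        taskGroups:
          - taskSpec:
              runnables:
                - script:
                    text: "echo hello"
      connector_params:
        timeout: 1209600       # wait up to 14 days for the job to finish
        polling_policy:
          initial_delay: 60    # seconds before the first status check
          max_delay: 3600      # cap on the delay between checks
          multiplier: 2        # exponential backoff factor
    result: my_step_response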
We have a similar issue. Our system receives multiple files from a source, and the files are stored in Cloud Storage. The Cloud Storage state change is published via Cloud Pub/Sub, which triggers a Cloud Workflow. If multiple files are received before the current workflow execution completes, we would like Workflows to not accept the trigger event. How can we achieve this?