How to disable concurrent execution of batch jobs

I'm currently using a GCP workflow for batch job processing, set to run every 30 minutes. However, I'm facing an issue where multiple jobs are sometimes created while old ones are still executing. How can I disable concurrency to ensure that only one job runs at a time? I want to avoid situations where new jobs are created while a job is still running. Any guidance on how to achieve this within the GCP workflow setup would be greatly appreciated.


 - createAndRunBatchJob:
     call: http.post
     args:
       url: ${batchApiUrl}
       query:
         job_id: ${jobId}
       headers:
         Content-Type: application/json
       auth:
         type: OAuth2
       body:
         taskGroups:
           - taskSpec:
               runnables:
                 - container:
                     imageUri: ${imageUri}
                     commands:
                       - "--script-location"
                       - "/mnt/disks/batch-scripts-${project_sha}/google_tv_aep/batch/code"
                   environment:
                     variables:
                       job_id: ${jobId}
                       secret: projects/${project}/secrets/${secret_nm}/versions/latest
                       local_path_1: /mnt/disks/${local_path_1}
                       local_path_2: /mnt/disks/${local_path_2}
                       query_file: "tv_mkt_campaign_events_load_query.sql"
                       output_dataset_name: "tv_mkt_campaign_events_incremental_data.json"
                       project: "${project}"
                       aep_sandbox: "${aep_sandbox}"
                       aep_dataset_name: "TV mkt campaign Events"
                       load_mode: "incremental"
                       aep_connection_id: "${aep_connection_id}"
                       checkpoint_file: "mkt_campaign_checkpoint.param"
                       aep_flow_id: "${aep_flow_id}"
                       checkpoint_field: timestamp
               volumes:
                 - mountPath: /mnt/disks/batch-scripts-${project_sha}
                   gcs:
                     remotePath: batch-scripts-${project_sha}
                 - mountPath: /mnt/disks/${bucket_nm}
                   gcs:
                     remotePath: ${bucket_nm}
               computeResource:
                 cpuMilli: 2000
                 memoryMib: 16384
             taskCount: 1
             parallelism: 2
         allocationPolicy:
           network:
             networkInterfaces:
               - network: projects/${project}/global/networks/spark-network
                 subnetwork: projects/${project}/regions/${location}/subnetworks/spark-subnet-pr
                 noExternalIpAddress: true
           serviceAccount:
             email: ${builder}
         logsPolicy:
           destination: CLOUD_LOGGING
     result: createAndRunBatchJobResponse

1 ACCEPTED SOLUTION

Hi @sugesh,

Below is one example of submitting a new Job when all listed existing Jobs are completed.

[Screenshot: Workflows YAML that lists the existing Batch jobs and submits a new job only when all listed jobs have completed]
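In case the attached screenshot does not render, here is a rough sketch of that idea in Workflows YAML. This is a sketch only, not the exact code from the screenshot: it assumes `batchApiUrl` points at the `.../projects/PROJECT/locations/LOCATION/jobs` collection, and that the final step is the `createAndRunBatchJob` step from the question; the step names and the choice of "active" states are assumptions.

```
main:
  steps:
    # List the existing Batch jobs in this project/location.
    - listBatchJobs:
        call: http.get
        args:
          url: ${batchApiUrl}
          auth:
            type: OAuth2
        result: listResponse
    - init:
        assign:
          - activeJobs: 0
    # Count jobs that have not finished yet (see the Job state
    # reference linked below for the full list of states).
    - countActiveJobs:
        for:
          value: job
          in: ${default(map.get(listResponse.body, "jobs"), [])}
          steps:
            - checkState:
                switch:
                  - condition: ${job.status.state == "QUEUED" or job.status.state == "SCHEDULED" or job.status.state == "RUNNING"}
                    assign:
                      - activeJobs: ${activeJobs + 1}
    # Skip this cycle if anything is still active.
    - decide:
        switch:
          - condition: ${activeJobs > 0}
            return: "An earlier job is still active; skipping this run."
    # Otherwise submit the new job as in the question.
    - createAndRunBatchJob:
        call: http.post
        args:
          url: ${batchApiUrl}
          query:
            job_id: ${jobId}
          auth:
            type: OAuth2
          # body: same taskGroups/allocationPolicy/logsPolicy as in the question
        result: createAndRunBatchJobResponse
```

Note that listing jobs and then submitting is not atomic; with a single 30-minute trigger that is usually fine, but two triggers firing close together could still race.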

Ref:

- https://cloud.google.com/batch/docs/reference/rest/v1/projects.locations.jobs#State
- https://cloud.google.com/workflows/docs/reference/syntax/conditions#yaml

Hope this helps,

Wenyan


4 REPLIES

Hi @sugesh,

If you want the tasks of a single job to be executed in sequential order, you can set the task group's `schedulingPolicy` field to `IN_ORDER` in the Batch API:
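For reference, a minimal sketch of where that field lives in the createJob request body (the image URI and task count below are placeholders, not values from the question):

```
taskGroups:
  - taskSpec:
      runnables:
        - container:
            imageUri: gcr.io/my-project/my-image   # placeholder
    taskCount: 3
    # Run this group's tasks one at a time, in index order,
    # instead of the default AS_SOON_AS_POSSIBLE:
    schedulingPolicy: IN_ORDER
```

This only serializes tasks within one job; it does not prevent two jobs from running at the same time.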

 
If you want jobs not to run concurrently, Batch is currently working on supporting this on the Batch API side. We'll let you know once it's ready.
 
Until then, you can add a condition check on your side in the Workflow source: check the status of the existing jobs, and decide whether to submit a new job based on the status of the previously submitted ones.
 
Thanks!
Wenyan

@wenyhu Thanks for the reply. While I wait for feature support in the Batch API, I'll take your suggestion of adding a conditional check in the workflow; do you have any sample I can refer to?


@wenyhu Thanks much!