
Support for multiple "taskSpec" within "taskGroups"

Hi,

I have been using Batch for a little while now. I came across a use case that I would have expected to be covered, judging by the API specs, but it does not seem to be supported at the moment.

I want to run 2 different tasks with different configurations in the same Batch job. I created a Batch job in which one container (a single runnable in a "taskSpec" definition) runs multiple times in parallel, and another container needs to be defined with different specs; see the example below (note that I use Workflows to create the Batch job, so the example config is in YAML).

 

 

taskGroups:
# Call script 1
- taskSpec:
    runnables:
    - container:
        imageUri: ${"europe-west1-docker.pkg.dev/" + sys.get_env("GOOGLE_CLOUD_PROJECT_ID") + "/my_repo/my_image:latest"}
        entrypoint: "/entrypoint_1.sh"
    computeResource:
      cpuMilli: 2000
      memoryMib: 1024
    maxRetryCount: 1
    maxRunDuration: 1800s
  taskCount: 31
  parallelism: 31
# Call script 2
- taskSpec:
    runnables:
    - container:
        imageUri: ${"europe-west1-docker.pkg.dev/" + sys.get_env("GOOGLE_CLOUD_PROJECT_ID") + "/my_repo/my_image:latest"}
        entrypoint: "/entrypoint_2.sh"
    computeResource:
      cpuMilli: 2000
      memoryMib: 1024
    maxRetryCount: 1
    maxRunDuration: 300s
  taskCount: 1
  parallelism: 1
allocationPolicy:
  instances:
  - policy:
      machineType: e2-standard-2
  network:
    network_interfaces:
      network: ${"projects/" + sys.get_env("GOOGLE_CLOUD_PROJECT_ID") + "/global/networks/my_network"}
      no_external_ip_address: false
labels:
  env: prod
logsPolicy:
  destination: CLOUD_LOGGING

 

When I try to create this batch job I get this error:  "task_groups field is invalid. Only one task group per job is supported error : <nil>".

If I run only one or the other, the Batch job completes successfully with no error.

Any plan to support this?

Thank you very much

 

Best Regards


Hello,

Thanks for using Batch and providing us the feedback!

We are evaluating a list of new features, and support for multiple taskGroups is one of them. To help us prioritize, would you mind sharing more about your requirements for multiple taskGroups per job? In particular, how does that compare with two separate jobs, each containing one taskGroup?

Hi,

My use case is that we have a container dedicated to performing a certain set of actions. Some of these can be bundled into a single script (entrypoint_1.sh) that uses BATCH_TASK_INDEX to pick up a different part of the work at runtime. Other actions are quite different and, although it is possible to write a more complex script that handles them too, that approach is fragile and problematic. Hence the solution of having a different script (entrypoint_2.sh) to handle that case. You can see that the two use different parallelism, but they are completely independent of each other. At the moment we use 2 Batch jobs because it is the only solution, but it would make much more sense to be able to keep them together logically. It would also be more efficient, because there would be no need to create a new instance group and new VMs.
Moreover, I would propose the ability to cap the maximum number of instances per Batch job, so that even with multiple taskGroups you can easily control the total job resources.
 
Happy to provide more details if necessary
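
For anyone reading along, here is a minimal sketch of the BATCH_TASK_INDEX approach described above, written in Python rather than as our actual shell entrypoint. BATCH_TASK_INDEX and BATCH_TASK_COUNT are environment variables that Batch sets for each task; the function name `items_for_task` and the list of work items are hypothetical and only there for illustration.

```python
import os


def items_for_task(items, task_index, task_count):
    """Return the items this task should process, assigned round-robin by index."""
    return [item for i, item in enumerate(items) if i % task_count == task_index]


def main():
    # Batch sets these variables inside each task; the defaults allow a local run.
    task_index = int(os.environ.get("BATCH_TASK_INDEX", "0"))
    task_count = int(os.environ.get("BATCH_TASK_COUNT", "1"))

    # Hypothetical work list: one item per day of a 31-day month.
    work_items = [f"day-{d:02d}" for d in range(1, 32)]

    for item in items_for_task(work_items, task_index, task_count):
        print(f"task {task_index} processing {item}")


if __name__ == "__main__":
    main()
```

With taskCount set to 31 and 31 work items, each task ends up with exactly one item. The work that does not fit this pattern is what entrypoint_2.sh handles.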

Any progress on it?

Unfortunately, we don't have a timeline for this feature yet.
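
In the meantime, the workaround mentioned earlier in the thread (two separate jobs, each with one taskGroup) can be generated from a single job definition. The following is only a sketch under assumptions: `split_task_groups` is a hypothetical helper, not part of any Google library, and it works on the job body as a plain dictionary before that body is passed to whatever creates the job.

```python
import copy


def split_task_groups(job_body):
    """Return one job body per task group, each keeping the shared job-level settings."""
    shared = {key: value for key, value in job_body.items() if key != "taskGroups"}
    bodies = []
    for group in job_body.get("taskGroups", []):
        body = copy.deepcopy(shared)              # allocationPolicy, labels, logsPolicy, ...
        body["taskGroups"] = [copy.deepcopy(group)]  # exactly one task group per job
        bodies.append(body)
    return bodies


# Trimmed-down version of the job from the original post.
job = {
    "taskGroups": [
        {"taskSpec": {"maxRunDuration": "1800s"}, "taskCount": 31, "parallelism": 31},
        {"taskSpec": {"maxRunDuration": "300s"}, "taskCount": 1, "parallelism": 1},
    ],
    "labels": {"env": "prod"},
    "logsPolicy": {"destination": "CLOUD_LOGGING"},
}

single_group_jobs = split_task_groups(job)
```

Each entry in `single_group_jobs` can then be submitted as its own job. Note that this does not remove the overhead of a separate instance group and VMs per job, which was the original concern.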

Is Google Workflows backed by an open-source product?

@bolianyin I would be really happy to implement this feature if you open the code. 😀

@angelcervera Appreciate your willingness to help, but Google Batch is not backed by an open source project.