Hello! I started using the batch job service and ran into a problem. I have a web server that launches a job on each call; the jobs are independent and unrelated to each other.
After being launched, the jobs are queued instead of starting right away, which significantly increases the total processing time. Is it possible to run jobs in parallel?
I tried setting the parallelism parameter to 3 so that 3 jobs could run simultaneously, but it had no effect.
Some background information:
instance: n1-standard-4, GPU: T4.
{
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [
                    {
                        "container": {
                            "imageUri": "some_value",
                            "entrypoint": "some_value",
                            "commands": ["some_value"]
                        }
                    }
                ],
                "computeResource": {"some_values"},
            },
            "taskCount": 1,
            "parallelism": 3,
        }
    ],
    "allocationPolicy": {
        "instances": [
            {
                "installGpuDrivers": True,
                "policy": {
                    "machineType": "n1-standard-4",
                    "provisioningModel": "STANDARD",
                    "accelerators": [
                        {
                            "type": "nvidia-tesla-t4",
                            "count": 1
                        }
                    ]
                }
            }
        ]
    },
    "labels": {some_value}
}
Could this be because only 1 T4 is used?
This thread can be closed; I found the problem.
For anyone who runs into this in the future: check your quotas. It is likely that one of them has reached its limit.
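To illustrate why an exhausted quota makes independent jobs queue, here is a rough Python sketch of the arithmetic. The quota names, limits, and usage figures are made-up assumptions for illustration, not values from this project, and concurrent_job_capacity is a hypothetical helper, not part of any Google Cloud library. The per-job needs are taken from the config above (n1-standard-4 has 4 vCPUs, plus 1 T4).

```python
def concurrent_job_capacity(quotas, per_job):
    """Estimate how many more jobs fit under the given quotas.

    quotas:  {quota_name: (limit, current_usage)}  (illustrative values)
    per_job: {quota_name: amount one job consumes}
    Returns (number_of_jobs, name_of_the_limiting_quota).
    """
    capacity = {}
    for name, need in per_job.items():
        if need <= 0:
            continue  # a job that needs none of this resource is not limited by it
        limit, usage = quotas[name]
        capacity[name] = max(0, int((limit - usage) // need))
    limiting = min(capacity, key=capacity.get)
    return capacity[limiting], limiting


# Hypothetical regional quotas: plenty of CPUs, but only one T4.
quotas = {"CPUS": (24, 0), "NVIDIA_T4_GPUS": (1, 0)}
# One job on n1-standard-4 with a single T4 attached.
per_job = {"CPUS": 4, "NVIDIA_T4_GPUS": 1}

jobs, limiting = concurrent_job_capacity(quotas, per_job)
print(f"{jobs} job(s) can run at once; limited by {limiting}")
```

With these assumed numbers, the GPU quota allows only one job at a time even though the CPU quota would allow six, so every additional job waits in the queue until the first one releases its GPU.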
Another thing to confirm is that taskCount is set as intended. If you want 3 tasks to run in parallel, taskCount has to say that 3 tasks exist; parallelism only caps how many of a group's tasks run at the same time. See the field descriptions in TaskGroup.
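As a small sketch of how the two fields interact, the hypothetical Python helper below (not part of any client library) computes how many tasks of a single task group could run at once. It assumes parallelism acts purely as an upper bound on the group's own tasks, and it ignores quotas and any service-side limits.

```python
def effective_parallelism(task_group):
    """Upper bound on simultaneously running tasks within one task group.

    Assumption: parallelism only caps the tasks that taskCount creates,
    so it can never exceed taskCount. Quotas are not considered here.
    """
    task_count = int(task_group.get("taskCount", 1))
    parallelism = int(task_group.get("parallelism", task_count))
    return min(task_count, parallelism)


# The task group from the question: one task, parallelism 3.
original = {"taskCount": 1, "parallelism": 3}
# Same group with three tasks declared.
adjusted = {"taskCount": 3, "parallelism": 3}

print(effective_parallelism(original))  # 1
print(effective_parallelism(adjusted))  # 3
```

Note that this applies to tasks inside one job. Separate jobs launched by the web server are scheduled independently of each other, so for them the quota check above is the more relevant one.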