Batch jobs - Get task retry attempt

Is there any way to get a job task retry attempt by reading an environment variable?


Hi @thiago-oliveira,

Batch supports task retries, as described at https://cloud.google.com/batch/docs/automate-task-retries.

Is there a hard requirement that you use an environment variable?

Thanks!

I'm already using task retries as suggested. What I need is the current retry attempt. I couldn't find anything about it in the documentation.

Usually things like this are made available to the job context through an env variable, the way the task index is made available through BATCH_TASK_INDEX.

Thanks!

Hi @thiago-oliveira,

Unfortunately, Batch does not yet provide a default environment variable for the task retry attempt. We support variables such as `BATCH_TASK_INDEX`, `BATCH_TASK_UID`, and `BATCH_TASK_COUNT`.
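
For readers who want to see what is available today, here is a minimal Python sketch that reads the three variables named above from inside a task. The helper name `batch_task_context` and the returned dictionary keys are hypothetical, and treating the index and count as integers is an assumption; only the variable names come from this thread.

```python
import os


def batch_task_context(env=None):
    """Collect the Batch-provided task variables mentioned in this thread.

    `env` defaults to os.environ; passing a plain dict makes this easy to test.
    Values that are not set come back as None.
    """
    env = os.environ if env is None else env

    def as_int(name):
        # Assumption: the index and count are decimal integers when present.
        value = env.get(name)
        return int(value) if value is not None else None

    return {
        "task_index": as_int("BATCH_TASK_INDEX"),
        "task_count": as_int("BATCH_TASK_COUNT"),
        "task_uid": env.get("BATCH_TASK_UID"),
    }


if __name__ == "__main__":
    print(batch_task_context())
```

Note that none of these values changes between retries of the same task, which is why the retry attempt cannot be derived from them.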

If you want to know which task retry attempt is running, you need to look at the task details with the GetTask API, for example in gcloud: `gcloud batch tasks describe projects/271233643591/locations/us-central1/jobs/j24012200/taskGroups/group0/tasks/0`.

In the task details, if your task is retrying, you can tell which retry attempt is current from the latest events that mention "attempt". For example, suppose the latest task event is `- description: Attempt 0 failed on zones/us-central1-f/instances/7353272055871263220`. This means attempt index 0 has already failed, and the task is now on attempt index 1, which is the second attempt. If the "attempt" event is followed by an event such as `- description: Task state is updated from RUNNING to FAILED on zones/us-central1-f/instances/7353272055871263220`, no more retries will happen and the task has already finished as "FAILED".
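
The rule above can be sketched in Python roughly as follows. This is an illustration, not an official approach: the function name `infer_attempt` is hypothetical, the event descriptions are assumed to be passed in oldest first, and the two patterns are taken from the example strings in this reply, whose exact wording may change.

```python
import re

# Patterns copied from the example event descriptions quoted above.
# Assumption: the wording of these events is stable enough to match on.
ATTEMPT_FAILED = re.compile(r"Attempt (\d+) failed")
TERMINAL_FAILED = re.compile(r"Task state is updated from RUNNING to FAILED")


def infer_attempt(descriptions):
    """Infer the current attempt index from task event descriptions.

    `descriptions` is a list of event description strings, oldest first.
    Returns the zero-based index of the attempt now in progress, or None
    if the events show the task has finished as FAILED (no more retries).
    """
    last_failed = -1
    terminal = False
    for text in descriptions:
        match = ATTEMPT_FAILED.search(text)
        if match:
            last_failed = int(match.group(1))
            terminal = False
        elif TERMINAL_FAILED.search(text):
            terminal = True
    if terminal:
        return None
    # If attempt N was the last to fail, attempt N + 1 is the current one.
    return last_failed + 1


if __name__ == "__main__":
    events = ["Attempt 0 failed on zones/us-central1-f/instances/7353272055871263220"]
    print(infer_attempt(events))
```

With no "attempt" events at all, the sketch reports attempt index 0, meaning the first attempt.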

If this does not help in your situation, would you mind describing your request and requirements in more detail? We can consider supporting your request if it fits our priorities and helps Batch users.

Thanks!

Hey @wenyhu,

I ended up abstracting this logic to our domain, but I can describe my request. It might help others.

The maximum number of retry attempts allowed for a Batch job is 10. Our business model requires more than that. To achieve this, I need to know the current retry attempt, so that at the last one, or at the halfway point, I can take some action.

Since Batch already exposes some env vars related to the job context, like the task index, it would make sense to expose the current retry attempt the same way, hence my request to get it through an env variable.

This is also how other cloud providers handle this situation, so people might be used to it.

Hope this helps!

Thanks!

Hi @thiago-oliveira,

Thanks for your valuable feedback!

Batch has decided to prioritize this feature and expose the task retry attempt as an additional environment variable. I'll let you know once we have this delivered.

Thanks!