Let's say that I have 10000 tasks in a job. Is it right to say that setting parallelism to 1000 (the maximum allowed) is always the best solution? It would save time, and the cost would be the same as setting parallelism to 1, since both would incur the same CPU hours. Could the staff members @Wen_gcp @wenyhu @bolianyin @Shamel confirm that my understanding is correct? Thanks!
"Let's say that I have 10000 tasks in a job, is it right to say that setting parallelism to 1000 (the maximum allowed) is always the best solution?" I think not.
Could some staff members help us clarify what the best practice would be? @Wen_gcp @wenyhu @bolianyin @Shamel Thanks!
@gradientopt You are correct that setting a higher parallelism will reduce the time needed to finish the job, and the cost should be similar to that of a lower parallelism value. The cost won't be exactly the same, because each VM has additional overhead. By default, Batch uses a high parallelism value, up to 1000, to get the result faster. However, it may not be the best option for all jobs. If you prefer a different value, you should set 'parallelism' in the API.
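To make the trade-off concrete, here is a rough back-of-the-envelope sketch in Python. It is only a toy model, not how Batch actually schedules or bills: it assumes one VM per parallel slot, one task per VM at a time, equal task durations, and a fixed per-VM overhead. The function name `estimate` and the numbers in the usage note are made up for illustration.

```python
import math


def estimate(task_count, parallelism, task_minutes, vm_overhead_minutes):
    """Toy estimate of wall-clock time and total billed VM minutes.

    Assumptions (hypothetical, not taken from the Batch documentation):
    one VM per parallel slot, one task per VM at a time, all tasks take
    the same time, and every VM adds a fixed overhead (boot, setup).
    """
    # No point in using more VMs than there are tasks.
    vms = min(parallelism, task_count)
    # Tasks run in "waves" of up to `vms` tasks each.
    waves = math.ceil(task_count / vms)
    wall_minutes = waves * task_minutes + vm_overhead_minutes
    # Compute time is the same regardless of parallelism;
    # only the per-VM overhead grows with the number of VMs.
    billed_vm_minutes = task_count * task_minutes + vms * vm_overhead_minutes
    return wall_minutes, billed_vm_minutes
```

With 10000 tasks of 6 minutes each and 3 minutes of overhead per VM, `estimate(10000, 1, 6, 3)` and `estimate(10000, 1000, 6, 3)` differ hugely in wall-clock time but only by the extra per-VM overhead in billed minutes, which matches the point above that the cost is similar but not identical.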