I am mostly running Kubernetes Jobs, rather than constant Deployments, in my GKE Autopilot cluster. As a result, most of the time there are no pods running (besides the control plane) and only one node available. But then often 4-5 Jobs come in at once and start up pods. That usually means there are not enough CPU resources available, so the Autopilot cluster spins up additional nodes. By itself that introduces latency I am OK with. But often there is a succession of bursts of Kubernetes Jobs at roughly the same time, so more and more nodes get spun up. Is it possible to somehow make sure the nodes that the GKE Autopilot cluster starts up are larger, so they can accommodate the pods for all those Jobs, instead of constantly starting up new nodes?
Regards,
Dolf.
The following strategies can help Autopilot provision larger nodes and handle bursty Job workloads more smoothly:
Increase Requests: Set higher CPU and memory requests on your Job pods; Autopilot provisions compute based on pod requests, so larger per-pod requests signal it to bring up larger nodes (see the example Job manifest after this list).
Adjust Limits (Optional): Set appropriate limits to ensure fair resource allocation and prevent pods from overconsuming resources.
Lower Target: Decrease the targetUtilization field in the Autopilot configuration to favor larger nodes with more available resources.
Caution: This might lead to underutilized nodes during low-demand periods.
Apply Constraints: Use pod topology spread constraints to distribute pods across multiple nodes, reducing the likelihood of resource contention and frequent node scale-ups (a sketch follows this list).
Consider Switch: If Autopilot's behavior doesn't fully align with your workload patterns, consider switching to a GKE Standard cluster, where node pools give you more granular control over machine sizes and scaling.
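As a minimal sketch of the first two points, here is a Job manifest with explicit requests and limits. The name burst-job, the image, the parallelism, and the resource values are placeholders to adjust for your workload:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: burst-job                 # placeholder name
spec:
  parallelism: 4                  # example: run 4 pods for the burst
  completions: 4
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: worker
          image: gcr.io/my-project/worker:latest   # placeholder image
          resources:
            requests:
              cpu: "2"            # higher per-pod request signals Autopilot to provision more capacity
              memory: 4Gi
            limits:
              cpu: "2"            # keep limits in line with requests for predictable scheduling
              memory: 4Gi
```

Since Autopilot provisions (and bills) based on pod requests, raising the per-pod requests is the most direct lever you have on the capacity that gets spun up for each burst.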
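And one possible shape of a topology spread constraint added to the same pod template; the app: burst-worker label and the maxSkew value are illustrative assumptions, not required names:

```yaml
# Pod template excerpt (goes under spec.template in the Job)
metadata:
  labels:
    app: burst-worker                          # assumed label for matching the Job's pods
spec:
  topologySpreadConstraints:
    - maxSkew: 2                               # allow at most 2 more pods on one node than another
      topologyKey: kubernetes.io/hostname      # spread across nodes
      whenUnsatisfiable: ScheduleAnyway        # prefer spreading, but don't block scheduling
      labelSelector:
        matchLabels:
          app: burst-worker
```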