
Would GKE fit my use case? Unique application scaling needs

I am building a lightweight e-learning platform for students.

The application has unique scaling needs. Here are its characteristics:

  1. Students listen to courses.
  2. 1 Pod = 1 course.
  3. Most of the time each student maps to a single course; since this is an early MVP, it's not like 100 students are live on a single course.
  4. During US working hours there are about 300 Pods running at any given time.
  5. An initial CPU burst is needed to boot the Pods, with heavy CPU usage during startup; once the Pods are online, CPU consumption drops.
  6. 30 students and 10 courses would mean 300 Pods.

Is it possible in GKE to keep instances warm so that they are ready to serve traffic? I know Cloud Run can do this.


Hi @dheerajpanyam,

Welcome to Google Cloud Community!

This documentation explains how to configure the maximum number of Pods that can run on a node in Standard clusters. GKE defaults to 110 Pods per node on Standard clusters; however, Standard clusters can be set up to support up to 256 Pods per node. For Autopilot clusters, the maximum number of Pods per node is selected automatically within a range of 8 to 256, based on the anticipated density of the workload.

There are also several strategies you can use in Google Kubernetes Engine (GKE) to keep Pods "warm". With these, you can set up a GKE environment where your workloads are prepared to handle traffic quickly when needed, which is especially useful if you're concerned about latency from scaling events.

I hope the above information is helpful.

If you just want to ensure that there are enough nodes running to handle your course Pods when they scale up, you can try creating placeholder Pods that use a low-priority PriorityClass. Those Pods will trigger node creation and will then stay running until they're evicted by higher-priority Pods (i.e. your course Pods).
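
As a rough sketch of that pattern (often called "balloon" Pods), with illustrative names and resource sizes, you could combine a negative-value PriorityClass with a Deployment running pause containers:

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: placeholder-priority   # hypothetical name
value: -10                     # below the default (0), so course Pods preempt these
preemptionPolicy: Never        # placeholders should never preempt anything themselves
globalDefault: false
description: "Low-priority class for capacity-reserving placeholder Pods."
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: placeholder
spec:
  replicas: 5                  # tune to how much spare capacity you want
  selector:
    matchLabels:
      app: placeholder
  template:
    metadata:
      labels:
        app: placeholder
    spec:
      priorityClassName: placeholder-priority
      terminationGracePeriodSeconds: 0   # evict quickly when a course Pod needs the room
      containers:
      - name: pause
        image: registry.k8s.io/pause:3.9   # does nothing; just holds the reservation
        resources:
          requests:
            cpu: "1"           # size these to match one course Pod's boot needs
            memory: 512Mi
```

Your course Pods don't need any changes for this: they default to priority 0, which is higher than -10, so the scheduler evicts the placeholders to make room for them.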

Keep in mind that you'll incur charges either for the nodes (GKE Standard mode) or for the Pod resource requests (GKE Autopilot mode). The tradeoff is that you get rapid scaling.

For the bursting bit, you can deploy your course Pods with the resource request set to the average in-use CPU value and the resource limit set to the extra capacity the Pods need at boot. In Autopilot mode, you pay for the requests.
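
A minimal sketch of what that could look like, assuming a hypothetical course-app image (note that on Autopilot, setting limits above requests relies on Pod bursting support, which requires a sufficiently recent GKE version):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: course-session   # hypothetical name
spec:
  replicas: 1            # one Pod per live course session
  selector:
    matchLabels:
      app: course-session
  template:
    metadata:
      labels:
        app: course-session
    spec:
      containers:
      - name: course-app
        image: example.com/course-app:latest   # placeholder image
        resources:
          requests:
            cpu: 250m      # average steady-state usage; what Autopilot bills for
            memory: 512Mi
          limits:
            cpu: "1"       # headroom for the CPU-heavy boot phase
            memory: 512Mi
```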

I think you'd need to do some calculations to see what works for you based on your budget, requirements, and tolerance. For example, you might want your placeholder Pods to request enough capacity to let your course Pods burst as needed, but you should also think about how many students will log on at the same time (which changes how much "extra" capacity you provision).
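
To make that concrete with purely illustrative numbers: if each course Pod needs 1 vCPU to boot but only about 250m once running, and you expect at most 20 students to log on within the same minute, placeholder Pods reserving roughly 20 vCPUs in total would give you headroom for 20 simultaneous boots. If fewer students log on concurrently, you can shrink that reservation (and its cost) proportionally.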

All that said, the GKE documentation on cluster autoscaling and on Pod priority and preemption covers these strategies in more detail.
