I'm planning to use Google Cloud Platform's Vertex AI for a few projects. While reading the rate limits section of the documentation, I came across this:
https://cloud.google.com/vertex-ai/generative-ai/docs/quotas
However, I haven't found any information about the algorithm behind these limits. I have two scenarios in mind:
Or maybe it's different from the scenarios outlined.
I would appreciate it if someone could explain how Google calculates these limits, or point me to the section of the documentation that covers this, since I haven't been able to find it.