I'm preparing to launch a production endpoint using the Vertex AI AutoML text solution. Before launch, I'd like to understand how many requests per second my endpoint can handle.
The quotas page (https://cloud.google.com/vertex-ai/docs/quotas) implies I can send 30,000 requests per minute, which works out to about 500 per second. Is that the actual limit, or will things break down before then? I'm looking for best practices.
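
For context, here is a minimal sketch of the kind of pre-launch load test I have in mind. It only uses the Python standard library. `fake_predict` is a hypothetical stand-in for the real online prediction call (it just sleeps), and the payload shape, request count, and concurrency values are assumptions for illustration, not anything taken from the Vertex AI docs.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Figure quoted in the question; converted to a per-second rate.
QUOTA_PER_MINUTE = 30_000
quota_per_second = QUOTA_PER_MINUTE / 60  # 500.0


def fake_predict(payload):
    """Hypothetical stand-in for a real prediction request.

    Swap this for the actual client call when testing a deployed endpoint.
    """
    time.sleep(0.01)  # simulated 10 ms round trip (assumption)
    return {"ok": True}


def measure_throughput(send, n_requests, concurrency):
    """Send n_requests through `send` using `concurrency` worker threads.

    Returns achieved requests/second plus p50 and p95 latency in seconds.
    """

    def timed(i):
        t0 = time.perf_counter()
        send({"content": f"sample text {i}"})  # payload shape is illustrative
        return time.perf_counter() - t0

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(timed, range(n_requests)))
    elapsed = time.perf_counter() - start

    p95_index = min(len(latencies) - 1, int(len(latencies) * 0.95))
    return {
        "rps": n_requests / elapsed,
        "p50_s": latencies[len(latencies) // 2],
        "p95_s": latencies[p95_index],
    }


if __name__ == "__main__":
    result = measure_throughput(fake_predict, n_requests=200, concurrency=20)
    print(f"quota ceiling: {quota_per_second:.0f} req/s")
    print(f"achieved: {result['rps']:.1f} req/s, "
          f"p50={result['p50_s'] * 1000:.1f} ms, "
          f"p95={result['p95_s'] * 1000:.1f} ms")
```

The idea would be to raise `concurrency` step by step and watch where the achieved rate stops growing or p95 latency climbs, then compare that point with the 500 requests/second implied by the quota.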