I want to update the autoscaling metric so that new replicas are triggered based on the volume of incoming prediction requests (incoming traffic). I found the AutoscalingMetricSpec documentation, but it doesn't list the metricName values that can be set to scale on incoming requests. Please share any resources or documentation for this. Also, let me know what needs to be considered for autoscaling.
Hello,
Thank you for contacting the Google Cloud Community.
To ensure a faster resolution and dedicated support for your issue, please file a support ticket by clicking here. Our support team will prioritize your request and provide the assistance you need.
For individual support issues, it is best to utilize the support ticketing system. We appreciate your cooperation!
Hello,
Thank you for contacting Google Cloud Community!
Here's the relevant documentation: AutoscalingMetricSpec
Regards,
Jai Ade
Hi, thanks for the response.
But I think this doesn't cover the autoscaling metrics for a Vertex online prediction endpoint. It is for a group of VMs. Is there any dedicated documentation for the autoscaling metrics on a Vertex prediction endpoint?
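To make the question concrete, here is a minimal sketch of where the metric name would go in a Vertex AI deploy request. It assumes the REST field names from the DedicatedResources and AutoscalingMetricSpec references (machineSpec, minReplicaCount, maxReplicaCount, autoscalingMetricSpecs, metricName, target). The helper build_dedicated_resources and the machine type are hypothetical, chosen only for illustration. The two metric names in the constants are the CPU-utilization and accelerator duty-cycle metrics; whether a request-volume metric is accepted is exactly the open question in this thread, so the metric name is passed in as a parameter rather than assumed.

```python
import json

# Metric names for Vertex AI online prediction autoscaling (CPU and accelerator).
# A request-volume metric is NOT assumed here; check the current reference.
CPU_UTILIZATION = "aiplatform.googleapis.com/prediction/online/cpu/utilization"
ACCELERATOR_DUTY_CYCLE = (
    "aiplatform.googleapis.com/prediction/online/accelerator/duty_cycle"
)


def build_dedicated_resources(machine_type, min_replicas, max_replicas,
                              metric_name, target):
    """Hypothetical helper: build the dedicatedResources part of a deploy request.

    Returns a plain dict using the REST (camelCase) field names, ready to be
    serialized to JSON. It does not call any Google Cloud API.
    """
    if min_replicas < 1 or max_replicas < min_replicas:
        raise ValueError("need 1 <= min_replicas <= max_replicas")
    if target <= 0:
        raise ValueError("target must be a positive integer")
    return {
        "machineSpec": {"machineType": machine_type},
        "minReplicaCount": min_replicas,
        "maxReplicaCount": max_replicas,
        "autoscalingMetricSpecs": [
            {"metricName": metric_name, "target": target},
        ],
    }


if __name__ == "__main__":
    # Example: scale between 1 and 5 replicas, targeting 60% CPU utilization.
    resources = build_dedicated_resources(
        "n1-standard-4", 1, 5, CPU_UTILIZATION, 60
    )
    print(json.dumps(resources, indent=2))
```

Swapping CPU_UTILIZATION for a different metric name is the only change needed if the reference lists a request-based metric; the rest of the structure stays the same.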
Hello,
I have the same question. What was the answer in the end?
Thanks a lot!
Leo